czbiohub-sf / tabula-muris-senis

Tabula Muris Senis
http://tabula-muris-senis.ds.czbiohub.org
BSD 3-Clause "New" or "Revised" License

Metadata labels #6

Closed sbrn3 closed 4 years ago

sbrn3 commented 4 years ago

Hi,

I am importing the .h5ad files into R using Seurat, and all the metadata fields (e.g. cell.ontology.class) are stored as numbers (0 1 2 3) for each factor level rather than as their names. Is there a way to recover the names?

aopisco commented 4 years ago

@sbrn3 did you get a chance to try the new objects (updated after revision) that we uploaded to our public AWS bucket? https://s3.console.aws.amazon.com/s3/buckets/czb-tabula-muris-senis/Data-objects/?region=us-west-2

CyrilLagger commented 4 years ago

Hi,

@sbrn3: I have the same issue with all the .h5ad files I have tried so far (Tabula Muris and other groups). There is actually a post about it on the Seurat GitHub here, but no solution has been proposed yet. For now, I just have a Python script that I load in R with reticulate and that takes care of collecting the missing names.
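A minimal sketch of this kind of workaround, not the actual script mentioned above. It assumes the `anndata` Python package is installed and that it decodes the categorical columns when it loads the file. The function names `decode_codes` and `export_obs` and the file paths are hypothetical.

```python
def decode_codes(codes, categories):
    """Map integer category codes (0, 1, 2, ...) back to their names.

    Negative codes are treated as missing values and returned as None
    (pandas uses -1 for a missing categorical value).
    """
    return [categories[c] if c >= 0 else None for c in codes]


def export_obs(h5ad_path, csv_path):
    """Write the cell metadata of an .h5ad file to CSV with readable labels."""
    # Third-party dependency, imported lazily so decode_codes works without it.
    import anndata

    adata = anndata.read_h5ad(h5ad_path)
    # adata.obs is a pandas DataFrame; categorical columns are written
    # to CSV as their category names rather than as integer codes.
    adata.obs.to_csv(csv_path)
```

Something like `export_obs("muscle.h5ad", "muscle_obs.csv")` (hypothetical paths) would produce a table that can be read in R and attached to the Seurat object, for example with `AddMetaData()`. `decode_codes` covers the case where the integer codes and the list of category names are already at hand.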

By the way, also note that if you use ReadH5AD() directly on the Tabula Muris Senis files, you might lose some information during the conversion (notably the dimensional reductions such as X_pca). This was the case for me with the TMS files on figshare (not checked on AWS yet). This is discussed here and happens because Seurat assumes that there is no dimensional reduction if the X matrix in the h5ad file is sparse.
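One possible way around the lost reductions is to export them from Python separately. The sketch below assumes `anndata` and `pandas` are installed and that the embedding in `adata.obsm` is a dense 2-D array; `column_names` and `export_embedding` are hypothetical helper names, and the column naming scheme is just a convention chosen for illustration.

```python
def column_names(key, n_dims):
    """Build column names such as pca_1, pca_2, ... from an obsm key like X_pca."""
    stem = key[2:] if key.startswith("X_") else key
    return [f"{stem}_{i}" for i in range(1, n_dims + 1)]


def export_embedding(h5ad_path, csv_path, key="X_pca"):
    """Write one embedding from adata.obsm to CSV, indexed by cell name."""
    # Third-party dependencies, imported lazily so column_names works alone.
    import anndata
    import pandas as pd

    adata = anndata.read_h5ad(h5ad_path)
    if key not in adata.obsm:
        raise KeyError(f"{key!r} not found; available: {list(adata.obsm.keys())}")
    coords = adata.obsm[key]  # assumed to be a dense (cells x dims) array
    frame = pd.DataFrame(
        coords,
        index=adata.obs_names,
        columns=column_names(key, coords.shape[1]),
    )
    frame.to_csv(csv_path)
```

The resulting CSV can be read in R as a matrix and, if needed, wrapped with Seurat's `CreateDimReducObject()` to add it back to the object.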

aopisco commented 4 years ago

@CyrilLagger @sbrn3 which tissue would you like to analyse with Seurat? I can try on my end to make the conversion to preserve as much info as needed

sbrn3 commented 4 years ago

Hi, thanks for the replies. @aopisco I have found a more manual workaround in python similar to what @CyrilLagger mentioned. I guess in my case I only need muscle and lung so it isn't that big of a deal to copy the data over.