theislab / sfaira

data and model repository for single-cell data
https://sfaira.readthedocs.io
BSD 3-Clause "New" or "Revised" License

10.1038/s41593-021-01005-1 #573

Open nikofleischer opened 2 years ago

nikofleischer commented 2 years ago

Is your dataset not yet in sfaira? No

Describe the data set

Fill in below:

Link to publication: https://www.nature.com/articles/s41593-021-01005-1
Title of publication: Disentangling glial diversity in peripheral nerves at single-nuclei resolution.
First author of the publication: Aldrin K. Y. Yim
DOI preprint:
DOI journal: 10.1038/s41593-021-01005-1
Download link to any data objects required:

Describe metadata

Optionally, you can already collect metadata information here before starting to write the data loader or to help others write one. You can extend these lists if you find more metadata that you want to record before writing a data loader. Note that you can also directly put this information into a draft data loader and start a pull request instead of first writing an issue. If you know this dataset well but cannot write a loader right now, this will help other people decide if they want to write a loader for this dataset and will speed up their process.

Is this primary data (not a meta-study): Yes
Is most raw gene expression matrix normalized (if yes how)?: No
Single-cell assay used: 10x
Disease(s) of sampled individuals: none
Organ(s) sampled (ideally UBERON term): 0001322 (sciatic nerve), 0001759 (vagus nerve), 0015488 (sural nerve)
Organism(s) sampled (ideally NCBItaxon term): 10090 (Mus musculus)

Any relevant cell-wise annotation fields that are column names of a table or column names in .obs of an h5ad for example:

Cell type annotation: Todo

Additional context

Add any other context of the dataset or anticipated issues with the data loader.

Single-nuclei atlas of the peripheral nerves in M. musculus.
Single nuclei obtained by: ActB-Cre/MPZ-Cre x Sun1-GFP FACS
Libraries: 10X Genomics v2 (ActB)/v3 (MPZ)
Sequencing: Illumina HiSeq 2500 or NovaSeq 6000

Caveat: this is a tissue atlas constructed from several datasets. I don't know whether these should be treated as individual entries or included in their joint form. The sub-datasets are:

Immune cells (CD45+ FACS sorted single cells)
Schwann cell-specific (MPZ-Cre x Sun1-GFP FACS sorted single nuclei)
Sciatic nerve (ActB-Cre x Sun1-GFP FACS sorted single nuclei)
Peripheral nerve (ActB-Cre x Sun1-GFP FACS sorted single nuclei of Sciatic + Sural + Vagus nerve)

nikofleischer commented 2 years ago

I would like to contribute this dataset or a part of it. Should I load the integrated atlas or a part of it without integration? All data is provided as Seurat objects, is it fine to convert them using SeuratDisk?

Sallysue-22 commented 2 years ago

how to join this dataset/ workshop? Thanks in advance

davidsebfischer commented 2 years ago

> I would like to contribute this dataset or a part of it. Should I load the integrated atlas or a part of it without integration? All data is provided as Seurat objects, is it fine to convert them using SeuratDisk?

We will go over this on day 2 of the workshop. You will most likely be able to handle this as a multi-file loader for each of these files:

You can use anndata2ri in Python to load these Rds objects. We will provide a Docker image where this environment is already set up.

le-ander commented 2 years ago

> I would like to contribute this dataset or a part of it. Should I load the integrated atlas or a part of it without integration? All data is provided as Seurat objects, is it fine to convert them using SeuratDisk?

Hey @sparsepix, just to add to David's response above: the aim of the dataloader is to read the unmodified file that is provided in the download link. If the files are Seurat objects, you can read them directly from the sfaira dataloader using rpy2 and anndata2ri, as described here: https://sfaira.readthedocs.io/en/latest/adding_datasets.html#reading-r-files
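For orientation, reading a Seurat .rds file through rpy2 and anndata2ri might look roughly like the sketch below. It is not copied from the sfaira docs. The helper `build_r_snippet`, the file name `atlas.rds`, and the `load(data_dir, sample_fn, **kwargs)` signature for a multi-file loader are assumptions made for illustration. It also assumes that R with the Seurat package is installed and that the installed anndata2ri version still provides `anndata2ri.activate()`.

```python
import os


def build_r_snippet(rds_path: str) -> str:
    """Return R code that reads a Seurat .rds file and converts it to a
    SingleCellExperiment, which anndata2ri can turn into an AnnData object."""
    if "'" in rds_path:
        # The path is embedded in a single-quoted R string below.
        raise ValueError("path must not contain single quotes")
    r_path = rds_path.replace("\\", "/")  # R accepts forward slashes everywhere
    return (
        "suppressPackageStartupMessages(library(Seurat))\n"
        f"as.SingleCellExperiment(readRDS('{r_path}'))"
    )


def load(data_dir, sample_fn, **kwargs):
    """Hypothetical loader body: read one unmodified .rds file per call."""
    # Imported lazily so this module can be imported without an R installation.
    import anndata2ri
    from rpy2.robjects import r

    # Assumption: activate() registers the SingleCellExperiment -> AnnData
    # conversion; newer anndata2ri releases may prefer a conversion context.
    anndata2ri.activate()
    adata = r(build_r_snippet(os.path.join(data_dir, sample_fn)))
    return adata
```

A call such as `load("/path/to/download", "atlas.rds")` would then be made once per file of the multi-file loader, with each call returning the object for that file without any integration step.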

davidsebfischer commented 2 years ago

> how to join this dataset/ workshop? Thanks in advance

Hi @Sallysue-22, somebody is already assigned to this dataset. Feel free to choose a different one that only you work on, which will most likely give you the best learning experience, or work on this one in duplication if you specifically want to work on this dataset.