calico / scnym

Semi-supervised adversarial neural networks for classification of single cell transcriptomics data
https://scnym.research.calicolabs.com
Apache License 2.0
73 stars, 12 forks

Human atlas not available in notebook? #11

Status: Closed (Gibbsdavidl closed 3 years ago)

Gibbsdavidl commented 3 years ago

Hi there,

Could the scnym_atlas_transfer notebook in the README be updated to include the human atlas?

Thanks!

ValueError                                Traceback (most recent call last)
in ()
      5 if ATLAS2USE not in CELL_ATLASES.keys():
      6     msg = f'{ATLAS2USE} is not available in the cell atlas directory.'
----> 7     raise ValueError(msg)

ValueError: human is not available in the cell atlas directory.
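
For readers hitting the same error, here is a minimal sketch of the kind of guard shown in the traceback, extended to list which atlases are available. The registry contents are hypothetical (the notebook's real `CELL_ATLASES` entries are not shown in this thread), and `resolve_atlas` is an illustrative helper, not part of scNym.

```python
# Hypothetical atlas registry: keys and values are placeholders, not the
# notebook's actual CELL_ATLASES contents.
CELL_ATLASES = {
    "mouse": "<location of mouse atlas>",
    "rat": "<location of rat atlas>",
}


def resolve_atlas(name, atlases):
    """Return the registry entry for `name`, or raise a ValueError that
    also lists the atlases that are available."""
    if name not in atlases:
        available = ", ".join(sorted(atlases))
        msg = (
            f"{name} is not available in the cell atlas directory. "
            f"Available atlases: {available}"
        )
        raise ValueError(msg)
    return atlases[name]


if __name__ == "__main__":
    try:
        resolve_atlas("human", CELL_ATLASES)
    except ValueError as err:
        print(err)
```

Printing the available keys up front makes it clear whether an atlas was removed or simply misspelled.
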
jacobkimmel commented 3 years ago

Hi David,

Thanks for your interest!

We removed the human atlas after several users had poor experiences with the data. The annotations in the Human Cell Landscape (HCL) appear to contain some errors, and we haven’t worked directly with the data ourselves. For human data, I’d suggest using a tissue-specific dataset that matches your sample’s tissue as a reference. Happy to help point you to a relevant set if you don’t know of one.

All the best, Jacob

Gibbsdavidl commented 3 years ago

Thanks for the feedback! Any thoughts on training data for esophagus and stomach?

Possibly this: https://pubmed.ncbi.nlm.nih.gov/31892341/

Thanks much! -dave

jacobkimmel commented 3 years ago

Sadly I haven't seen great human data for GI tissues yet. The best esophagus data I've seen is in: https://www.nature.com/articles/s41467-018-06796-9

For stomach, the HCL may still be your best bet. A link to our pre-processed version is here. I'd subset to the GI tissues before training and double-check the labels against some known biology too.

Happy to help if you run into issues.
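
The subsetting and label check suggested above could look roughly like the sketch below. It uses plain Python records so it runs without extra dependencies; the field names (`tissue`, `cell_type`), the tissue names, and the example cells are all assumptions for illustration, not the actual HCL metadata. With an AnnData object, the equivalent filter would be a boolean mask on `adata.obs`, for example `adata[adata.obs["tissue"].isin(GI_TISSUES)]`, assuming such a `tissue` column exists.

```python
from collections import Counter

# Hypothetical per-cell metadata; real atlas column names and labels may differ.
cells = [
    {"cell_id": "c1", "tissue": "Stomach", "cell_type": "Pit cell"},
    {"cell_id": "c2", "tissue": "Stomach", "cell_type": "Chief cell"},
    {"cell_id": "c3", "tissue": "Esophagus", "cell_type": "Basal epithelial cell"},
    {"cell_id": "c4", "tissue": "Lung", "cell_type": "AT2 cell"},
    {"cell_id": "c5", "tissue": "Stomach", "cell_type": "Pit cell"},
]

# Tissues to keep before training (assumed names).
GI_TISSUES = {"Stomach", "Esophagus"}


def subset_by_tissue(records, tissues):
    """Keep only cells whose tissue is in `tissues`."""
    return [r for r in records if r["tissue"] in tissues]


def label_counts(records):
    """Count cells per (tissue, cell_type) pair for manual review."""
    return Counter((r["tissue"], r["cell_type"]) for r in records)


if __name__ == "__main__":
    gi_cells = subset_by_tissue(cells, GI_TISSUES)
    # Review this table against known biology, e.g. flag labels that
    # should not occur in a given tissue.
    for (tissue, cell_type), n in sorted(label_counts(gi_cells).items()):
        print(f"{tissue}\t{cell_type}\t{n}")
```

The per-tissue label table is only a starting point for review; checking marker gene expression for each label would be the more direct test against known biology.
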

Gibbsdavidl commented 3 years ago

Great, thanks! Consider this issue closed!