theislab / chemCPA

Code for "Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution", NeurIPS 2022.
https://arxiv.org/abs/2204.13545
MIT License

where is trapnell_final_V7.h5ad? #128

Closed bhomass closed 1 year ago

bhomass commented 1 year ago

Hi, I double-checked every download option and couldn't locate trapnell_final_V7.h5ad, which is required to get past sciplex_SMILES.ipynb. Based on the code, it should be under the project_folder/datasets directory, which is not checked in on any of the branches.

I see the other trapnell files under embeddings, but they are different files. I also downloaded sce_trapnell_full.rds from GitHub, but its content does look much like the LINCS data.

Please give a hint on where to find this file. Curiously, the trapnell data isn't mentioned anywhere in the paper. The name does seem to be used interchangeably with sci-Plex.

bhomass commented 1 year ago

I found many other missing h5ad files. Some I may be able to generate once I have trapnell_final_V7.h5ad. For the others, I found no clue on how to generate them.

adata_baseline.h5ad in baseline_experiment.yaml

adata_baseline_high_dose.h5ad in baseline_experiment_highest_dose.yaml

adata_fold.h5ad in fold_experiment.yaml

adata_fold_high_dose.h5ad in fold_experiment_highest_dose.yaml

All of which are expected to reside in project_folder/datasets/
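For anyone hitting the same problem, a quick way to see which of these files are absent is a small check like the one below. It is only a sketch: the file names are the ones listed in this thread, while the `find_missing` helper and the relative `project_folder/datasets` path are assumptions for illustration and are not part of the chemCPA code.

```python
from pathlib import Path

# File names as reported in this thread; adjust to your checkout.
EXPECTED_FILES = [
    "trapnell_final_V7.h5ad",
    "adata_baseline.h5ad",
    "adata_baseline_high_dose.h5ad",
    "adata_fold.h5ad",
    "adata_fold_high_dose.h5ad",
]


def find_missing(datasets_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in datasets_dir."""
    datasets_dir = Path(datasets_dir)
    return [name for name in expected if not (datasets_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location, taken from the description above.
    for name in find_missing("project_folder/datasets"):
        print(f"missing: {name}")
```

Running it from the repository root prints one line per file that still needs to be downloaded or generated.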

Would it be possible to ask the team to make the entire datasets folder available online? It would be much appreciated.

bhomass commented 1 year ago

Yes, trapnell is simply sciplex. A few other adjustments are also needed to make the file names in the dataset match the file names used in the code. The file names drifted over time, and the checked-in code is out of sync.
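One way to bridge the naming mismatch without editing the notebooks is to copy the downloaded file into the datasets folder under the name the code expects. The following is a minimal sketch of that idea, not the maintainers' procedure: `alias_dataset` is a hypothetical helper, and the source file name in the usage comment is a placeholder because the thread does not state the actual download name.

```python
import shutil
from pathlib import Path


def alias_dataset(source, expected_name, datasets_dir):
    """Copy `source` into `datasets_dir` under the file name the code expects."""
    source = Path(source)
    if not source.is_file():
        raise FileNotFoundError(source)
    target = Path(datasets_dir) / expected_name
    target.parent.mkdir(parents=True, exist_ok=True)
    # Do not overwrite a file that is already in place.
    if not target.exists():
        shutil.copy2(source, target)
    return target


# Hypothetical usage; replace the first argument with your downloaded sciplex file:
# alias_dataset("downloads/your_sciplex_file.h5ad",
#               "trapnell_final_V7.h5ad",
#               "project_folder/datasets")
```

A copy is used rather than a symlink so the sketch also works on systems where creating symlinks requires extra permissions. For large h5ad files a symlink or a rename may be preferable.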