Run CPA on Kang et al dataset without dosage information

znavidi commented 1 year ago

Hello, thank you for your contribution to the field! I'm interested in training and testing CPA on the Kang et al. (PBMC) dataset. As far as I understand, this dataset doesn't include dosage information, and the data file isn't provided on the website either, so I'm using the PBMC file provided on the scGen website. Could you guide me on how to run the following call without the dosage information? When I remove the 'dosage': 'dose_val' line, it throws KeyError: 'dosage'.

cpa.CPA.setup_anndata(
    adata,
    perturbation_keys={
        'perturbation': 'condition',
        'dosage': 'dose_val',
    },
    categorical_covariate_keys=['cell_type'],
    control_key='control',
)

Thanks in advance!

Naghipourfar commented 10 months ago

Hi @znavidi,

So sorry for the late response. We've just added a demo notebook that trains CPA on the Kang et al. dataset. Please see here.
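For readers who land here with the same KeyError, one possible workaround is to give every cell a placeholder dose so that the 'dosage' key can point at a real column. The sketch below is only an illustration: the helper name placeholder_doses, the constant dose of 1.0, and the "+"-joined string format for combinations are assumptions, not something confirmed in this thread, so please check them against the demo notebook.

```python
def placeholder_doses(conditions, dose=1.0):
    """Build one placeholder dose string per cell.

    Assumption: combination labels such as "A+B" need one dose per
    component ("1.0+1.0"), while single labels need a single dose ("1.0").
    """
    doses = []
    for label in conditions:
        # Count the components of the condition label.
        n_parts = len(str(label).split("+"))
        doses.append("+".join([str(dose)] * n_parts))
    return doses


# Hypothetical usage on the AnnData object from the question:
# adata.obs["dose_val"] = placeholder_doses(adata.obs["condition"])
```

With the column in place, the 'dosage': 'dose_val' entry from the original call can stay as it is.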

Thanks

znavidi commented 9 months ago

Thanks for your update! When I run the tutorial, I get the following error, which seems to be from the data file. Would you please help with that?

Thanks!

ArianAmani commented 9 months ago

Hi @znavidi, the latest version here should work properly. Until that version is released on PyPI, please change line 3 in the first code block of the notebook from branch = "stable" to branch = "not-stable" so that it installs the latest version directly from GitHub.
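As a rough sketch of what that switch amounts to, the snippet below picks a pip target from the branch value. The function name install_command is hypothetical, the PyPI package name cpa-tools is an assumption, and the GitHub URL is derived from the repository name theislab/cpa; the notebook's actual install cell may differ.

```python
import sys


def install_command(branch):
    """Return a pip command for the requested branch (illustrative only)."""
    if branch == "stable":
        # Assumed name of the released package on PyPI.
        target = "cpa-tools"
    else:
        # Install the latest code straight from the GitHub repository.
        target = "git+https://github.com/theislab/cpa"
    return [sys.executable, "-m", "pip", "install", target]


# Hypothetical usage:
# import subprocess
# subprocess.check_call(install_command("not-stable"))
```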

Thanks

theislab / cpa

Run CPA on Kang et al dataset without dosage information #16