theislab / chemCPA

Code for "Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution", NeurIPS 2022.
https://arxiv.org/abs/2204.13545
MIT License
disentanglement drug-discovery genomics perturbation single-cell transfer-learning

Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution

Code accompanying the NeurIPS 2022 paper (PDF).

[Figure: architecture of CCPA]

Our talk on chemCPA at the M2D2 reading club is available here. A previous version of this work was a spotlight paper at ICLR MLDD 2022. Code for this previous version can be found under the v1.0 git tag.

Codebase overview

For the final models, we provide weight checkpoints as well as the hyperparameter configuration. The raw datasets can be downloaded from a FAIR server. We also provide our processed datasets for reproducibility: sci-Plex shared gene set & extended gene set, LINCS. Embeddings can be downloaded here.

To set up the environment, install conda and run:

conda env create -f environment.yml
pip install -e .
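
To confirm that the installation succeeded, a quick check like the following can be run inside the activated environment. This is a minimal sketch: it assumes the package is importable as chemCPA (inferred from the chemCPA/ source directory, not stated explicitly here), and is_importable is a hypothetical helper name.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if the current interpreter can locate `package_name`."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumption: the package is importable as "chemCPA"; adjust the name
    # if the installed package is called something else.
    name = "chemCPA"
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

If the package is reported as not found, the most likely cause is that the commands were run outside the conda environment created from environment.yml.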

All experiments were run through seml. The entry function is ExperimentWrapper.__init__ in chemCPA/seml_sweep_icb.py. For debugging, we also provide a script for running experiments manually at chemCPA/manual_seml_sweep.py. The script expects a manual_run.yaml file containing the experiment configuration.

All notebooks also exist as Python scripts (converted through jupytext) to make them easier to review.
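
After editing a notebook, its script counterpart can be regenerated with jupytext. The sketch below uses jupytext's Python API (jupytext.read and jupytext.write). The percent format and the convention that the script sits next to the notebook with a .py suffix are assumptions, not something this repository states, and both helper names are hypothetical.

```python
from pathlib import Path


def paired_script_path(notebook_path: str) -> Path:
    """Assumed convention: same directory and stem, with a .py suffix."""
    return Path(notebook_path).with_suffix(".py")


def notebook_to_script(notebook_path: str, fmt: str = "py:percent") -> Path:
    """Convert a .ipynb file to a Python script using jupytext."""
    import jupytext  # imported lazily so the path helper works without it

    target = paired_script_path(notebook_path)
    notebook = jupytext.read(notebook_path)
    # Assumption: the percent format; use the format the repository's
    # existing scripts are written in if it differs.
    jupytext.write(notebook, str(target), fmt=fmt)
    return target
```

Checking the header of an existing script in the repository shows which jupytext format it was written in, so the fmt argument can be matched to it.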

Some of the notebooks use a drugbank_all.csv file, which can be downloaded from here (registration needed).
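
Because drugbank_all.csv has to be downloaded separately, notebooks that rely on it fail if the file is missing. A minimal sketch for checking the download before running them is given below. The file location is an assumption (pass whatever path the notebook expects), read_header is a hypothetical helper, and no column names are assumed.

```python
import csv
from pathlib import Path
from typing import List


def read_header(csv_path: str) -> List[str]:
    """Return the column names from the first row of a CSV file."""
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; download it first (registration needed)."
        )
    with path.open(newline="", encoding="utf-8") as handle:
        # An empty file yields an empty list instead of raising StopIteration.
        return next(csv.reader(handle), [])
```

Calling read_header("drugbank_all.csv") from the directory where the file was saved prints nothing by itself; wrap it in print to inspect the available columns.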

Citation

You can cite our work as:

@inproceedings{hetzel2022predicting,
  title={Predicting Cellular Responses to Novel Drug Perturbations at a Single-Cell Resolution},
  author={Hetzel, Leon and Böhm, Simon and Kilbertus, Niki and Günnemann, Stephan and Lotfollahi, Mohammad and Theis, Fabian J},
  booktitle={NeurIPS 2022},
  year={2022}
}