This repository contains the code for the paper "TEIM: Characterizing the interaction conformation between T-cell receptors and epitopes with deep learning", published in Nature Machine Intelligence.
TEIM (TCR-Epitope Interaction Modeling) is a deep learning-based model for predicting TCR-epitope interactions. It includes two submodels: TEIM-Res (TEIM at Residue level) and TEIM-Seq (TEIM at Sequence level).
Both models take only the primary sequences of CDR3βs and epitopes as input. TEIM-Res predicts the distances and the contact probabilities between all residue pairs of CDR3βs and epitopes. TEIM-Seq predicts whether a CDR3β and an epitope can bind to each other.
Install basic packages using:
# [Optional] Create a new environment and activate it
conda create -n teim python=3.8
conda activate teim
# Install Pytorch packages (for CUDA 11.3)
conda install pytorch==1.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
# Install other packages
pip install -r requirements.txt
Note: Change the PyTorch version to match your CUDA version. Because we used PyTorch Lightning 1.6.4, the compatible PyTorch versions are $>=1.8,<=1.11$.
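As a quick sanity check on the version range above, the following minimal sketch tests whether a PyTorch version string falls inside it. The helper name `torch_version_ok` is hypothetical, and the sketch assumes that the upper bound `<=1.11` includes 1.11.x patch releases.

```python
import re


def torch_version_ok(version, low=(1, 8), high=(1, 11)):
    """Return True if the major.minor part of `version` lies in [low, high].

    Assumption: patch releases and local suffixes such as "+cu113" are ignored,
    so "1.11.0" counts as satisfying "<=1.11".
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    return low <= major_minor <= high


# The version installed by the conda command above.
print(torch_version_ok("1.10.1+cu113"))  # True
```

In an environment where PyTorch is installed, the same check can be applied to `torch.__version__`.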
conda install -c bioconda anarci
We also provide a Dockerfile to simplify setting up the environment. You can build the Docker image by running:
docker build -t teim:v1 .
Put your input TCR-epitope sequence pairs in the inputs/inputs.csv file. The TCRs are represented by their CDR3β sequences and the epitopes are represented by their sequences in the following format:

| cdr3 | epitope |
|---|---|
| CASAPGLAGGRPEQYF | LLFGYPVYV |
| CASRGAAGGRPQYF | MLWGYLQYV |
| CASRPGLAGGRAEQYF | FTDSSVWA |
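If the pairs are generated programmatically, the input file can be written with a few lines of Python. This is a minimal sketch, assuming the file is comma-separated with a header row of `cdr3,epitope`. The helper name `write_pairs` is hypothetical and is not part of this repository.

```python
import csv
from pathlib import Path

# Example pairs taken from the table above.
EXAMPLE_PAIRS = [
    ("CASAPGLAGGRPEQYF", "LLFGYPVYV"),
    ("CASRGAAGGRPQYF", "MLWGYLQYV"),
    ("CASRPGLAGGRAEQYF", "FTDSSVWA"),
]


def write_pairs(pairs, path="inputs/inputs.csv"):
    """Write (cdr3, epitope) pairs to a CSV file with a header row.

    Assumption: the inference scripts expect comma-separated values with
    the column names `cdr3` and `epitope`.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["cdr3", "epitope"])
        writer.writerows(pairs)
    return path
```

Calling `write_pairs(EXAMPLE_PAIRS)` from the repository root would write the three example pairs to inputs/inputs.csv.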
Run
python scripts/inference_res.py
The predicted distance matrices and contact site matrices are saved in the outputs directory as dist_<cdr3>_<epitope>.csv and site_<cdr3>_<epitope>.csv, respectively.
For sequence-level prediction, put your input TCR-epitope sequence pairs in the inputs/inputs_bd.csv file. The format is the same as inputs/inputs.csv (residue-level input file).
Run
python scripts/inference_seq.py
The predictions are written to outputs/sequence_level_binding.csv. The binding column in the file represents the predicted sequence-level binding scores (probabilities) of the TCR-epitope pairs.
For model training, please refer to the directory train_teim.
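To post-process the sequence-level predictions in outputs/sequence_level_binding.csv, one option is to keep only the pairs whose binding score exceeds a cutoff. The sketch below assumes the file is comma-separated with a header row containing the `binding` column. The helper name `load_binders` and the 0.5 threshold are illustrative assumptions, not values recommended by the authors.

```python
import csv


def load_binders(path="outputs/sequence_level_binding.csv", threshold=0.5):
    """Return the rows whose predicted binding score is at least `threshold`.

    Assumptions: the CSV has a header row with a `binding` column holding
    probabilities; all other columns are passed through unchanged.
    """
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    return [row for row in rows if float(row["binding"]) >= threshold]
```

Each returned row is a dictionary keyed by the column names in the file, so any additional columns remain available for downstream analysis.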
@article{Peng2023,
doi = {10.1038/s42256-023-00634-4},
url = {https://doi.org/10.1038/s42256-023-00634-4},
year = {2023},
month = mar,
publisher = {Springer Science and Business Media {LLC}},
volume = {5},
number = {4},
pages = {395--407},
author = {Xingang Peng and Yipin Lei and Peiyuan Feng and Lemei Jia and Jianzhu Ma and Dan Zhao and Jianyang Zeng},
title = {Characterizing the interaction conformation between T-cell receptors and epitopes with deep learning},
journal = {Nature Machine Intelligence}
}
If you have any questions, please contact us at xingang.peng@gmail.com