deargen / mt-dti

An official Molecule Transformer Drug Target Interaction (MT-DTI) model
MIT License

MT-DTI

An official Molecule Transformer Drug Target Interaction (MT-DTI) model

Required Files

cd mt-dti
# place the downloaded file (data.tar.gz) at "mt-dti"
tar xzfv data.tar.gz

# Extracted files
mt-dti/data/chembl_to_cids.txt
mt-dti/data/CID_CHEMBL.tsv
mt-dti/data/kiba/*
mt-dti/data/kiba/folds/*
mt-dti/data/kiba/mbert_cnn_v1_lr0.0001_k12_k12_k12_fold0/*
mt-dti/data/kiba/tfrecord/*.tfrecord
mt-dti/data/pretrain/*
mt-dti/data/pretrain/mbert_6500k/*
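
A quick way to confirm the archive was extracted in the right place is to check each listed path. The sketch below is not part of the repository; `EXPECTED` and `missing_entries` are hypothetical names, and the patterns are copied from the listing above with the leading `mt-dti/` dropped so they are relative to the repository root.

```python
from pathlib import Path

# Patterns copied from the file listing above, relative to the repository root.
# Every pattern must match at least one file or directory.
EXPECTED = [
    "data/chembl_to_cids.txt",
    "data/CID_CHEMBL.tsv",
    "data/kiba/*",
    "data/kiba/folds/*",
    "data/kiba/mbert_cnn_v1_lr0.0001_k12_k12_k12_fold0/*",
    "data/kiba/tfrecord/*.tfrecord",
    "data/pretrain/*",
    "data/pretrain/mbert_6500k/*",
]


def missing_entries(repo_root, expected=EXPECTED):
    """Return the expected patterns that match nothing under repo_root."""
    root = Path(repo_root)
    return [pattern for pattern in expected if not any(root.glob(pattern))]
```

Calling `missing_entries(".")` from inside `mt-dti` should return an empty list once `data.tar.gz` has been unpacked there.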

VirtualEnv

mkvirtualenv --python=`which python3` dti
pip install tensorflow-gpu==1.12.0

Preprocessing

python kiba_to_pkl.py 

# Resulting files
mt-dti/data/kiba/kiba_b.cpkl

cd src/preprocess
export PYTHONPATH='../../'
python tfrecord_writer.py 

# Resulting files
mt-dti/data/kiba/tfrecord/*.tfrecord
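
To sanity-check the written shards without loading TensorFlow, the records can be counted by walking the TFRecord framing directly. This is a minimal sketch, assuming the files are uncompressed; it relies only on the standard TFRecord layout (8-byte little-endian length, 4-byte CRC of the length, payload, 4-byte CRC of the payload) and does not verify the CRCs. `count_tfrecords` is a hypothetical helper, not a function from this repository.

```python
import struct


def count_tfrecords(path):
    """Count records in one uncompressed TFRecord file by following its framing."""
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte record length + 4-byte CRC of the length
            if not header:
                return count  # clean end of file
            if len(header) < 12:
                raise ValueError(f"truncated record header in {path}")
            (length,) = struct.unpack("<Q", header[:8])
            body = f.read(length + 4)  # payload + 4-byte CRC of the payload
            if len(body) < length + 4:
                raise ValueError(f"truncated record body in {path}")
            count += 1
```

Summing `count_tfrecords(p)` over `Path("data/kiba/tfrecord").glob("*.tfrecord")` gives the total number of serialized examples.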

PreTraining

$ head CID-SMILES
1   CC(=O)OC(CC(=O)[O-])C[N+](C)(C)C
2   CC(=O)OC(CC(=O)O)C[N+](C)(C)C
3   C1=CC(C(C(=C1)C(=O)O)O)O
4   CC(CN)O
5   C(C(=O)COP(=O)(O)O)N
6   C1=CC(=C(C=C1[N+](=O)[O-])[N+](=O)[O-])Cl
7   CCN1C=NC2=C(N=CN=C21)N
8   CCC(C)(C(C(=O)O)O)O
9   C1(C(C(C(C(C1O)O)OP(=O)(O)O)O)O)O
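
Each line of `CID-SMILES` holds a numeric compound ID followed by a SMILES string, as the `head` output shows. The sketch below reads that layout; `iter_cid_smiles` is a hypothetical helper, and it assumes only that the two columns are separated by whitespace (tabs or spaces) and that SMILES strings contain no spaces.

```python
def iter_cid_smiles(lines):
    """Yield (cid, smiles) pairs from lines shaped like '<CID><whitespace><SMILES>'."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split on the first run of whitespace only
        if len(parts) != 2:
            raise ValueError(f"line {line_number}: expected CID and SMILES, got {line!r}")
        cid, smiles = parts
        yield int(cid), smiles


# Two rows taken from the sample output above.
sample = [
    "1   CC(=O)OC(CC(=O)[O-])C[N+](C)(C)C",
    "4   CC(CN)O",
]
pairs = list(iter_cid_smiles(sample))
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, which keeps memory use flat on the full file.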

cd src/pretrain
export PYTHONPATH='../../'
python tfrecord_smiles.py 

# Resulting files, for example
gs://your_gs/mbert/tfr/smiles.001
gs://your_gs/mbert/tfr/smiles.002
...

cd src/pretrain
export PYTHONPATH='../../'
python pretrain_smiles_tpu.py

# Resulting checkpoint files, for example
gs://your_gs/mbert/pretrain-mini/model.ckpt-6500000.*

Result

INFO:tensorflow:Saving checkpoints for 6500000 into gs://bdti/mbert/pretrain/model.ckpt.
INFO:tensorflow:loss = 0.098096184, step = 6500000 (48.736 sec)
INFO:tensorflow:global_step/sec: 20.5185
INFO:tensorflow:examples/sec: 10505.5
INFO:tensorflow:Stop infeed thread controller
INFO:tensorflow:Shutting down InfeedController thread.
INFO:tensorflow:InfeedController received shutdown signal, stopping.
INFO:tensorflow:Infeed thread finished, shutting down.
INFO:tensorflow:infeed marked as finished
INFO:tensorflow:Stop output thread controller
INFO:tensorflow:Shutting down OutfeedController thread.
INFO:tensorflow:OutfeedController received shutdown signal, stopping.
INFO:tensorflow:Outfeed thread finished, shutting down.
INFO:tensorflow:outfeed marked as finished
INFO:tensorflow:Shutdown TPU system.
INFO:tensorflow:Loss for final step: 0.098096184.
INFO:tensorflow:training_loop marked as finished
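
The two rate lines in this log allow a rough back-of-the-envelope check. Dividing `examples/sec` by `global_step/sec` gives the number of examples consumed per step, and dividing the final step count by `global_step/sec` gives a wall-clock estimate. The sketch below does that arithmetic; `parse_rate` is a hypothetical helper, and the time estimate assumes the rate in this final logging window held for the whole run, which the log itself does not confirm.

```python
import re

# Lines copied from the training log above.
LOG = """\
INFO:tensorflow:loss = 0.098096184, step = 6500000 (48.736 sec)
INFO:tensorflow:global_step/sec: 20.5185
INFO:tensorflow:examples/sec: 10505.5
"""


def parse_rate(log_text, key):
    """Return the float that follows 'INFO:tensorflow:<key>: ' in the log."""
    pattern = rf"^INFO:tensorflow:{re.escape(key)}: ([0-9.]+)$"
    match = re.search(pattern, log_text, re.MULTILINE)
    if match is None:
        raise ValueError(f"{key!r} not found in log")
    return float(match.group(1))


steps_per_sec = parse_rate(LOG, "global_step/sec")
examples_per_sec = parse_rate(LOG, "examples/sec")

examples_per_step = examples_per_sec / steps_per_sec    # about 512 examples per step
hours_for_all_steps = 6_500_000 / steps_per_sec / 3600  # about 88 hours at this rate
```

The ratio of roughly 512 examples per step is an inference from the log, not a documented setting.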

mini model
INFO:tensorflow:***** Eval results *****
INFO:tensorflow:  global_step = 6500000
INFO:tensorflow:  loss = 0.15356757
INFO:tensorflow:  masked_lm_accuracy = 0.94406235
INFO:tensorflow:  masked_lm_loss = 0.1413514
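
For comparing runs, it can help to pull these metrics into a dictionary and convert the masked-LM loss into a perplexity. This is a sketch under two assumptions: the log lines keep the `name = value` shape shown above, and `masked_lm_loss` is a mean cross-entropy in natural-log units, so that perplexity is `exp(loss)`. `parse_eval_results` is a hypothetical helper.

```python
import math
import re

# Lines copied from the evaluation log above.
EVAL_LOG = """\
INFO:tensorflow:***** Eval results *****
INFO:tensorflow:  global_step = 6500000
INFO:tensorflow:  loss = 0.15356757
INFO:tensorflow:  masked_lm_accuracy = 0.94406235
INFO:tensorflow:  masked_lm_loss = 0.1413514
"""


def parse_eval_results(log_text):
    """Collect 'name = value' pairs from the eval block into a dict of floats."""
    pattern = re.compile(r"^INFO:tensorflow:\s+(\w+) = ([0-9.eE+-]+)$", re.MULTILINE)
    return {name: float(value) for name, value in pattern.findall(log_text)}


metrics = parse_eval_results(EVAL_LOG)

# Perplexity of the masked-token predictions, assuming a natural-log loss.
masked_lm_perplexity = math.exp(metrics["masked_lm_loss"])
```

Under those assumptions the perplexity comes out at roughly 1.15, consistent with the masked-token accuracy of about 94%.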

FineTuning

cd src/finetune
export PYTHONPATH='../../'
python finetune_demo.py 

Prediction

cd src/predict
export PYTHONPATH='../../'
python predict_demo.py
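
The preprocessing, fine-tuning, and prediction steps all follow the same pattern: change into a directory two levels below the repository root, set `PYTHONPATH` to `'../../'` so the repository's packages can be imported, and run one script. The sketch below wraps that pattern for scripted runs. `STEPS`, `build_step`, and `run_step` are hypothetical names, not part of the repository, and the sketch assumes the active interpreter is the one from the `dti` virtualenv.

```python
import os
import subprocess
import sys
from pathlib import Path

# (directory relative to the repository root, script) pairs taken from the commands above.
STEPS = {
    "tfrecord": ("src/preprocess", "tfrecord_writer.py"),
    "finetune": ("src/finetune", "finetune_demo.py"),
    "predict": ("src/predict", "predict_demo.py"),
}


def build_step(repo_root, name):
    """Return (command, cwd, env) mirroring the cd / export PYTHONPATH / python sequence."""
    subdir, script = STEPS[name]
    cwd = Path(repo_root) / subdir
    # Same relative value as in the README; it resolves against cwd to the repository root.
    env = dict(os.environ, PYTHONPATH="../../")
    return [sys.executable, script], cwd, env


def run_step(repo_root, name):
    """Run one step and raise CalledProcessError if the script exits with an error."""
    command, cwd, env = build_step(repo_root, name)
    subprocess.run(command, cwd=cwd, env=env, check=True)
```

For example, `run_step(".", "finetune")` followed by `run_step(".", "predict")` reproduces the two command sequences above from the repository root.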