wengong-jin / abdockgen

MIT License

Could you please give us more detail? #5

Closed partrita closed 1 year ago

partrita commented 1 year ago

unzip the train_data.jsonl.gz file

$ git clone https://github.com/wengong-jin/abdockgen.git
$ cd abdockgen
$ gunzip data/rabd/train_data.jsonl.gz

Install dependencies

I use Mambaforge as my Python package manager, so:

$ mamba create -n abdockgen
$ mamba activate abdockgen
(abdockgen) $ mamba install pytorch torchvision torchaudio pytorch-cuda=11.8 biopython matplotlib tqdm pdbfixer biotite -c pytorch -c nvidia

Install DockQ

(abdockgen) $ git clone https://github.com/bjornwallner/DockQ/
(abdockgen) $ cd DockQ
(abdockgen) $ make

After that, I ran the training script.

(abdockgen) $ cd ..
(abdockgen) $ python dock_train.py --hierarchical --L_target 20 --save_dir ckpts/HERN-dock

Output:

Epoch 19, Ligand RMSD = 24.054, All atom RMSD = 11.635
100%|███████████████████████████████████████████| 58/58 [00:01<00:00, 30.10it/s]
Test Ligand RMSD = 23.981, All atom RMSD = 11.677

Dock the CDR-H3 paratopes onto their corresponding epitopes

(abdockgen) $ mkdir outputs
(abdockgen) $ python predict.py ckpts/HERN_dock.ckpt data/rabd/test_data.jsonl

It produced PDB files like this:

outputs/
├── 1a14_pred.pdb
├── 1a2y_pred.pdb
├── 1fe8_pred.pdb
├── 1ic7_pred.pdb
├── 1iqd_pred.pdb
├── 1n8z_pred.pdb
├── 1ncb_pred.pdb
├── 1osp_pred.pdb
├── 1uj3_pred.pdb
├── 2adf_pred.pdb
├── 2b2x_pred.pdb
├── 2cmr_pred.pdb
├── 2dd8_pred.pdb
├── 2ghw_pred.pdb
├── 2vxt_pred.pdb
├── 2xqy_pred.pdb
├── 2xwt_pred.pdb
├── 2ypv_pred.pdb
├── 3bn9_pred.pdb
├── 3cx5_pred.pdb
├── 3ffd_pred.pdb
├── 3hi6_pred.pdb
├── 3k2u_pred.pdb
├── 3l95_pred.pdb
├── 3mxw_pred.pdb
├── 3nid_pred.pdb
├── 3o2d_pred.pdb
├── 3rkd_pred.pdb
├── 3s35_pred.pdb
├── 3uzq_pred.pdb
├── 3w9e_pred.pdb
├── 4cmh_pred.pdb
├── 4dtg_pred.pdb
├── 4dvr_pred.pdb
├── 4etq_pred.pdb
├── 4ffv_pred.pdb
├── 4fqj_pred.pdb
├── 4g6j_pred.pdb
├── 4g6m_pred.pdb
├── 4h8w_pred.pdb
├── 4ki5_pred.pdb
├── 4lvn_pred.pdb
├── 4ot1_pred.pdb
├── 4qci_pred.pdb
├── 4xnq_pred.pdb
├── 4ydk_pred.pdb
├── 5b8c_pred.pdb
├── 5bv7_pred.pdb
├── 5d93_pred.pdb
├── 5d96_pred.pdb
├── 5en2_pred.pdb
├── 5f9o_pred.pdb
├── 5ggs_pred.pdb
├── 5hi4_pred.pdb
├── 5j13_pred.pdb
├── 5l6y_pred.pdb
├── 5mes_pred.pdb
└── 5nuz_pred.pdb

0 directories, 58 files

I was wondering how to evaluate those docked structures with DockQ.

In the DockQ repo, there is a short instruction like this:

./DockQ.py <model> <native>

But I don't have a native PDB file, right?

partrita commented 1 year ago

Oh, I got it. Each file name contains the PDB ID, so this command should work:

./DockQ.py 1a14_pred.pdb 1a14.pdb
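
To run the same comparison for every prediction, the PDB ID can be taken from the file name. This is only a minimal sketch: it assumes the native structures are saved as `<pdb_id>.pdb` in a directory of your choosing, and the helper name `pair_with_native` and the `natives` directory are hypothetical, not part of abdockgen or DockQ.

```python
from pathlib import Path


def pair_with_native(pred_paths, native_dir):
    """Pair each <id>_pred.pdb with the assumed native file <native_dir>/<id>.pdb."""
    suffix = "_pred.pdb"
    pairs = []
    for pred in map(Path, pred_paths):
        if not pred.name.endswith(suffix):
            continue  # skip anything that is not named like a HERN prediction
        pdb_id = pred.name[: -len(suffix)]
        pairs.append((pred, Path(native_dir) / f"{pdb_id}.pdb"))
    return pairs


# Print one DockQ call per prediction, in the <model> <native> order shown above.
# "natives" is a placeholder directory; it is not created by predict.py.
for model, native in pair_with_native(["outputs/1a14_pred.pdb"], "natives"):
    print(f"./DockQ.py {model} {native}")
```

In practice the list of predictions could come from `sorted(Path("outputs").glob("*_pred.pdb"))`.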

partrita commented 1 year ago

./DockQ.py examples/1a14_pred.pdb examples/1a14.pdb -native_chain1 A H -model_chain1 A H
Traceback (most recent call last):
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 732, in <module>
    main()    
    ^^^^^^
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 569, in main
    native=make_two_chain_pdb_perm(native,nat_group1,nat_group2)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fkt/Downloads/abdockgen/DockQ/./DockQ.py", line 447, in make_two_chain_pdb_perm
    f.write(change_chain(pdb_chains[c],"A"))
                         ~~~~~~~~~~^^^
KeyError: 'A'
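
The `KeyError: 'A'` is raised while DockQ looks up chain `A` in the native file, which suggests that file has no chain with that ID. A minimal sketch for checking which chain IDs a PDB file actually contains (the helper name `chain_ids` is hypothetical; it only reads column 22 of the ATOM/HETATM records and does no other validation):

```python
def chain_ids(pdb_lines):
    """Return the chain IDs in ATOM/HETATM records, in order of first appearance."""
    seen = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain = line[21]  # column 22 of the PDB format holds the chain ID
            if chain not in seen:
                seen.append(chain)
    return seen


def chain_ids_in_file(path):
    """Convenience wrapper: read a PDB file from disk and list its chain IDs."""
    with open(path) as handle:
        return chain_ids(handle)
```

Calling `chain_ids_in_file("examples/1a14.pdb")` and `chain_ids_in_file("examples/1a14_pred.pdb")` shows whether the chains named in `-native_chain1` and `-model_chain1` exist in each file.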

philippschw commented 1 year ago

Hi @partrita, I wish the author had uploaded the script for reproducing the validation of docking pose quality. The output of HERN, e.g. 1a14_pred.pdb, contains only the antigen (chain A) and the CDR-H3 region (chain H). You are getting the error because you are comparing the output of HERN with the complete complex. So you first need to create the native PDB (ground truth) with the same structure as the output from HERN. You can use the script that I created for this purpose: https://gist.github.com/philippschw/337360ea23a391ee557d12c04fce4cde

Hope it helps!
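
For readers who want to see the general idea without opening the gist, here is a rough sketch of trimming a native PDB down to selected chains and residue ranges and renaming the chains to match the prediction. It is not the gist's code. The function name `extract_chains`, the shape of the `selection` mapping, and all chain IDs and residue numbers in the usage note are assumptions for illustration; the real CDR-H3 range has to come from your own data.

```python
def extract_chains(pdb_lines, selection):
    """Keep ATOM records of selected chains and rename those chains.

    selection maps a chain ID in the native file to a tuple
    (new_chain_id, first_residue, last_residue); use None for
    first_residue/last_residue to keep the whole chain.
    """
    kept = []
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue  # drop headers, HETATM records, and malformed lines
        chain = line[21]  # column 22: chain ID
        if chain not in selection:
            continue
        new_chain, first, last = selection[chain]
        resseq = int(line[22:26])  # columns 23-26: residue sequence number
        if first is not None and resseq < first:
            continue
        if last is not None and resseq > last:
            continue
        # Residues with insertion codes share a number, so they stay in range.
        kept.append(line[:21] + new_chain + line[22:])
    return kept
```

As a usage example with placeholder values, `extract_chains(open("1a14.pdb"), {"X": ("A", None, None), "H": ("H", 95, 102)})` would keep all of a chain `X` renamed to `A` plus residues 95 to 102 of chain `H`; the resulting lines can then be written to a new file and passed to DockQ as the native structure.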

partrita commented 1 year ago

Thank you! That helps.