The code in this repository accompanies the preprint "Force-Guided Bridge Matching for Full-Atom Time-Coarsened Dynamics of Peptides'', by Ziyang yu, Wenbing Huang and Yang Liu.
Our code works well on Linux with CUDA==11.7
. It will be required to install pytorch
and other dependencies listed below with the corresponding CUDA version (if necessary) on your server.
biopython
mdtraj
openmm
rdkit
torch_scatter
tqdm
e3nn
requests
The PDB file and coordinates of MD trajectories (.xtc format) of AD can be downloaded from mdshare.
To curate our PepMD dataset, you should first download pdb_seqres.fasta
file from PDB. We have provided the script to download and process raw PDB data in data/download.py
, please run:
cd data && python download.py --fasta ./pdb_seqres.fasta --save_dir /your/saving/directory
where you should specify a saving directory for PDB and processed peptide files. If all scripts work successfully, there will be summary.jsonl
and summary-post.jsonl
in your saving directory now.
Then MD simulations can be performed with simulation.py
:
python simulation.py --summary /your/saving/directory/summary-post.jsonl --temp 300 --spacing 1000 --gpu 0
where you can specify your own configurations, such as temperature --temp
(unit: Kelvin) and frame spacing --spacing
(unit: fs). The script will create a directory sim
under /your/saving/directory
where each peptide has its own sub-directory named by PDB id, including a {PDB-id}_{chain-id}-traj-arrays.npz
file containing coordinates, velocities, forces and energies of MD trajectories and a state0.pdb
file.
Now we can curate the dataset with train/test splits:
python dataset.py --sim_dir /your/saving/directory/sim --delta 500
where you can specify the coarsened time for prediction --delta
(unit: ps). Afterwards there will be train.jsonl
and test.jsonl
under /your/saving/directory
for training and evaluation.
Before running training scripts, first compile TorchMD extensions with:
python setup.py build_ext --inplace
Then you can use the script train.sh
for training both FBM-base and FBM with multi GPUs. Note that you should first replace DATA_DIR
in the file with /your/saving/directory
. You can run the following script to train FBM-base with GPU 0, 1:
GPU=0,1 bash train.sh
For training FBM, please modify the configuration --model_type bbm
to --model_type fbm
and add another line including --baseline /path/to/FBM-base/checkpoint
, where you should replace with the checkpoint file path (.ckpt
) of FBM-base.
We have provided different evaluation scripts for various usage.
If you only want to inference trajectories without evaluation, please run:
python inference.py --name {any_name_for_identity} --test_set /path/to/state0.pdb --ckpt /path/to/checkpoint --save_dir /path/to/saving/results --inf_step 1000 --sde_step 30 --guidance 0.05 --gpu 0
where --name
is only used to create sub-directory under --save_dir
for saving generated trajectories. --test_set
is the path to the initial PDB file you are interested in, --ckpt
specifies the path to the checkpoint of FBM-base or FBM, --inf_step
specifies the trajectory length and --sde_step
indicates discrete-time step $T$. If you use the FBM model for inference, it's required to add --guidance
to specify the guidance strength.
If you want to evaluate any generated trajectories with MD trajectories, please run:
python evaluate.py --top /path/to/state0.pdb --ref /path/to/MD/trajectories --model /path/to/generated/trajectories
Here --top
specifies the .pdb
file that describes the topology, --ref
and --model
specify trajectories generated by MD and the model respectively. We support multiple format for --ref
and --model
, including: .pdb
, .xtc
, .npz
consisting of the key "positions", .npy
.
If you want to inference and evaluate the model on the test set of PepMD (or any other test sets), please run:
python evaluate_all.py --name none --test_set /your/saving/directory/test.jsonl --ckpt /path/to/checkpoint --save_dir /path/to/saving/results --inf_step 1000 --sde_step 30 --guidance 0.05 --gpu 0
MIT
If you find our code useful in your research, please cite the following paper:
@misc{yu2024force,
title={Force-Guided Bridge Matching for Full-Atom Time-Coarsened Dynamics of Peptides},
author={Yu, Ziyang and Huang, Wenbing and Liu, Yang},
year={2024},
eprint={2408.15126},
archivePrefix={arXiv},
}