The code in this repository accompanies the preprint "Force-Guided Bridge Matching for Full-Atom Time-Coarsened Dynamics of Peptides" by Ziyang Yu, Wenbing Huang and Yang Liu.
Our code works well on Linux with CUDA==11.7. You need to install PyTorch and the other dependencies listed below, matching the CUDA version on your server if necessary.
The PDB file and coordinates of MD trajectories (.xtc format) of AD can be downloaded from mdshare.
To curate our PepMD dataset, first download the pdb_seqres.fasta file from PDB. We provide a script in data/ that downloads and processes the raw PDB data. Please run:
cd data && python --fasta ./pdb_seqres.fasta --save_dir /your/saving/directory
where you should specify a saving directory for the PDB and processed peptide files. If all scripts run successfully, your saving directory will contain summary.jsonl and summary-post.jsonl.
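As a quick sanity check after this step, the two summary files can be opened as JSON Lines (one JSON object per line). The following is a minimal sketch, not part of the repository: the helper names `read_jsonl` and `check_summaries` are hypothetical, and nothing is assumed about the fields stored in each record.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return a list with one parsed JSON object per non-empty line."""
    records = []
    with open(path, "r", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def check_summaries(save_dir):
    """Count the records in each expected summary file (hypothetical helper)."""
    counts = {}
    for name in ("summary.jsonl", "summary-post.jsonl"):
        path = Path(save_dir) / name
        # None marks a missing file, so "absent" is distinguishable from "empty".
        counts[name] = len(read_jsonl(path)) if path.exists() else None
    return counts
```

Calling `check_summaries("/your/saving/directory")` returns a dictionary of record counts, with `None` for any file that was not produced.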
Then MD simulations can be performed with
python --summary /your/saving/directory/summary-post.jsonl --temp 300 --spacing 1000 --gpu 0
where you can specify your own configuration, such as the temperature --temp (unit: Kelvin) and the frame spacing --spacing (unit: fs). The script creates a directory sim under /your/saving/directory, in which each peptide has its own sub-directory named by PDB id. Each sub-directory includes a {PDB-id}_{chain-id}-traj-arrays.npz file containing the coordinates, velocities, forces and energies of the MD trajectory, and a state0.pdb file.
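To see what a simulation produced, the .npz archive can be inspected with NumPy. This is a minimal sketch under stated assumptions: the helper `summarize_npz` is hypothetical, and because the exact array names inside the archive are not documented here, it lists whatever keys it finds instead of assuming any.

```python
import numpy as np


def summarize_npz(path):
    """Map each array name in an .npz archive to its shape (hypothetical helper)."""
    # np.load returns a lazy NpzFile; the context manager closes it afterwards.
    with np.load(path) as data:
        return {key: data[key].shape for key in data.files}
```

For example, `summarize_npz("/your/saving/directory/sim/{PDB-id}/{PDB-id}_{chain-id}-traj-arrays.npz")` returns the stored array names and their shapes, which is a quick way to confirm the number of frames and atoms.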
Now we can curate the dataset with train/test splits:
python --sim_dir /your/saving/directory/sim --delta 500
where you can specify the coarsened time for prediction with --delta (unit: ps). Afterwards, train.jsonl and test.jsonl will be available under /your/saving/directory for training and evaluation.
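Note that --spacing is given in femtoseconds while --delta is given in picoseconds. The sketch below only illustrates the unit conversion between the two; the function `frames_per_step` is hypothetical, and whether the curation script pairs frames exactly this way is an assumption.

```python
def frames_per_step(delta_ps, spacing_fs):
    """Number of saved MD frames spanned by one coarsened step (hypothetical helper)."""
    delta_fs = delta_ps * 1000  # 1 ps = 1000 fs
    if delta_fs % spacing_fs != 0:
        # Assumption: a coarsened step should cover a whole number of saved frames.
        raise ValueError("delta must be an integer multiple of the frame spacing")
    return delta_fs // spacing_fs


# With the values used in the commands above: 500 ps / 1000 fs = 500 frames.
print(frames_per_step(500, 1000))  # 500
```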
Before running training scripts, first compile TorchMD extensions with:
python build_ext --inplace
Then you can use the script for training both FBM-base and FBM with multiple GPUs. Note that you should first replace DATA_DIR in the file with /your/saving/directory. You can run the following script to train FBM-base on GPUs 0 and 1:
GPU=0,1 bash
To train FBM, change the configuration --model_type bbm to --model_type fbm and add another line with --baseline /path/to/FBM-base/checkpoint, replacing the path with the checkpoint file (.ckpt) of FBM-base.
We provide different evaluation scripts for different use cases.
If you only want to generate trajectories without evaluation, please run:
python --name {any_name_for_identity} --test_set /path/to/state0.pdb --ckpt /path/to/checkpoint --save_dir /path/to/saving/results --inf_step 1000 --sde_step 30 --guidance 0.05 --gpu 0
where --name is only used to create a sub-directory under --save_dir for saving the generated trajectories, --test_set is the path to the initial PDB file you are interested in, --ckpt specifies the path to the checkpoint of FBM-base or FBM, --inf_step specifies the trajectory length and --sde_step indicates the number of discrete time steps $T$. If you use the FBM model for inference, you must also add --guidance to specify the guidance strength.
If you want to evaluate generated trajectories against MD trajectories, please run:
python --top /path/to/state0.pdb --ref /path/to/MD/trajectories --model /path/to/generated/trajectories
Here --top specifies the .pdb file that describes the topology, while --ref and --model specify the trajectories generated by MD and by the model respectively. We support multiple formats for --ref and --model, including .pdb, .xtc, .npz (containing the key "positions") and .npy.
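If you prepare trajectories yourself, the two NumPy formats can be read as sketched below. This is an illustration and not the repository's loader: the function `load_positions` is hypothetical, it covers only .npz and .npy, and the array layout (for instance frames × atoms × 3) is an assumption that is not checked here.

```python
from pathlib import Path

import numpy as np


def load_positions(path):
    """Load a coordinate array from a .npy or .npz trajectory file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".npy":
        # A .npy file stores the coordinate array directly.
        return np.load(path)
    if suffix == ".npz":
        with np.load(path) as data:
            # The .npz archive is expected to carry the key "positions".
            if "positions" not in data.files:
                raise KeyError(f'"positions" not found in {path}')
            return data["positions"]
    raise ValueError(f"unsupported trajectory format: {suffix}")
```

Reading .pdb or .xtc trajectories requires a molecular I/O library and is left out of this sketch.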
If you want to run inference and evaluate the model on the test set of PepMD (or any other test set), please run:
python --name none --test_set /your/saving/directory/test.jsonl --ckpt /path/to/checkpoint --save_dir /path/to/saving/results --inf_step 1000 --sde_step 30 --guidance 0.05 --gpu 0
If you find our code useful in your research, please cite the following paper:
title={Force-Guided Bridge Matching for Full-Atom Time-Coarsened Dynamics of Peptides},
author={Yu, Ziyang and Huang, Wenbing and Liu, Yang},