Note: a more modern implementation (in jax) for more general usage is now online at https://github.com/MLDS-NUS/onsagernet-jax. This current repository is purely for archival purposes.
Code for reproduction of results in manuscript "Constructing Custom Thermodynamics Using Deep Learning".
Datasets are available in the Harvard Dataverse public repository https://doi.org/10.7910/DVN/NRIX7Y
Description of the data contents
config_train_mean_every.pkl
: high-dimensional training data, consisting of 610 trajectories of polymer chains configurations (300 x 3 coordinates) over 1001 time steps (split into 5 files due to size, can be combined with software such as 7-zip)config_test_mean_every.pkl
: high-dimensional test data, consisting of 110 trajectories of polymer chains configurations (300 x 3 coordinates) over 1001 time stepsDNA_data_resnet.pkl
: low-dimensional training and test data (3-dimensional)ex_train.pkl
: chain extension (Z1) for training dataex_test.pkl
: chain extension (Z1) for test dataconfig_fast_mean_test.pkl
: high-dimensional test data, consisting of 500 trajectories (same initial configuration) of polymer chains configurations (300 x 3 coordinates) over 1001 time steps with a fast unfolding timeconfig_middle_mean_test.pkl
: high-dimensional test data, consisting of 500 trajectories (same initial configuration) of polymer chains configurations (300 x 3 coordinates) over 1001 time steps with a middle unfolding time (split into 4 files due to size, can be combined with software such as 7-zip)config_slow_mean_test.pkl
: high-dimensional test data, consisting of 500 trajectories (same initial configuration) of polymer chains configurations (300 x 3 coordinates) over 1001 time steps with a slow unfolding timeDumbbell.tif
: raw experimental video images for molecule that takes on dumbbell configurationDumbbell.png
: processed experimental video images for molecule that takes on dumbbell configurationFolded.tif
: raw experimental video images for molecule that takes on folded configurationFolded.png
: processed experimental video images for molecule that takes on folded configurationconfig_coord.pkl
: high-dimensional data for experimental images of molecules in stretched configurations from literatureX_low1.txt
: low-dimensional data for experimental images of molecules in stretched configurations from Soh et al. (2023)X_low2.txt
: low-dimensional data for experimental images of molecules in stretched configurations from Soh et al. (2018)The code was tested on
(Typical install time: <5 minutes)
Create a virtual environment
virtualenv -p python3 ~/.venvs/<env_name>
Activate the environment
source ~/.venvs/<env_name>/bin/activate
Install required python packages
pip install -r requirements.txt
Before running the codes, download the relevant - Simulation
or Experimental
datasets from https://doi.org/10.7910/DVN/NRIX7Y and save them into the ./data
directory.
(Typical run time: 15 minutes for the default 5000 iterations)
A quick demo of the code is provided in ./demo/
.
This demo is the copy of the code used to obtain the results presented in the paper, but the number of trajectories provided is limited to 250 and 50 for training and testing, respectively. This, along with the default 5000 iterations (this is user defined) ensures that the code can be run on most personal laptop or desktop computers in a short amount of time. The limited amount of data means that the prediction accuracy will differ from that obtained with full dataset, and the number of iterations chosen by user will also affect the quality of the predictions.
Run the demo by issuing
python ./demo/demo_OnsagerNet.py
The code will create an output folder, you can choose the name of this folder. If you do not input a name and press enter in the terminal, a default ./demo_outputs/
folder will be created.
Expected output in the demo output folder are the model checkpoints.
The full reproduction code for the training of Stochastic OnsagerNet is provided ./reproduction/
.
(Typical run time: about 4 hours)
Train the model by running the Jupyter notebook /reproduction/OnsagerNet.ipynb
The model checkpoints are saved to ./reproduction/checkpoints/
(Typical run time: about 10 minites)
To produce main results presented in the paper
./reproduction/prediction/predict_fast.ipynb
./reproduction/prediction/predict_middle.ipynb
./reproduction/prediction/predict_slow.ipynb
./reproduction/physical meaning/physical_meaning.ipynb
./reproduction/potential/potential_Z1Z2.ipynb
./reproduction/potential/potential_Z1Z3.ipynb
./reproduction/potential/potential_Z2Z3.ipynb
For reproducibility, the default loads pre-trained checkpoints in ./reproduction/saved_checkpoints/
. Alternatvely, you may load your trained checkpoints in ./reproduction/checkpoints/
, which may result in slight statistical variations in the results.