openforcefield / openff-benchmark

Comparison benchmarks between public force fields and Open Force Field Initiative force fields
MIT License

Code refact and merge of old and new analyses #95

Closed ldamore closed 3 years ago

ldamore commented 3 years ago

Description

This PR introduces two additional benchmark analyses, proposed by B. Swope and X. Lucas.

Installation of analysis command group in a new conda environment

First, follow the installation procedure for the openff-benchmark-optimization environment described in the Deployment Procedure document.

Once this is done, you can clone the environment into a new conda environment:

conda create -n openff-analysis --clone openff-benchmark-optimization
conda activate openff-analysis

Install the analysis branch from GitHub:

git clone https://github.com/openforcefield/openff-benchmark
cd openff-benchmark
git checkout --track origin/analysis
pip install -e .

General comments

The two new analyses are typically executed at step (5) of the Optimization Benchmark Protocol.

openff-benchmark report swope runs the analysis proposed by B. Swope. The command takes the paths to the optimized molecules produced by the optimization step. Optionally, the reference method (b3lyp-d3bj by default) and an output directory can be specified.

openff-benchmark report swope --input-path 4-compute-qm-filtered --input-path 4-compute-mm-filtered --ref-method b3lyp-d3bj --output-directory 5-results-swope

The command creates one output CSV file per method, named swope_<method>.csv, e.g. swope_openff-1.0.0.csv.

The analysis proposed by B. Swope operates as follows:

openff-benchmark report lucas runs the analysis proposed by X. Lucas. The command takes the paths to the optimized molecules produced by the optimization step. Optionally, the reference method (b3lyp-d3bj by default) and an output directory can be specified.

openff-benchmark report lucas --input-path 4-compute-qm-filtered --input-path 4-compute-mm-filtered --ref-method b3lyp-d3bj --output-directory 5-results-lucas

The command creates one output CSV file per method, named lucas_<method>.csv, e.g. lucas_openff-1.0.0.csv.
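Given the naming scheme described for both commands, a quick check for missing result files could look like the sketch below. It assumes that every subdirectory of the MM input directory is named after its method, which this PR does not state explicitly; the helper names expected_csv and missing_outputs are hypothetical and not part of openff-benchmark.

```shell
# Sketch only: list the per-method CSV files that are expected but absent.
# Assumption: each subdirectory of the MM input root is named after its method.

expected_csv() {
    # $1 = analysis prefix (swope or lucas), $2 = method name
    printf '%s_%s.csv\n' "$1" "$2"
}

missing_outputs() {
    # $1 = analysis prefix, $2 = MM input root, $3 = results directory
    for method_dir in "$2"/*/; do
        [ -d "$method_dir" ] || continue          # skip an unmatched glob
        method=$(basename "$method_dir")
        csv="$3/$(expected_csv "$1" "$method")"
        [ -f "$csv" ] || echo "$csv"              # print only missing files
    done
}

missing_outputs swope 4-compute-mm-filtered 5-results-swope
missing_outputs lucas 4-compute-mm-filtered 5-results-lucas
```

An empty output means every MM method directory has a matching CSV file; the reference (QM) method is deliberately not checked here.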

The analysis proposed by X. Lucas operates as follows:

The final commands, openff-benchmark report plots-swope and openff-benchmark report plots-lucas, take as input the directory containing the CSV files (5-results-swope or 5-results-lucas, respectively).

For 5-results-swope, both --rmsd-cutoff and --de-cutoff should be set:

openff-benchmark report plots-swope --input-path 5-results-swope/ --de-cutoff 1.5 --rmsd-cutoff 1.0

The command generates a ridge plot of all conformers within the rmsd-cutoff over a range of dE values, and another ridge plot of all conformers within the de-cutoff over a range of rmsd values.

For 5-results-lucas:

openff-benchmark report plots-lucas --input-path 5-results-lucas/

The command generates plots similar to those produced for compare-forcefields and match_minima.

Please note

Like openff-benchmark report match-minima, openff-benchmark report lucas matches conformers by rmsd, and this step is quite time-consuming. However, the intersection method added in this PR allows the user to run the analysis for each FF method as a separate task, which speeds up the whole process, e.g.

for mm_path in 4-compute-mm-filtered/*; do
    # one background job per force-field method
    openff-benchmark report match-minima --input-path 4-compute-qm-filtered \
                                         --input-path "$mm_path" \
                                         --ref-method b3lyp-d3bj \
                                         --output-directory 5-match-minima &
done
wait  # block until every background job has finished

In addition, the QM-to-QM comparison is skipped. Please note that the plot commands now run without specifying the reference method, e.g.

openff-benchmark report plots-lucas --input-path 5-results-lucas/
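The per-method pattern described in this note can be sketched for report lucas as well, reusing only the flags shown earlier in this description. The DRY_RUN switch and the function names run_one and run_all are hypothetical additions for illustration; with DRY_RUN=1 (the default in this sketch) the commands are printed rather than executed, so the loop can be inspected before launching long jobs.

```shell
# Sketch only: one `report lucas` task per force-field directory.
# DRY_RUN, run_one and run_all are illustrative names, not part of openff-benchmark.
qm_dir="4-compute-qm-filtered"
mm_root="4-compute-mm-filtered"
out_dir="5-results-lucas"

run_one() {
    # $1 = path to one MM method directory
    set -- openff-benchmark report lucas \
        --input-path "$qm_dir" \
        --input-path "$1" \
        --ref-method b3lyp-d3bj \
        --output-directory "$out_dir"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"          # print the command instead of running it
    else
        "$@" &             # launch as a background job
    fi
}

run_all() {
    for mm_path in "$mm_root"/*/; do
        [ -d "$mm_path" ] || continue    # skip an unmatched glob
        run_one "${mm_path%/}"           # drop the trailing slash
    done
    wait                                 # block until all background jobs finish
}

run_all
```

Running the script with DRY_RUN=0 starts the real jobs in parallel; the plots-lucas command can then be pointed at 5-results-lucas/ once they have all finished.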

Todos

Questions

Status

codecov-commenter commented 3 years ago

Codecov Report

Merging #95 (19865a4) into season-1 (7b5a34b) will decrease coverage by 2.39%. The diff coverage is 11.05%.