Closed: @hyeonahkimm closed this issue 2 months ago.
Hello @hyeonahkimm,
The contents of the `reinvent-benchmarking` repo are the ones used in the manuscript.
Gary
Thanks for the quick response.
When I run `example.py`, there is no error, but it does not finish. I found that the following computation has an issue (in `pce.get_properties`):
```python
command = 'CHARGE={};xtb {} --opt normal -c $CHARGE --iterations 4000 > out_dump'.format(charge, 'crest_best.xyz')
system(command)
```
I have a similar issue in `tadf` (in `tadf.xtb()`): the process does not finish.
I might have missed some settings related to xtb (I set the environment variable `XTBHOME` following the local installation guideline in `docs/getting_started.rst`).

For the docking and reactivity tasks, I've encountered the following errors.
FYI, during docking evaluation, the `lig` files are properly generated and removed, while the `pose` files are not generated.
I tested `1syh` and `6y2f` with `qvina` on a Linux machine (Ubuntu 22.04), and I ran `chmod 777` on `tartarus/data/qvina` and `smina`.
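A quick way to confirm that the bundled binaries are actually runnable before launching a full docking evaluation is to check the execute bit programmatically. A minimal sketch; the helper name is mine, not from the repo:

```python
import os

def check_executable(path):
    """Verify that a bundled binary (e.g. qvina or smina) exists and is executable."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f'{path} not found')
    if not os.access(path, os.X_OK):
        raise PermissionError(f'{path} is not executable; try: chmod +x {path}')
    return True

# Hypothetical usage:
# check_executable('tartarus/data/qvina')
```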
I figured out the issue in `pce` and `tadf`: it was caused by a wrong crest installation.
`crest_best.xyz` was empty before, but now it is properly generated.
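A failure mode like this (an empty `crest_best.xyz` silently feeding the downstream xtb step) can be caught early with a small guard. A sketch, with the filename taken from the snippet above; the function name is my own:

```python
import os

def check_xyz(path='crest_best.xyz'):
    """Fail fast if the geometry file is missing or empty.

    An empty file here usually means crest failed silently,
    e.g. due to a broken installation.
    """
    if not os.path.exists(path):
        raise FileNotFoundError(f'{path} was not produced; check the crest installation')
    if os.path.getsize(path) == 0:
        raise ValueError(f'{path} is empty; crest likely failed silently')
    return path
```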
Now they take 212 s and 22 s for a single evaluation (`example.py`):
PCE1: -3.908
PCE2: -7.605
Singlet-triplet: -0.840
Oscillator strength: 0.019
Combined obj: -2.161
Hi @hyeonahkimm,
I highly recommend reviewing the Methods Overview section of the manuscript. Specifically, this paragraph is important; however, please read the entire section for further clarification:
When using TARTARUS, the following procedures should be adopted to obtain benchmark results
that are consistent with the ones provided herein. The first step for running one of the benchmarks, if
necessary, is to train the generative model on the provided dataset. For all the ML models, we used
the first 80% of the reference molecules for training and the remaining 20% for hyperparameter
optimization. Then, the (trained) model is tasked with proposing structures to be evaluated by the
objective function of the corresponding benchmark task. Notably, structure optimization was always
initiated using the best reference molecule from the corresponding dataset. For the benchmarks
concerned with designing photovoltaics, organic emitters, and protein ligands, structure optimization
was carried out with a population size of 500 and a limit of 10 iterations, leading to a maximum
number of 5,000 proposed compounds overall. For the design of chemical reaction substrates, we
used the same maximum number of proposed compounds but used a population size of 100 and
limited the number of iterations to 50 instead. Additionally, the associated run time was limited to
24 hours, which resulted in termination for several molecular design runs before reaching 5,000
molecule evaluations. Furthermore, to increase robustness and reproducibility of our results, we
repeated each optimization run five times, allowing us to report the corresponding outcomes with both
an average and a standard deviation. We believe that this resource-constrained comparison approach
is necessary for fairly comparing methods and should be used as a standard by the community. A
detailed account of the parameters and settings used for running each of the models is provided in the
Computational Details section of the Supporting Information.
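The ordered 80/20 split described in the quoted paragraph can be sketched as follows. This is a minimal illustration of the procedure as stated (first 80% for training, remaining 20% for hyperparameter optimization); TARTARUS's own training scripts may differ in detail:

```python
def split_dataset(molecules, train_frac=0.8):
    """Split an ordered list of reference molecules as described in the manuscript:
    the first 80% for training, the remaining 20% for hyperparameter optimization.
    No shuffling: the original dataset order is kept.
    """
    cut = int(len(molecules) * train_frac)
    return molecules[:cut], molecules[cut:]
```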
Additionally, it seems that the calculations are being performed during training. Please correct me if I am mistaken. Note that this approach is incorrect and can significantly increase the training time for any of the models. The evaluations (with calls to the specific tasks) are only meant to happen after completion of the training procedure, as highlighted in the manuscript.
Regarding the docking objective, I strongly suspect that the QuickVina2 executable is not working or that the molecule generated is extremely unstable/infeasible, preventing successful calculations. Could you please try the following:
Thanks for all the help.
**Training process**
Thanks for sharing the guidelines. To run REINVENT, I'm using the pretraining code from `reinvent-benchmarking` with the provided dataset, and it seems to follow the same procedure (80% for training, 20% for validation).
The results in my previous comment were obtained by running the provided example file, not from training.
**Docking objectives**
I tried to run `./qvina` directly and found a file-path issue (this error was not printed when I ran the provided `example.py` because of exception handling).
I addressed the error by changing the receptor file path in `docking.py`:
`./docking_structures/1syh/prot.pdbqt` -> `./tartarus/docking_structures/1syh/prot.pdbqt`
Now I can see that pose files are properly generated (and removed) and scores are returned:
C1=NC2=C(N=C1)C(=CC=N2)C1=CC=NC2=C1N=CN=C2
qvina 1syh docking score: -5.9
smina 1syh docking score: -5.5
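A path fix like the one above only works for one particular working directory. A more robust alternative is to resolve the receptor path relative to the module file itself, so the lookup succeeds regardless of where the script is launched. A sketch; the helper name is mine, not from `docking.py`:

```python
import os

def receptor_path(pdb_id, base_dir=None):
    """Build an absolute path to a receptor file, independent of the caller's cwd.

    base_dir defaults to the directory containing this module, so the
    lookup works no matter where the script is launched from.
    """
    if base_dir is None:
        base_dir = os.path.dirname(os.path.abspath(__file__))
    return os.path.join(base_dir, 'docking_structures', pdb_id, 'prot.pdbqt')
```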
I appreciate your quick and kind responses.
Hi,
I have tried to reproduce the experiments in the paper using REINVENT, based on the Jupyter notebook and code from the provided `reinvent-benchmarking` GitHub repo. Since minor errors occurred, I slightly modified the code (mostly related to the custom scoring function).
Nevertheless, I still have issues running the code (I only changed the batch size from 500 to 100).
P.S. I tested using an AMD EPYC 7542 32-Core Processor (128 CPUs).
It would be greatly helpful if you could share the code to reproduce the results in the paper.
Thanks,