aspuru-guzik-group / Tartarus

A Benchmarking Platform for Realistic And Practical Inverse Molecular Design
https://arxiv.org/abs/2209.12487

About the datasets/docking.csv #12

Open Lyu6PosHao opened 1 month ago

Lyu6PosHao commented 1 month ago

Hello, thanks for your great work.

I have a question: both the paper and the tutorial give the number of SMILES in docking.csv as 152296, but the file actually contains 105338.

This is a little confusing. Is it a typo?

By the way, were the scores in docking.csv calculated with mode=qvina or mode=smina? Which mode should I choose?

Thanks

akshat998 commented 1 month ago

Hi @Lyu6PosHao,

Yes, that was a typo! During the revisions, we added additional filters to enhance the synthesizability of the molecules, which led to a decrease in their numbers. It looks like we overlooked updating this number in the manuscript.
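For anyone who wants to check the count locally, here is a minimal sketch using only the Python standard library. The column name `smiles` and the helper name `count_smiles` are assumptions made for illustration, not something confirmed in this thread; adjust the column name to match the actual header of datasets/docking.csv.

```python
import csv


def count_smiles(path, column="smiles"):
    """Return (number of rows, number of distinct values) for one CSV column.

    The default column name "smiles" is an assumption; pass the real
    header name if the file uses a different one.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        values = [row[column] for row in reader]
    return len(values), len(set(values))


# Example call (path taken from the issue title):
#   rows, unique = count_smiles("datasets/docking.csv")
```

Reporting both the row count and the distinct count makes it easy to tell whether a mismatch comes from duplicate entries or from the file simply holding fewer molecules.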

All generative models were run with QVina, and Smina was used to re-score the top molecule.
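The two-stage workflow described above (a fast scorer for every candidate, then a second scorer for the best one) could be sketched roughly as follows. This is not the Tartarus API: `rescore_top`, `fast_score`, and `slow_score` are hypothetical names, and the two scorers are passed in as plain callables standing in for QVina and Smina runs. The sketch assumes that a lower (more negative) docking score is better.

```python
def rescore_top(candidates, fast_score, slow_score):
    """Rank candidates with one scorer, then re-score only the best one.

    fast_score and slow_score are hypothetical callables that map a SMILES
    string to a docking score; lower is assumed to be better.
    """
    # Score each candidate exactly once, since docking runs can be
    # expensive and are not guaranteed to repeat the same value.
    fast_scores = {smi: fast_score(smi) for smi in candidates}
    best = min(fast_scores, key=fast_scores.get)
    return best, fast_scores[best], slow_score(best)
```

Only the top molecule is passed to the second scorer, so the cost of that stage does not grow with the number of candidates.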

Just a quick side note: for the docking tasks, we recommend running calculations for 1SYH and 4LDE. We’re currently re-working 6Y2F since the calculations aren’t very stable :)

Thanks! Akshat

Lyu6PosHao commented 4 weeks ago

I see, thanks a lot!