isayev / ASE_ANI

ANI-1 neural net potential with python interface (ASE)
MIT License

ANI-2x training data set #39

Open JMorado opened 3 years ago

JMorado commented 3 years ago

Hi,

How can one find out exactly which data set was used to train ANI-2x?

The original ANI-2x paper states that the training data set is composed of molecules from a variety of sources, including the GDB-11 database, the ChEMBL database, the S66x8 benchmark, and some randomly generated amino acids and dipeptides. However, as I understand it, these data sets are not included in full, because specific sampling techniques are applied to them.

Is it possible to know exactly which molecules were used for training?

Thank you. Best, João