aspuru-guzik-group / chemical_vae

Code for 10.1021/acscentsci.7b00572, now running on Keras 2.0 and Tensorflow
Apache License 2.0
470 stars, 178 forks

I am working on a new version of this project #59

Closed by ghsanti 2 weeks ago

ghsanti commented 2 weeks ago

The project was deleted. You can find an interesting fork here: https://github.com/KnightTec/chemical_vae

FaSalih commented 2 weeks ago

You're a life saver! I was about to lose weeks off my life trying to upgrade all of this to a newer TensorFlow version.

ghsanti commented 2 weeks ago

Great. It's a complex environment, so let me know if something fails. Also, I removed TerminalGRU for the moment, @FaSalih. In practical terms, this means you may get lower accuracy, but you will probably benefit from many algorithmic improvements in the newer package versions.

FaSalih commented 2 weeks ago

@ghsanti I am having a little trouble with the required Python version. I get the following error when I try to run either of the poetry commands:

(cvae) [fsalih@crcfe02 cvae]$ poetry install --without dev --sync # unless you want dev deps.

Current Python version (3.11.9) is not allowed by the project (~3.12.4).
Please change python executable via the "env use" command.
(cvae) [fsalih@crcfe02 cvae]$ poetry install --only-main

Current Python version (3.11.9) is not allowed by the project (~3.12.4).
Please change python executable via the "env use" command.

In the environment creation command you specify a Python version below 3.12, but the pyproject.toml file specifies python = "~3.12.4" under [tool.poetry.dependencies].

What's the right Python version? And do I need these poetry commands?

ghsanti commented 2 weeks ago

tl;dr: Try installing Python 3.12 instead (using conda or mamba).

You don't necessarily need poetry. The key is to have the dependencies listed in pyproject.toml, which means you can also install the project with pip. I use poetry to keep track of that process.
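For context, Poetry reads the tilde constraint `~3.12.4` as "at least 3.12.4 and below 3.13.0", which is why 3.11.9 is rejected. Below is a minimal sketch of that check; `satisfies_tilde` is a hypothetical helper written for illustration, not part of Poetry, and it only handles a three-part base version.

```python
import sys


def satisfies_tilde(version, base):
    """Return True if `version` satisfies the tilde constraint `~base`.

    Assumes a three-part base such as (3, 12, 4), for which `~3.12.4`
    means >=3.12.4 and <3.13.0. Hypothetical helper, not Poetry's API.
    """
    major, minor, patch = base
    lower = (major, minor, patch)
    upper = (major, minor + 1, 0)
    return lower <= tuple(version[:3]) < upper


required = (3, 12, 4)  # from python = "~3.12.4" in pyproject.toml
current = tuple(sys.version_info[:3])

if satisfies_tilde(current, required):
    print("Python", current, "is allowed by the project")
else:
    print("Python", current, "is not allowed; install 3.12.x (x >= 4)")
```

Running this inside the activated environment shows whether the interpreter on the path is the one the project expects.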

Edit: the notebooks do run with some modifications, but they need a much larger noise value to produce any new molecule. I'm not sure why yet, or whether it's a bug in my code.

ghsanti commented 2 weeks ago

@FaSalih I ran a first test on the notebook. I just uploaded the trained model, so you should not need to train it to test.

Feel free to open an issue in the repo with any questions; I won't spam here anymore :-)

This is what I get (there are still things I'm unsure about, but the result makes sense in my opinion):

image

And the distances from 'O=C1Cc2[nH]ccc2N1':

               smiles   distance  count  frequency  \
0   O=C1Cc2[nH]ccc2N1   0.000004    359   0.637655   
1   O=C1Cc2[nH]ccc2C1   1.400370      4   0.007105   
2   O=C1Nc2[nH]ccc2N1   1.667291      1   0.001776   
3   O=C1Cc2[nH]ccc2O1   2.085738     53   0.094139   
4   O=C1Cc2[nH]ncc2N1   2.117460      2   0.003552   
5    O=C1Cc2[nH]ccc21   3.644159    128   0.227353   
6    O=c1Cc2[nH]ccc21   3.971362      2   0.003552   
7    O=C1Oc2[nH]ccc21   4.014803      1   0.001776   
8    O=C1Cc2[nH]cnc21   4.133598      2   0.003552   
9    O=C1Cc2[nH]ncc21   4.218965      1   0.001776   
10  O=C1C=2[nH]ccc2C1   4.328512      1   0.001776   
11   O=C1Ccc[nH]cccn1   4.531935      1   0.001776   
12  O=c1oc2[nH]ccc2O1   5.580686      1   0.001776   
13   O=C1CcC[nH]cccc1   5.901025      1   0.001776   
14  O=C1CC2[nH]cc2CC1   6.280135      1   0.001776   
15    O=C1Ocn[nH]cc1O   7.617425      1   0.001776   
16   O=CNCC1[n-]cccc1   8.793110      1   0.001776   
17    O=C1Cc2nonccc21   9.234937      1   0.001776   
18   O=C1CC=Cn2nccc21  10.604646      1   0.001776   
19   O=C1cc[nH]c(O)c1  11.585962      1   0.001776
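The `frequency` column above looks like each SMILES string's `count` divided by the total number of decoded samples: the counts shown sum to 563, and 359 / 563 ≈ 0.637655. A minimal sketch of that calculation follows, assuming a plain list of decoded SMILES strings; `frequency_table` is a hypothetical helper and not taken from the notebook.

```python
from collections import Counter


def frequency_table(decoded_smiles):
    """Return (smiles, count, frequency) rows, most common first.

    Frequency is count divided by the total number of decoded samples.
    Hypothetical helper; the notebook may compute this differently.
    """
    counts = Counter(decoded_smiles)
    total = sum(counts.values())
    return [(s, c, c / total) for s, c in counts.most_common()]


# Rebuild a sample list from the three largest counts in the table;
# the remaining 23 samples are lumped under a placeholder label.
decoded = (
    ["O=C1Cc2[nH]ccc2N1"] * 359
    + ["O=C1Cc2[nH]ccc21"] * 128
    + ["O=C1Cc2[nH]ccc2O1"] * 53
    + ["(other)"] * 23
)

for smiles, count, freq in frequency_table(decoded):
    print(f"{smiles:>20} {count:5d} {freq:.6f}")
```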