aspuru-guzik-group / chemical_vae

Code for 10.1021/acscentsci.7b00572, now running on Keras 2.0 and Tensorflow
Apache License 2.0

errors when run examples/intro_to_chemvae.ipynb #1

Open fazhiyang opened 6 years ago

fazhiyang commented 6 years ago

When I run the "Decode several attempts" part of intro_to_chemvae.ipynb, I get the following errors:

Searching molecules randomly sampled from 5.00 std (z-distance) from the point
Found 0 unique mols, out of 0
SMILES
Series([], Name: smiles, dtype: object)

AttributeError                            Traceback (most recent call last)
D:\Anaconda3\envs\chemvae\lib\site-packages\PIL\ImageFile.py in _save(im, fp, tile, bufsize)
    481     try:
--> 482         fh = fp.fileno()
    483         fp.flush()

AttributeError: '_idat' object has no attribute 'fileno'

During handling of the above exception, another exception occurred:

SystemError                               Traceback (most recent call last)
D:\Anaconda3\envs\chemvae\lib\site-packages\IPython\core\formatters.py in __call__(self, obj)
    334             method = get_real_method(obj, self.print_method)
    335             if method is not None:
--> 336                 return method()
    337             return None
    338         else:

D:\Anaconda3\envs\chemvae\lib\site-packages\PIL\Image.py in _repr_png_(self)
    655         from io import BytesIO
    656         b = BytesIO()
--> 657         self.save(b, 'PNG')
    658         return b.getvalue()
    659

D:\Anaconda3\envs\chemvae\lib\site-packages\PIL\Image.py in save(self, fp, format, **params)
   1928
   1929         try:
-> 1930             save_handler(self, fp, filename)
   1931         finally:
   1932             # do what we can to clean up

D:\Anaconda3\envs\chemvae\lib\site-packages\PIL\PngImagePlugin.py in _save(im, fp, filename, chunk)
    819
    820     ImageFile._save(im, _idat(fp, chunk),
--> 821                     [("zip", (0, 0)+im.size, 0, rawmode)])
    822
    823     chunk(fp, b"IEND", b"")

D:\Anaconda3\envs\chemvae\lib\site-packages\PIL\ImageFile.py in _save(im, fp, tile, bufsize)
    488             if o > 0:
    489                 fp.seek(o, 0)
--> 490             e.setimage(im.im, b)
    491             if e.pushes_fd:
    492                 e.setfd(fp)

SystemError: tile cannot extend outside image

<PIL.Image.Image image mode=RGBA size=1000x0 at 0x2100EA6F128>

beangoben commented 6 years ago

Hi, this error occurs because no molecules were decoded. When you then try to plot the resulting molecules (none), the plot has zero height and you get an error.

The failure to decode could be due to the z-vector, the SMILES, or the number of decoding attempts. What z-vector/molecule are you trying to decode?
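A minimal sketch of guarding the plotting step so that an empty result does not reach the image writer. The helper name `draw_if_any` and the `draw_fn` parameter are hypothetical and not part of chemical_vae; the idea is only to skip drawing when nothing was decoded.

```python
def draw_if_any(mols, draw_fn):
    """Call draw_fn(mols) only when there is at least one molecule to draw.

    mols    : list of molecule objects; None entries (failed parses) are dropped
    draw_fn : any callable that turns a non-empty list of molecules into an image
              (hypothetical parameter, supplied by the caller)
    """
    mols = [m for m in mols if m is not None]
    if not mols:
        # Nothing decoded: report it instead of producing a zero-height image.
        print("No valid molecules decoded; skipping the plot.")
        return None
    return draw_fn(mols)
```

With RDKit installed, this could be called as `draw_if_any(mols, Draw.MolsToGridImage)` after `from rdkit.Chem import Draw`, assuming `mols` holds the decoded RDKit molecules.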

fazhiyang commented 6 years ago

Thank you! I came to the same conclusion, because I saw this on my command line:

[10:45:10] SMILES Parse Error: syntax error for input: 'CC1ccc(c1cc@NH2)N[3H]2(CCO)C'
[10:45:10] SMILES Parse Error: syntax error for input: 'Ccccccc(=O)[NHOH+]3CCC@=O- C'
[10:45:10] SMILES Parse Error: syntax error for input: 'CCc1cN(=O=N+2(C[OC[NCO)n21)'
[10:45:10] SMILES Parse Error: syntax error for input: 'Cc1C(=(=C@)[C@H]1Nc1)1 ='

I selected a few SMILES at random from the CSV file; the two I used here are CC(=O)c1c(O)cccc1COc1ccccc1 and CCCN(CC)c1cc[nH+]c(C(=O)[O-])c1. I also tried to find the midpoint between the z-vectors of the two SMILES, but the output was OC(C1cc(c(O)cccccc(F)F)c1c1, which is obviously an invalid SMILES. What should I do to get more valid SMILES output? Thanks!
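One possible reading of "the midpoint between the z-vectors" is plain linear interpolation between the two encoded latent vectors. The sketch below uses toy 3-dimensional lists in place of real encoder output; the function name `interpolate` is hypothetical and not taken from chemical_vae.

```python
def interpolate(z_a, z_b, t=0.5):
    """Linear interpolation between two latent vectors; t=0.5 gives the midpoint."""
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


# Toy vectors standing in for the encodings of the two SMILES (assumed values).
z_a = [0.0, 2.0, -1.0]
z_b = [2.0, 0.0, 1.0]
z_mid = interpolate(z_a, z_b)
print(z_mid)  # [1.0, 1.0, 0.0]
```

A single decode of such a midpoint can easily yield an invalid string, so the midpoint is usually just a starting point for repeated decoding attempts.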

beangoben commented 6 years ago

What do you mean by the middle of the z-vector of two SMILES?

To get more valid SMILES, you should increase the number of decoding attempts (how many are you using?) or slightly increase the noise radius (how much are you using?).
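The two knobs mentioned here (number of decoding attempts and noise radius) can be sketched as a small sampling loop. This is an illustration under stated assumptions, not the notebook's implementation: `perturb`, `sample_valid`, `decode_fn`, and `is_valid` are hypothetical names, and the perturbation used is a random offset of fixed Euclidean length.

```python
import math
import random


def perturb(z, noise_norm, rng):
    """Return z shifted by a random offset of Euclidean length noise_norm."""
    direction = [rng.gauss(0.0, 1.0) for _ in z]
    length = math.sqrt(sum(d * d for d in direction))
    if length == 0.0:
        return list(z)
    return [zi + noise_norm * d / length for zi, d in zip(z, direction)]


def sample_valid(z, decode_fn, is_valid, attempts=100, noise_norm=1.0, seed=0):
    """Decode `attempts` perturbed copies of z and keep the unique valid strings.

    decode_fn : callable mapping a latent vector to a SMILES string (assumption)
    is_valid  : callable returning True for a parseable SMILES (assumption)
    """
    rng = random.Random(seed)
    found = set()
    for _ in range(attempts):
        smiles = decode_fn(perturb(z, noise_norm, rng))
        if is_valid(smiles):
            found.add(smiles)
    return sorted(found)
```

With RDKit installed, `is_valid` could be `lambda s: Chem.MolFromSmiles(s) is not None`, since `Chem.MolFromSmiles` returns `None` for strings it cannot parse. Raising `attempts` gives more chances to hit a valid string; changing `noise_norm` changes how far from the original point the samples land.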

muu4649 commented 5 years ago

Hi! I don't know what the middle of the z-vector means. But isn't the z-vector the same as the z-distance? The z-distance is a three-dimensional distance (a Euclidean metric over the three material property values), right?