wengong-jin / hgraph2graph

Hierarchical Generation of Molecular Graphs using Structural Motifs
MIT License

Error raised when running preprocess.py in generation folder #9

Closed by WhatAShot 4 years ago

WhatAShot commented 4 years ago

Hi, I ran preprocess.py in the generation folder and it raised an error:

python preprocess.py --train ../data/polymers/train.txt --vocab ../data/polymers/inter_vocab.txt --ncpu 8 

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/was/.conda/envs/torch14/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/was/.conda/envs/torch14/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "preprocess.py", line 19, in tensorize
    x = MolGraph.tensorize(mol_batch, vocab, common_atom_vocab)
  File "/data/was/capsulegraphvae/hg2g/generation/poly_hgraph/mol_graph.py", line 169, in tensorize
    tree_tensors, tree_batchG = MolGraph.tensorize_graph([x.mol_tree for x in mol_batch], vocab)
  File "/data/was/capsulegraphvae/hg2g/generation/poly_hgraph/mol_graph.py", line 210, in tensorize_graph
    fnode[v] = vocab[attr]
  File "/data/was/capsulegraphvae/hg2g/generation/poly_hgraph/vocab.py", line 43, in __getitem__
    return self.hmap[x[0]], self.vmap[x]
KeyError: 'O=C1NC(=O)C2=C(F)C=C3C(=O)NC(=O)C4=C3C2=C1C=C4F'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "preprocess.py", line 48, in <module>
    all_data = pool.map(func, batches)
  File "/home/was/.conda/envs/torch14/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/was/.conda/envs/torch14/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
KeyError: 'O=C1NC(=O)C2=C(F)C=C3C(=O)NC(=O)C4=C3C2=C1C=C4F'

However, I cannot find the SMILES string "O=C1NC(=O)C2=C(F)C=C3C(=O)NC(=O)C4=C3C2=C1C=C4F" in the text files. Do you know what is happening and how to debug it? Thank you in advance.

WhatAShot commented 4 years ago

Solved: don't use inter_vocab.txt. Re-generate vocab.txt and use that instead, and the error no longer occurs.
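
For anyone hitting the same KeyError: the traceback shows the lookup failing in vocab.py (`self.hmap[x[0]]`), so the missing key appears to be a vocabulary entry built from a training molecule rather than a line of train.txt, which may be why searching the data for it finds nothing. Below is a minimal sketch for checking which entries a vocab file lacks. It assumes each vocab line holds two whitespace-separated SMILES strings (motif, then its attachment-labeled form); that format, the helper names `parse_vocab_lines` and `missing_entries`, and the toy strings are all illustrative assumptions, not the repository's API.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (motif, attachment-labeled motif); assumed format


def parse_vocab_lines(lines: Iterable[str]) -> Set[Pair]:
    """Parse 'motif labeled_motif' lines into a set of pairs.

    Hypothetical helper: assumes two whitespace-separated columns per line
    and silently skips blank or malformed lines.
    """
    pairs: Set[Pair] = set()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            pairs.add((fields[0], fields[1]))
    return pairs


def missing_entries(required: Iterable[Pair], vocab: Set[Pair]) -> List[Pair]:
    """Return the required pairs that are absent from the vocab, sorted."""
    return sorted(pair for pair in set(required) if pair not in vocab)


if __name__ == "__main__":
    # Toy placeholder strings, not real SMILES and not taken from the dataset.
    vocab = parse_vocab_lines(["A A1", "B B1"])
    needed = [("A", "A1"), ("C", "C1")]
    for motif, labeled in missing_entries(needed, vocab):
        print(f"not in vocab: {motif} {labeled}")
```

In practice, the `required` pairs would have to come from whatever the preprocessing step extracts from the training molecules. If any are reported missing, the vocab file was probably built from a different dataset or with different settings, which is consistent with re-generating vocab.txt making the error go away.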