wengong-jin / hgraph2graph

Hierarchical Generation of Molecular Graphs using Structural Motifs
MIT License

PicklingError("Can't pickle <class 'Boost.Python.ArgumentError'>: import of module 'Boost.Python' failed") #33

orubaba closed this issue 1 year ago

orubaba commented 2 years ago

Hi experts, kindly help me with a solution to this error. I want to generate the vocabulary for my transition metal complexes dataset, as shown below:

(/mnt/c/Users/User/Desktop/mol-generation/env) aorubuloye@ORUBULOYE-PC:/mnt/c/Users/User/Desktop/mol-generation/hgraph2graph$ python get_vocab.py --ncpu 16 < data/catalystchem/all.txt > vocab_2.txt

Traceback (most recent call last):
  File "/mnt/c/Users/User/Desktop/mol-generation/hgraph2graph/get_vocab.py", line 32, in <module>
    vocab_list = pool.map(process, batches)
  File "/mnt/c/Users/User/Desktop/mol-generation/env/lib/python3.9/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/mnt/c/Users/User/Desktop/mol-generation/env/lib/python3.9/multiprocessing/pool.py", line 771, in get
    raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x7f38017d2d00>'. Reason: 'PicklingError("Can't pickle <class 'Boost.Python.ArgumentError'>: import of module 'Boost.Python' failed")'

finlayiainmaclean commented 2 years ago

I fixed this by changing line 5 in get_vocab.py to: from multiprocessing.dummy import Pool
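
A plausible reason this helps, sketched under assumptions: multiprocessing.dummy.Pool has the same API as multiprocessing.Pool but is backed by threads, so worker results and exceptions are handed back in-process and never pickled. The original exception then reaches the caller instead of being masked by a PicklingError. The names UnpicklableError, process, and run below are hypothetical stand-ins, not code from get_vocab.py.

```python
from multiprocessing.dummy import Pool  # thread-backed pool, same interface as multiprocessing.Pool


class UnpicklableError(Exception):
    """Hypothetical stand-in for an exception type that cannot be pickled."""

    def __reduce__(self):
        # Pickling this exception would fail; a thread pool never tries to.
        raise TypeError("cannot pickle this exception")


def process(batch):
    # Hypothetical worker: fails on a bad record, otherwise returns a set of items.
    for item in batch:
        if item == "bad":
            raise UnpicklableError(f"could not process {item!r}")
    return set(batch)


def run(batches, ncpu=2):
    # pool.map re-raises the first worker exception in the calling thread.
    with Pool(ncpu) as pool:
        return pool.map(process, batches)


try:
    run([["a", "b"], ["bad"]])
except UnpicklableError as exc:
    caught = exc  # the original exception, not a MaybeEncodingError
    print("worker failed with:", repr(caught))
```

Note that this swap does not remove the underlying failure; it only makes the real exception visible. Threads also share the interpreter lock, so CPU-bound work may run slower than with processes.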

max-unfried commented 2 years ago

I had this issue too. What worked for me (I don't know why) was changing the file format: I had originally saved the file as UTF-16 Unicode Text (.txt). Re-saving it as Tab-delimited Text (.txt) made the error go away.
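
For anyone who wants to check this without a spreadsheet program, here is a minimal sketch that looks for a byte-order mark and re-saves the file as UTF-8. The function names detect_bom and convert_to_utf8 are made up for illustration, and the check only recognizes files that actually start with a BOM; a BOM-less UTF-16 file would not be detected.

```python
import codecs


def detect_bom(path):
    """Return a codec name if the file starts with a known byte-order mark, else None."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    # UTF-32 is checked first because its little-endian BOM begins with the UTF-16 one.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return None


def convert_to_utf8(src, dst):
    """Re-save src as BOM-free UTF-8 at dst; assumes plain UTF-8 when no BOM is found."""
    encoding = detect_bom(src) or "utf-8"
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)
    return encoding
```

Calling convert_to_utf8 on the SMILES file and passing the converted copy to get_vocab.py would be one way to rule out an encoding problem.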

orubaba commented 1 year ago

I had this issue too. What worked for me (I don't know why) was changing the file format: I had originally saved the file as UTF-16 Unicode Text (.txt). Re-saving it as Tab-delimited Text (.txt) made the error go away.

JonathanBroadbent commented 10 months ago

Hi @orubaba and @max-unfried, would you mind running file -i <input_smiles.txt> and commenting with the encoding of your input text files? I saved my input file as a tab-delimited text file (.txt) yet still get the same error.

My encoding is charset=us-ascii

orubaba commented 10 months ago

ligand_mini.txt: text/plain; charset=us-ascii. @JonathanBroadbent

JonathanBroadbent commented 10 months ago

Thanks Adeshina,

I was able to debug my issue. It wasn't an encoding error; I had an incorrect SMILES string in my dataset. As a fix, I added this to mol_graph.py at line 137:

if mol is None:
    # Stop with a readable message that names the offending SMILES.
    raise Exception(f"Malformed SMILES string in dataset:\n{self.smiles}")
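
The same check can be run over the whole dataset before calling get_vocab.py. This is a rough sketch, assuming one SMILES string per line and that RDKit is installed; find_bad_smiles is a hypothetical helper, not part of hgraph2graph. The parse argument is injectable so the logic can be exercised without RDKit; by default it uses RDKit's Chem.MolFromSmiles, which returns None for strings it cannot parse.

```python
def find_bad_smiles(lines, parse=None):
    """Return (line_number, smiles) pairs for every line the parser rejects.

    Assumes one SMILES string per line; blank lines are skipped.
    """
    if parse is None:
        # Imported lazily so the helper can be tested with a stub parser.
        from rdkit import Chem
        parse = Chem.MolFromSmiles

    bad = []
    for lineno, line in enumerate(lines, start=1):
        smiles = line.strip()
        if not smiles:
            continue
        if parse(smiles) is None:  # None signals an unparseable SMILES
            bad.append((lineno, smiles))
    return bad
```

Passing an open file handle for the dataset, for example find_bad_smiles(open("all.txt")), would list the lines to fix or drop.
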

orubaba commented 10 months ago

Glad you found a way. Seeing this, I remember now that I had to check my SMILES too. Some had a "." between the letters, denoting two SMILES strings on the same line; e.g., CN1CCN(C)CCN(CC1)C.CN1CCN(C)CCN(CC1)C would give an error.
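
A quick way to spot such lines is a plain string check, since "." is the SMILES separator for disconnected components. The sketch below assumes one SMILES string per line; find_multi_component and split_components are hypothetical helper names. Splitting on "." is only safe when no ring-closure digits are shared across the dot, and whether to split or drop such entries depends on the dataset.

```python
def split_components(smiles):
    """Split a dot-separated SMILES into its components, dropping empty pieces."""
    return [part for part in smiles.split(".") if part]


def find_multi_component(lines):
    """Return (line_number, components) for every line that contains a '.' separator."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        smiles = line.strip()
        if "." in smiles:
            hits.append((lineno, split_components(smiles)))
    return hits


example = ["CCO\n", "CN1CCN(C)CCN(CC1)C.CN1CCN(C)CCN(CC1)C\n"]
for lineno, parts in find_multi_component(example):
    print(f"line {lineno}: {len(parts)} components")
```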