keiserlab / e3fp

3D molecular fingerprints
GNU Lesser General Public License v3.0

ValueError trying to convert to rdkit fingerprint #23

Closed: pyeguy closed this issue 7 years ago

pyeguy commented 7 years ago

Hi All,

Cool project, looking forward to messing around with these fingerprints. I'm running into an issue when trying to convert an e3fp fingerprint to the rdkit fingerprint format.

Here's a one-liner to reproduce:

from e3fp import pipeline
pipeline.fprints_from_smiles('CCC(C)C(N)C(=O)O','tst_mol')[0].to_rdkit()

which gives me the following traceback:

C:\Anaconda3\envs\e3fp\lib\site-packages\e3fp\fingerprint\fprint.py in to_rdkit(self)
    448
    449         rdkit_fprint = rdkit_fp_type(bits)
--> 450         rdkit_fprint.SetBitsFromList(indices.tolist())
    451         return rdkit_fprint
    452

ValueError: cannot extract desired type from sequence

I'm on Windows 10, Python 3.6, with the following rdkit and boost versions:

boost                     1.59.0                   py36_3    rdkit
rdkit                     2017.03.3           np111py36_1    rdkit

sethaxen commented 7 years ago

Thanks for pointing this out. This appears to be a quirk of rdkit bitvectors: a bitvector may be created with size 2^32 - 1, but only bit indices below 2^31 - 1 may be set. The commit should fix that. Note that this means roughly 50% of indices will be folded to fit into an rdkit bitvector. If you're still experiencing the problem, please reopen the issue.
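
To illustrate what folding means here, the following is a minimal sketch and not the code from the commit. It assumes folding is done by taking each index modulo the usable bitvector length, and it uses the 2^31 - 1 limit quoted above. The names `RDKIT_MAX_BITS` and `fold_indices` are hypothetical and exist only for this example.

```python
import numpy as np

# Assumption from the comment above: only bit indices below 2**31 - 1 can be set.
# RDKIT_MAX_BITS is a hypothetical name, not an rdkit or e3fp constant.
RDKIT_MAX_BITS = 2**31 - 1


def fold_indices(indices, bits=RDKIT_MAX_BITS):
    """Fold "on" bit indices into [0, bits) by modulo and merge collisions."""
    # int64 avoids overflow for indices up to 2**32 - 1, including on Windows.
    indices = np.asarray(indices, dtype=np.int64)
    return np.unique(indices % bits)


# Two indices already fit; the other two lie in the upper half and get folded.
example = [5, 2**31 - 2, 2**31 - 1, 2**32 - 1]
folded = fold_indices(example)
print(folded.tolist())
```

Indices in the upper half of the 32-bit range wrap around onto low positions, which is why about half of uniformly distributed indices would be affected. Folded indices can collide with existing ones, so the folded fingerprint may have fewer set bits than the original.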

pyeguy commented 7 years ago

Nice detective work, that fixed the problem. Thanks!