gnina / libmolgrid

Comprehensive library for fast, GPU accelerated molecular gridding for deep learning workflows
https://gnina.github.io/libmolgrid/
Apache License 2.0

RDKit failures if molgrid is imported first #65

Open RMeli opened 3 years ago

RMeli commented 3 years ago

I encountered an odd incompatibility between molgrid and rdkit which seems to depend on the order of import statements. I installed molgrid using pip in a conda environment where rdkit has been installed from the conda-forge channel.


The following snippet works as expected:

from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles('C1CCC1OC')
m2 = Chem.AddHs(m)
cids = AllChem.EmbedMultipleConfs(m2, numConfs=2)

If molgrid is imported before rdkit:

import molgrid
from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles('C1CCC1OC')
m2 = Chem.AddHs(m)
cids = AllChem.EmbedMultipleConfs(m2, numConfs=2)

I get the following failure on the last line (EmbedMultipleConfs call):

TypeError: No to_python (by-value) converter found for C++ type: std::vector<int, std::allocator<int> >

The error does not appear if molgrid is imported after rdkit:

from rdkit import Chem
from rdkit.Chem import AllChem
import molgrid

m = Chem.MolFromSmiles('C1CCC1OC')
m2 = Chem.AddHs(m)
cids = AllChem.EmbedMultipleConfs(m2, numConfs=2)
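Until the root cause is fixed, a script can at least detect the problematic import order at runtime. The following is a minimal sketch, not part of molgrid or rdkit: the helper names `loaded_before` and `warn_if_molgrid_first` are hypothetical, and the check assumes that the key order of `sys.modules` reflects first-import order, which holds for dicts on Python 3.7 and later.

```python
import sys
import warnings


def loaded_before(first, second, modules=None):
    """Return True if `first` was imported before `second`.

    If `first` has not been imported at all, the answer is False.
    If `first` is loaded and `second` is not, the answer is True,
    because any later import of `second` would come after `first`.
    """
    # sys.modules is a dict, so its keys keep first-import order (Python 3.7+).
    names = list(sys.modules if modules is None else modules)
    if first not in names:
        return False
    if second not in names:
        return True
    return names.index(first) < names.index(second)


def warn_if_molgrid_first(modules=None):
    """Warn if molgrid was loaded ahead of rdkit; return True if it warned."""
    if loaded_before("molgrid", "rdkit", modules):
        warnings.warn(
            "molgrid was imported before rdkit; "
            "import rdkit first to avoid the converter TypeError"
        )
        return True
    return False
```

Calling `warn_if_molgrid_first()` near the top of a script, after its imports, flags the ordering that triggers the `TypeError` above. The `modules` argument exists only so the check can be exercised without either package installed.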

Conda environment to reproduce the issue:

name: rdkit-molgrid
channels:
  - conda-forge
  - pytorch
dependencies:
  - python=3.7
  - ipython
  - pip

  - rdkit=2021.03.3
  - cudatoolkit=11.1
  - pytorch

  - pip:
    - molgrid==0.2.1

dkoes commented 3 years ago

Can you check to see if this is still a problem with 0.5.1?

RMeli commented 3 years ago

After upgrading molgrid with python -m pip install -U molgrid, the problem persists. I'll try to rebuild the whole conda environment.

I also ran the same script in a Singularity container where molgrid==0.2.1 and rdkit==2021_03_1 are compiled from source, and the problem does not occur there.
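When comparing a failing environment with a working one, it can help to record the installed package versions in each. The sketch below is one way to do that, under several assumptions: `installed_versions` is a hypothetical helper, `importlib.metadata` requires Python 3.8 or later (the environment above pins 3.7, where the `importlib_metadata` backport would be needed instead), and a conda-installed package may not register metadata under the name used here. It also says nothing about which Boost.Python build a package was compiled against.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if not found."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # No distribution metadata under this name in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The distribution names are assumptions; adjust them to the environment.
    for name, version in installed_versions(["molgrid", "rdkit"]).items():
        print(f"{name}: {version or 'no metadata found'}")
```

Running the script in both environments gives two short reports that can be compared line by line.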

dkoes commented 3 years ago

It may depend on what version of boost-python rdkit was built with.

RMeli commented 3 years ago

Yes, probably a similar problem to #62.

dkoes commented 3 years ago

Not really. #62 is a problem because I am relying on bit-for-bit compatibility with openbabel data structures. The issue here is the way rdkit/boost-python are adding symbols to the Python environment. I'm 80% sure the correct fix involves changing rdkit, not molgrid.