DeepGraphLearning / ConfGF

Implementation of Learning Gradient Fields for Molecular Conformation Generation (ICML 2021).
MIT License

The generated conformation is 2D; how can I make sure the output is a proper conformer? #9

Open mw742 opened 1 year ago

mw742 commented 1 year ago

Hello

Thank you for your work; it is useful for generating conformers quickly.

I have a small technical question.

I tried running the default conformer generation with the SMILES string of an arbitrary molecule as input.

I ran the template commands for benzene, "python -u script/gen.py --config_path ./config/qm9_default.yml --generator ConfGF --smiles c1ccccc1" and "python -u script/gen.py --config_path ./config/qm9_default.yml --generator ConfGFDist --smiles c1ccccc1", but the rdmol objects read from the output .pkl files are both 2D and contain no 3D coordinates.

At first, I thought this was because the molecule is benzene, which is flat.

Then I tried another example, CCCC, with "python -u script/gen.py --config_path ./config/qm9_default.yml --generator ConfGF --smiles CCCC". The output is also 2D unless I embed it with RDKit, but embedding changes the coordinates of the output, so the conformer is actually refined by RDKit, as shown below:

(confgf) popzq@GPUSrv:~/桌面/ConfGF-main/Diamond/qm9_default$ python
Python 3.7.16 (default, Jan 17 2023, 22:20:44)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> data = pd.read_pickle('ConfGF_CCCC.pkl')
>>> print(data)
Data(atom_type=[14], bond_edge_index=[2, 26], d_gen=[128, 1], d_recover=[128, 1], edge_index=[2, 128], edge_length=[128, 1], edge_name=[128], edge_order=[128], edge_type=[128], is_bond=[128], num_pos_gen=[1], num_pos_traj=[1], pos=[14, 3], pos_gen=[14, 3], pos_traj=[70000, 3], rdmol=<rdkit.Chem.rdchem.Mol object at 0x7fa28856d8f0>, smiles="CCCC")
>>> from rdkit import Chem
>>> from rdkit.Chem import rdMolDescriptors
>>> rdmol_object = data.rdmol
>>> conf = rdmol_object.GetConformer()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Bad Conformer Id
>>> positions = conf.GetPositions()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'conf' is not defined
>>> AllChem.EmbedMolecule(rdmol_object, randomSeed=42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'AllChem' is not defined
>>> from rdkit import AllChem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'AllChem' from 'rdkit' (/home/popzq/anaconda3/envs/confgf/lib/python3.7/site-packages/rdkit/__init__.py)
>>> from rdkit.Chem import AllChem
>>> AllChem.EmbedMolecule(rdmol_object, randomSeed=42)
0
>>> conf = rdmol_object.GetConformer()
>>> positions = conf.GetPositions()
>>> if positions.shape[1] == 3:
...     print("The molecule has 3D coordinates.")
... else:
...     print("The molecule has 2D coordinates.")
...
The molecule has 3D coordinates.

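A note on the check used in the session above: the `ValueError: Bad Conformer Id` means the molecule holds no conformer at all, and RDKit's `Conformer.GetPositions()` returns an N x 3 array even for a conformer flagged as 2D, so `positions.shape[1] == 3` cannot tell the two cases apart. A minimal sketch of a check based on `GetNumConformers()`/`GetConformers()` and `Conformer.Is3D()` instead is given here. The helper name `describe_conformers` is hypothetical and not part of ConfGF, and butane is built from SMILES only as a stand-in for the pickled molecule.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def describe_conformers(mol):
    """Return one (conformer_id, is_3d) tuple per conformer stored on `mol`."""
    return [(conf.GetId(), conf.Is3D()) for conf in mol.GetConformers()]


# Butane with explicit hydrogens (14 atoms), used here only as a stand-in
# for the rdmol object loaded from the .pkl file.
mol = Chem.AddHs(Chem.MolFromSmiles("CCCC"))
no_conformer = describe_conformers(mol)  # no coordinates stored yet

# Depiction (2D) coordinates: the conformer exists but is flagged as 2D.
AllChem.Compute2DCoords(mol)
flat = describe_conformers(mol)
flat_shape = mol.GetConformer().GetPositions().shape  # still N x 3

# RDKit's own 3D embedding replaces the 2D conformer.
AllChem.EmbedMolecule(mol, randomSeed=42)
embedded = describe_conformers(mol)
```

An empty list from `describe_conformers(data.rdmol)` would indicate that the generated coordinates were never written to the RDKit molecule, rather than that the model produced flat output.
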
Is there any way to get 3D output directly from the original model, instead of using RDKit to generate a conformer? Is there a step I might be missing when running this model?
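
The printed `Data` object lists `pos_gen=[14, 3]` next to an `rdmol` that has no conformer, and butane with hydrogens has 14 atoms. One possible way to obtain a 3D RDKit molecule without re-embedding is therefore to copy those coordinates into a new conformer. The sketch below is a hedged illustration, not the authors' method: it assumes that `pos_gen` holds the generated coordinates in the same atom order as `data.rdmol`, which has not been confirmed against the ConfGF code. `attach_positions` is a hypothetical helper name.

```python
import numpy as np
from rdkit import Chem
from rdkit.Geometry import Point3D


def attach_positions(mol, positions):
    """Return a copy of `mol` whose single conformer holds `positions`.

    `positions` must be array-like with shape (num_atoms, 3); the atom
    order is ASSUMED to match the atom order of `mol`.
    """
    positions = np.asarray(positions, dtype=float)
    num_atoms = mol.GetNumAtoms()
    if positions.shape != (num_atoms, 3):
        raise ValueError(
            f"expected shape ({num_atoms}, 3), got {positions.shape}"
        )

    out = Chem.Mol(mol)          # copy, so the input molecule is untouched
    out.RemoveAllConformers()    # drop any existing 2D/3D coordinates

    conf = Chem.Conformer(num_atoms)
    for idx in range(num_atoms):
        x, y, z = positions[idx]
        conf.SetAtomPosition(idx, Point3D(float(x), float(y), float(z)))
    conf.Set3D(True)             # mark the coordinates as 3D
    out.AddConformer(conf, assignId=True)
    return out
```

With the pickle from the session, usage might look like `mol3d = attach_positions(data.rdmol, data.pos_gen.cpu().numpy())`, assuming `pos_gen` is a tensor holding a single conformer (`num_pos_gen=[1]`). Because no embedding or force-field step is involved, the coordinates stay exactly as stored in `pos_gen`.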