pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Add conversion to and from `RDKit.Molecule` #9452

Closed oiao closed 5 days ago

oiao commented 6 days ago

Currently, the `utils` submodule includes functions that convert between SMILES strings and `Data` objects (SMILES <-> Data).

However, there is value in adding that functionality at the level of rdkit.Chem.Mol <-> Data. The motivation is that users might want to pre-process the chemical structure before conversion. This might include operations such as removing ions or chiral centers, or standardizing tautomers, at the level of the Molecule object.

This PR adds two new functions, from_rdmol and to_rdmol, leaving most of the original functionality in place and redefining the conversion from/to SMILES as an extension of the conversion from/to Molecule objects. As a result, the following chain of conversions is now possible: SMILES <-> rdkit.Chem.Mol <-> Data.
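To illustrate the workflow described above, here is a minimal sketch of pre-processing at the Mol level before converting to Data. It assumes that from_rdmol and to_rdmol are importable from torch_geometric.utils and that to_rdmol can be called with only the Data object. The helper name preprocess_and_convert is hypothetical, and RDKit's SaltRemover is just one possible pre-processing step, chosen for illustration.

```python
def preprocess_and_convert(smiles: str):
    """Strip counter-ions on the Mol object, then convert Mol <-> Data.

    Hypothetical helper for illustration; not part of the PR itself.
    """
    # Imported lazily because RDKit and PyG are optional, heavy dependencies.
    from rdkit import Chem
    from rdkit.Chem.SaltRemover import SaltRemover
    # Assumption: the two new functions are exposed under torch_geometric.utils.
    from torch_geometric.utils import from_rdmol, to_rdmol

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")

    # Pre-processing on the Mol object: remove common counter-ions
    # (e.g. chloride) using RDKit's default salt definitions.
    mol = SaltRemover().StripMol(mol)

    data = from_rdmol(mol)      # rdkit.Chem.Mol -> Data
    mol_back = to_rdmol(data)   # Data -> rdkit.Chem.Mol

    # Return the graph and a SMILES string rebuilt from it.
    return data, Chem.MolToSmiles(mol_back)
```

Calling `preprocess_and_convert("CN(C)C.Cl")` would parse trimethylamine hydrochloride, drop the chloride fragment, and return the graph of the remaining amine together with a SMILES string regenerated from that graph.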

oiao commented 5 days ago

Thanks a lot for the fixes :) Happy to see the pace at which the project matures.

As a side note: I found it difficult to get started with the PR, as there was no CONTRIBUTING.md or other kind of developer docs. Did I perhaps just miss it?

rusty1s commented 5 days ago

See https://github.com/pyg-team/pytorch_geometric/blob/master/.github/CONTRIBUTING.md