Open Kh4L opened 5 months ago
Hi! Thanks for your question. Basic question: Why is it necessary and how does this affect the model performance?
This change allow us to represent all of the possible hybridization and edgebond types.
The goal is to match the smile converter implem in PyG https://github.com/pyg-team/pytorch_geometric/blob/51c57da51d45c19497d393cc678c215eb341b3b4/torch_geometric/utils/smiles.py#L7
Also for backward compatibility, I'd like to use these additional features given some flag argument.
What kind of flag are you thinking of?
Got it --- matching with PyG and making the graph expressive is good, but we also don't want to disturb others who have built models on the current version, right?
Why can't people just use PyG's smiles2graph
and apply it to PCQMv2 data? You can easily pass your favorite smiles2graph
here.
Change enums to match RDKit Atom enums
HybridizationType
https://www.rdkit.org/docs/cppapi/classRDKit_1_1Atom.html#a58e40e30db6b42826243163175cac976
BondType
https://www.rdkit.org/docs/cppapi/classRDKit_1_1Bond.html#a2c93af0aeb3297ee77b6afdc27b68d6f