datamol-io / graphium

Graphium: Scaling molecular GNNs to infinity.
https://graphium-docs.datamol.io/
Apache License 2.0

Added and integrated C++ graphium_cpp library, a Python module implem… #510

Closed DomInvivo closed 3 months ago

DomInvivo commented 5 months ago

Implemented in C++ for featurization and preprocessing optimizations, along with a few other optimizations, significantly reducing memory usage, disk usage, and processing time for large datasets.

Changelogs


Authors: Most changes from @ndickson-nvidia, with some minor adjustments from @DomInvivo

Discussion related to this PR

This PR will make Graphium much faster and unlock new uses of positional encodings, since computing them will no longer be a bottleneck. The SMILES -> PyG graph + positional encodings conversion will now be done directly during data loading.
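To illustrate what featurizing during data loading means, here is a minimal pure-Python sketch. It is not Graphium's or `graphium_cpp`'s actual API: `LazySmilesDataset` and `toy_featurize` are hypothetical names, and `toy_featurize` is a placeholder that does no real chemistry. In the PR itself, the SMILES -> PyG graph + positional encodings step is handled by the C++ module.

```python
from typing import Any, Callable, Dict, List


class LazySmilesDataset:
    """Hypothetical dataset that stores only SMILES strings.

    Each item is featurized when it is accessed, so featurized graphs
    are never held in memory or cached on disk ahead of time.
    """

    def __init__(self, smiles: List[str], featurize: Callable[[str], Any]):
        self.smiles = smiles
        self.featurize = featurize

    def __len__(self) -> int:
        return len(self.smiles)

    def __getitem__(self, idx: int) -> Any:
        # Featurization happens here, at data-loading time.
        return self.featurize(self.smiles[idx])


# Counter used only to show that featurization is lazy.
calls = {"count": 0}


def toy_featurize(smiles: str) -> Dict[str, Any]:
    """Placeholder featurizer (assumption, NOT real chemistry).

    Treats every character as a node and links consecutive characters.
    A real pipeline would parse the SMILES string and compute
    positional encodings instead.
    """
    calls["count"] += 1
    num_nodes = len(smiles)
    edges = [(i, i + 1) for i in range(num_nodes - 1)]
    return {"smiles": smiles, "num_nodes": num_nodes, "edges": edges}


dataset = LazySmilesDataset(["CCO", "CCC", "CCCC"], toy_featurize)
item = dataset[0]  # only this one molecule is featurized
```

With this layout only the SMILES strings stay resident. The cost of featurization moves into each item access, which is why a fast featurizer matters once this step runs inside the data loader.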