awslabs / dgl-lifesci

Python package for graph neural networks in chemistry and biology
Apache License 2.0

WeaveAtomFeaturizer performance bottleneck issue with BuildFeatureFactory #205

Closed chajath closed 1 year ago

chajath commented 1 year ago

I'm doing batch featurization, and CPU-time profiling revealed that https://github.com/awslabs/dgl-lifesci/blob/26cff32ec0088631400929e18224811ccdfda217/python/dgllife/utils/featurizers.py#L1190 is a significant bottleneck (>22% of the time).

Looking at the code, this seems like low-hanging fruit: the feature factory could be initialized once in `__init__`, or lazily the first time the featurizer is called.
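
A minimal sketch of the two options, not the actual dgl-lifesci code. `build_feature_factory` is a hypothetical stand-in for the expensive `BuildFeatureFactory` call, and the three featurizer classes and `count_builds` are illustrative names only. The counter just shows how often the factory gets built in each variant.

```python
# Counts how many times the stand-in factory is constructed.
BUILD_COUNTER = {"n": 0}


def build_feature_factory():
    """Hypothetical stand-in for an expensive factory construction."""
    BUILD_COUNTER["n"] += 1
    # The returned "factory" is a trivial callable; real featurization is out of scope here.
    return lambda mol: len(mol)


class PerCallFeaturizer:
    """Rebuilds the factory on every call (the pattern reported as slow)."""

    def __call__(self, mol):
        factory = build_feature_factory()
        return factory(mol)


class EagerFeaturizer:
    """Builds the factory once, at construction time."""

    def __init__(self):
        self._factory = build_feature_factory()

    def __call__(self, mol):
        return self._factory(mol)


class LazyFeaturizer:
    """Builds the factory once, on the first call."""

    def __init__(self):
        self._factory = None

    def __call__(self, mol):
        if self._factory is None:
            self._factory = build_feature_factory()
        return self._factory(mol)


def count_builds(featurizer_cls, mols):
    """Return how many factory builds featurizing `mols` triggers."""
    BUILD_COUNTER["n"] = 0
    featurizer = featurizer_cls()
    for mol in mols:
        featurizer(mol)
    return BUILD_COUNTER["n"]
```

With a batch of N inputs, the per-call variant builds the factory N times, while the eager and lazy variants build it once. The lazy variant also avoids the cost entirely if the featurizer is constructed but never called.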

mufeili commented 1 year ago

Nice catch! Could you open a PR to address that?

chajath commented 1 year ago

Opened a PR: https://github.com/awslabs/dgl-lifesci/pull/206 Please take a look!

mufeili commented 1 year ago

I'm closing the issue as the PR has been merged.