vthost commented 1 year ago

_Originally commented in

😵 Describe the installation problem

I am trying to run SSL code (basically pretrain-gnns)

For the newer PyG (i.e., for both above), I just updated: In chem/

In chem/

I guess I am missing an important update. Profiling seems to indicate that the time is spent in

Thank you in advance!


akihironitta commented 1 year ago

Profiling seems to indicate that the time is spent in

Thanks for sharing the finding. Given that the performance gap is huge, I've started trying to reproduce this on my side to catch all possible causes.

Here're some follow-up questions for repro:

Also, when you get a chance, it'd be nice if you could try reducing num_workers and see whether the performance improves.
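A rough way to run that comparison is to time one full pass over the loader for several `num_workers` values. The sketch below is only an illustration: `time_loader_passes` and `make_loader` are hypothetical names, not part of PyG or of the script discussed here, and the commented usage assumes a `dataset` object already exists.

```python
import time
from typing import Callable, Dict, Iterable, Sequence


def time_loader_passes(
    make_loader: Callable[[int], Iterable],
    worker_counts: Sequence[int],
) -> Dict[int, float]:
    """Time one full iteration over a loader for each num_workers value.

    make_loader is a hypothetical factory: it receives a worker count and
    returns a fresh iterable (for example a DataLoader).
    """
    timings: Dict[int, float] = {}
    for num_workers in worker_counts:
        loader = make_loader(num_workers)
        start = time.perf_counter()
        for _ in loader:
            pass  # only data loading is measured, no model step
        timings[num_workers] = time.perf_counter() - start
    return timings


# Assumed usage with a PyG loader (dataset and batch_size are placeholders):
# from torch_geometric.loader import DataLoader
# timings = time_loader_passes(
#     lambda n: DataLoader(dataset, batch_size=32, num_workers=n),
#     [0, 2, 4, 8],
# )
# for n, seconds in timings.items():
#     print(f"num_workers={n}: {seconds:.2f}s")
```

Because the loop body does nothing, the numbers reflect loading and collation only, which is the part that changes with `num_workers`.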

vthost commented 1 year ago

Thank you for directly getting back to me!

However, I now wanted to create a minimal environment for you to reproduce it. I had used this minimal environment for the PyG 2.2.0 configuration before, but a larger one for PyG 2.3.0. When I tested PyG 2.3.0 in the minimal environment now, it actually ran just as fast. So it seems that another package is interfering. I am posting my full environment at the very end below, after the output of the pytorch script, in case you have any idea where it could come from. I'll also check whether I can find out more.
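To narrow down which package differs between a fast and a slow environment, one option is to diff their package listings. The sketch below assumes each listing was saved as `pip freeze` text; `parse_freeze` and `diff_envs` are hypothetical helper names, and only `name==version` lines are handled (other forms, such as `name @ file://...` entries, are skipped).

```python
from typing import Dict, Tuple


def parse_freeze(text: str) -> Dict[str, str]:
    """Parse `pip freeze`-style text into {package_name: version}."""
    packages: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything not in name==version form.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def diff_envs(fast: str, slow: str) -> Dict[str, Tuple[str, str]]:
    """Return packages whose version differs or that exist in only one env.

    Values are (version_in_fast, version_in_slow); "<absent>" marks a
    package missing from that environment.
    """
    a, b = parse_freeze(fast), parse_freeze(slow)
    return {
        name: (a.get(name, "<absent>"), b.get(name, "<absent>"))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Assumed usage, with file names chosen only for illustration:
# with open("fast_env.txt") as f, open("slow_env.txt") as g:
#     for name, (fast_v, slow_v) in diff_envs(f.read(), g.read()).items():
#         print(f"{name}: fast={fast_v} slow={slow_v}")
```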

akihironitta commented 1 year ago

I ran the same script, and with the versions of the other libraries fixed, 2.3.0 takes 110% of the time 2.2.0 takes, not the 500% you originally reported in the description. I will still try to investigate the performance difference via #7795 to catch both past and future regressions, but I'm closing this issue since you mentioned that 2.3.0 runs as fast as 2.2.0.

vthost commented 1 year ago

Sorry for bothering you, but the discussion helped! It seems to be a problem with rdkit. I used the conda version instead of pip's rdkit-pypi. Installing the former replaces python with cpython; this may be the main problem, but I'll stop investigating here.
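For anyone comparing the two setups, a small report of the active interpreter and the installed rdkit distribution can be run in each environment. This is a sketch under assumptions: `describe_runtime` is a hypothetical helper, and the distribution names `rdkit` and `rdkit-pypi` are guesses at how the conda and pip installs register themselves.

```python
import platform
import sys
from importlib import metadata


def describe_runtime(dist_names=("rdkit", "rdkit-pypi")):
    """Collect interpreter details and versions of the named distributions.

    The default distribution names are assumptions; a value of None means
    no distribution with that name is registered in this environment.
    """
    info = {
        "implementation": platform.python_implementation(),
        "python_version": platform.python_version(),
        "executable": sys.executable,
    }
    for name in dist_names:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None
    return info


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

Running it once per environment and comparing the output shows whether the interpreter build or the rdkit distribution differs between them.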