pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Dataloader Slow after Changing Package Versions #7883

Closed vthost closed 1 year ago

vthost commented 1 year ago

_Originally commented in https://github.com/pyg-team/pytorch_geometric/issues/3398#issuecomment-1679409220_

😵 Describe the installation problem

I am trying to run SSL code (basically pretrain-gnns)

For the newer PyG (i.e., for both above), I just updated: In chem/model.py:

In chem/loader.py

I guess I am missing an important update. Profiling seems to indicate that the time is spent in queues.py/resource_sharer.py/connection.py
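For anyone wanting to reproduce this kind of finding, here is a minimal sketch of how a per-file breakdown could be obtained with the standard-library `cProfile`. The helpers `profile_by_file` and `top_files` are hypothetical names, not part of PyG or of the original code, and `work` is only a stand-in for one pass over the data loader. Note that `cProfile` only observes the process it runs in, so with `num_workers > 0`, time attributed to `queues.py` or `connection.py` would typically mean the main process is waiting for, or receiving, batches from the worker processes (an interpretation, not something confirmed in this thread).

```python
import cProfile
import os
import pstats
from collections import defaultdict


def profile_by_file(fn, *args, **kwargs):
    """Run fn under cProfile and return {source file name: own time in seconds}."""
    profiler = cProfile.Profile()
    profiler.runcall(fn, *args, **kwargs)
    stats = pstats.Stats(profiler)
    per_file = defaultdict(float)
    # stats.stats maps (filename, lineno, funcname) to
    # (primitive calls, total calls, own time, cumulative time, callers).
    for (filename, _line, _func), (_cc, _nc, tottime, _cumtime, _callers) in stats.stats.items():
        per_file[os.path.basename(filename)] += tottime
    return dict(per_file)


def top_files(per_file, n=5):
    """Return the n files with the largest own time, slowest first."""
    return sorted(per_file.items(), key=lambda kv: kv[1], reverse=True)[:n]


def work(n=10000):
    # Stand-in workload; in practice this would iterate over the data loader.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    for name, seconds in top_files(profile_by_file(work)):
        print(f"{name}: {seconds:.4f}s")
```

Replacing `work` with a function that loops once over the loader should give a listing comparable to the one described above.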

Thank you in advance!

Environment

akihironitta commented 1 year ago

> Profiling seems to indicate that the time is spent in queues.py/resource_sharer.py/connection.py

Thanks for sharing the finding. Given that the performance gap is huge, I've started trying to reproduce this on my side to catch all possible causes.

Here are some follow-up questions for repro:

Also, when you get a chance, it'd be nice if you could try reducing num_workers and see whether the performance improves.
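The suggested experiment could be sketched roughly as follows. This is only an illustration under stated assumptions: `time_one_pass` and `sweep_num_workers` are hypothetical helper names, and `make_loader` is an assumed factory that builds a loader for a given worker count. The demo uses a plain `range` as a stand-in so the snippet runs without PyG installed.

```python
import time


def time_one_pass(loader):
    """Iterate once over loader and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for _batch in loader:
        pass
    return time.perf_counter() - start


def sweep_num_workers(make_loader, worker_counts=(0, 1, 2, 4, 8)):
    """Time one full pass for each worker count; returns {num_workers: seconds}."""
    results = {}
    for num_workers in worker_counts:
        loader = make_loader(num_workers)
        results[num_workers] = time_one_pass(loader)
    return results


if __name__ == "__main__":
    # Stand-in factory. With PyG it might instead look like:
    #   from torch_geometric.loader import DataLoader
    #   make_loader = lambda w: DataLoader(dataset, batch_size=32, num_workers=w)
    # where `dataset` and the batch size are placeholders.
    make_loader = lambda num_workers: range(100000)
    for num_workers, seconds in sweep_num_workers(make_loader).items():
        print(f"num_workers={num_workers}: {seconds:.3f}s")
```

If the time per pass drops sharply as `num_workers` decreases, that would point towards inter-process communication overhead rather than the dataset code itself.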

vthost commented 1 year ago

Thank you for directly getting back to me!

However, I wanted to create a minimal environment for you to reproduce it. I had used such a minimal environment for the PyG 2.2.0 configuration before, but a larger one for PyG 2.3.0. When I now tested PyG 2.3.0 in the minimal environment, it actually ran just as fast. So another package seems to be interfering. I am posting my full environment at the very end below, after the output of the PyTorch script, in case you have any idea where the slowdown could come from. I'll also check whether I can find out more.

Collecting environment information...
PyTorch version: 1.13.1
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Red Hat Enterprise Linux release 8.8 (Ootpa) (x86_64)
GCC version: (GCC) 8.5.0 20210514 (Red Hat 8.5.0-18)
Clang version: Could not collect
CMake version: version 3.20.2
Libc version: glibc-2.28

Python version: 3.8.16 (default, Mar  2 2023, 03:21:46)  [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-4.18.0-477.15.1.el8_8.x86_64-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA A100-SXM4-40GB
Nvidia driver version: 535.54.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              128
On-line CPU(s) list: 0-127
Thread(s) per core:  1
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        2
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7742 64-Core Processor
Stepping:            0
CPU MHz:             3292.669
CPU max MHz:         2250.0000
CPU min MHz:         1500.0000
BogoMIPS:            4500.36
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-63
NUMA node1 CPU(s):   64-127
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd amd_ppin arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sme sev sev_es

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.3
[pip3] torch==1.13.1
[pip3] torch-geometric==2.3.0
[pip3] torch-scatter==2.1.1+pt113cu117
[pip3] torch-sparse==0.6.17+pt113cu117
[pip3] torch-spline-conv==1.2.2+pt113cu117
[conda] blas                      1.0                         mkl  
[conda] mkl                       2023.1.0         h6d00ec8_46342  
[conda] mkl-service               2.4.0            py38h5eee18b_1  
[conda] mkl_fft                   1.3.6            py38h417a72b_1  
[conda] mkl_random                1.2.2            py38h417a72b_1  
[conda] numpy                     1.24.3           py38hf6e8229_1  
[conda] numpy-base                1.24.3           py38h060ed82_1  
[conda] pyg                       2.3.0           py38_torch_1.13.0_cu117    pyg
[conda] pytorch                   1.13.1          py3.8_cuda11.7_cudnn8.5.0_0    pytorch
[conda] pytorch-cuda              11.7                 h778d358_5    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torch-scatter             2.1.1+pt113cu117          pypi_0    pypi
[conda] torch-sparse              0.6.17+pt113cu117          pypi_0    pypi
[conda] torch-spline-conv         1.2.2+pt113cu117          pypi_0    pypi
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
_py-xgboost-mutex         2.0                       cpu_0    conda-forge
absl-py                   1.4.0                    pypi_0    pypi
appdirs                   1.4.4              pyhd3eb1b0_0  
array-record              0.2.0                    pypi_0    pypi
astor                     0.8.1                    pypi_0    pypi
astunparse                1.6.3                    pypi_0    pypi
autograd                  1.5                      pypi_0    pypi
autograd-gamma            0.5.0                    pypi_0    pypi
blas                      1.0                         mkl  
boost                     1.78.0           py38h4e30db6_4    conda-forge
boost-cpp                 1.78.0               h6582d0a_3    conda-forge
bottleneck                1.3.5            py38h7deecbd_0  
brotli                    1.0.9                h166bdaf_8    conda-forge
brotli-bin                1.0.9                h166bdaf_8    conda-forge
brotlipy                  0.7.0           py38h27cfd23_1003  
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2023.7.22            hbcca054_0    conda-forge
cachetools                5.3.0                    pypi_0    pypi
cairo                     1.16.0            hbbf8b49_1016    conda-forge
cairocffi                 1.5.1                    pypi_0    pypi
cairosvg                  2.5.2                    pypi_0    pypi
certifi                   2023.7.22          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1           py38h5eee18b_3  
charset-normalizer        2.0.4              pyhd3eb1b0_0  
click                     8.1.3                    pypi_0    pypi
cloudpickle               2.2.1                    pypi_0    pypi
colorama                  0.4.6                    pypi_0    pypi
contextlib2               21.6.0                   pypi_0    pypi
contourpy                 1.0.7            py38hfbd4bf9_0    conda-forge
cryptography              39.0.1           py38h9ce1e76_0  
cssselect2                0.7.0                    pypi_0    pypi
cuda-cudart               11.7.99                       0    nvidia
cuda-cupti                11.7.101                      0    nvidia
cuda-libraries            11.7.1                        0    nvidia
cuda-nvrtc                11.7.99                       0    nvidia
cuda-nvtx                 11.7.91                       0    nvidia
cuda-runtime              11.7.1                        0    nvidia
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
cython                    0.29.35                  pypi_0    pypi
dask                      2022.6.1                 pypi_0    pypi
dataclasses               0.6                      pypi_0    pypi
decorator                 5.1.1                    pypi_0    pypi
defusedxml                0.7.1                    pypi_0    pypi
deprecated                1.2.13                   pypi_0    pypi
descriptastorus           2.6.0                    pypi_0    pypi
dgl                       1.1.1                    py38_0    dglteam
dm-sonnet                 2.0.1                    pypi_0    pypi
dm-tree                   0.1.8                    pypi_0    pypi
docstring-parser          0.15                     pypi_0    pypi
etils                     1.3.0                    pypi_0    pypi
expat                     2.5.0                hcb278e6_1    conda-forge
filelock                  3.12.2                   pypi_0    pypi
flatbuffers               1.12                     pypi_0    pypi
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.39.4           py38h01eb140_0    conda-forge
formulaic                 0.6.1                    pypi_0    pypi
freetype                  2.12.1               hca18f0e_1    conda-forge
fsspec                    2023.5.0                 pypi_0    pypi
future                    0.18.3                   pypi_0    pypi
fuzzywuzzy                0.18.0                   pypi_0    pypi
gast                      0.4.0                    pypi_0    pypi
gettext                   0.21.1               h27087fc_0    conda-forge
gin-config                0.5.0                    pypi_0    pypi
google-api-core           2.11.0                   pypi_0    pypi
google-api-python-client  2.87.0                   pypi_0    pypi
google-auth               2.18.1                   pypi_0    pypi
google-auth-httplib2      0.1.0                    pypi_0    pypi
google-auth-oauthlib      0.4.6                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
googleapis-common-protos  1.59.0                   pypi_0    pypi
gpflow                    2.5.2                    pypi_0    pypi
graph-nets                1.1.0                    pypi_0    pypi
graphlib-backport         1.0.3                    pypi_0    pypi
greenlet                  2.0.2            py38h17151c0_1    conda-forge
grpcio                    1.54.2                   pypi_0    pypi
h5py                      3.8.0                    pypi_0    pypi
hdbscan                   0.8.27                   pypi_0    pypi
httplib2                  0.22.0                   pypi_0    pypi
huggingface-hub           0.16.4                   pypi_0    pypi
icu                       72.1                 hcb278e6_0    conda-forge
idna                      3.4              py38h06a4308_0  
importlib-metadata        6.6.0                    pypi_0    pypi
importlib-resources       5.12.0             pyhd8ed1ab_0    conda-forge
importlib_resources       5.12.0             pyhd8ed1ab_0    conda-forge
intel-openmp              2023.1.0         hdb19cb5_46305  
interface-meta            1.3.0                    pypi_0    pypi
jinja2                    3.1.2            py38h06a4308_0  
joblib                    1.1.1            py38h06a4308_0  
kaggle                    1.5.13                   pypi_0    pypi
keras                     2.9.0                    pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
kiwisolver                1.4.4            py38h43d8883_1    conda-forge
lark                      1.1.5                    pypi_0    pypi
lazy-loader               0.2                      pypi_0    pypi
lcms2                     2.15                 haa2dc70_1    conda-forge
ld_impl_linux-64          2.38                 h1181459_1  
lerc                      4.0.0                h27087fc_0    conda-forge
libbrotlicommon           1.0.9                h166bdaf_8    conda-forge
libbrotlidec              1.0.9                h166bdaf_8    conda-forge
libbrotlienc              1.0.9                h166bdaf_8    conda-forge
libclang                  16.0.0                   pypi_0    pypi
libcublas                 11.10.3.66                    0    nvidia
libcufft                  10.7.2.124           h4fbf590_0    nvidia
libcufile                 1.6.1.9                       0    nvidia
libcurand                 10.3.2.106                    0    nvidia
libcusolver               11.4.0.1                      0    nvidia
libcusparse               11.7.4.91                     0    nvidia
libdeflate                1.18                 h0b41bf4_0    conda-forge
libexpat                  2.5.0                hcb278e6_1    conda-forge
libffi                    3.4.4                h6a678d5_0  
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            11.2.0               h00389a5_1  
libgfortran5              11.2.0               h1234567_1  
libglib                   2.76.3               hebfc3b9_0    conda-forge
libiconv                  1.17                 h166bdaf_0    conda-forge
libjpeg-turbo             2.1.5.1              h0b41bf4_0    conda-forge
libnpp                    11.7.4.75                     0    nvidia
libnvjpeg                 11.8.0.2                      0    nvidia
libpng                    1.6.39               h753d276_0    conda-forge
libprotobuf               3.20.3               he621ea3_0  
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libtiff                   4.5.0                ha587672_6    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libwebp-base              1.3.0                h0b41bf4_0    conda-forge
libxcb                    1.15                 h0b41bf4_0    conda-forge
libxgboost                1.7.4            cpu_h6e95104_0    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lifelines                 0.27.7                   pypi_0    pypi
littleutils               0.2.2                    pypi_0    pypi
llvm-openmp               16.0.4               h4dfa4b3_0    conda-forge
llvmlite                  0.40.0                   pypi_0    pypi
locket                    1.0.0                    pypi_0    pypi
lxml                      4.9.2                    pypi_0    pypi
markdown                  3.4.3                    pypi_0    pypi
markupsafe                2.1.1            py38h7f8727e_0  
matplotlib-base           3.7.1            py38hd6c3c57_0    conda-forge
mkl                       2023.1.0         h6d00ec8_46342  
mkl-service               2.4.0            py38h5eee18b_1  
mkl_fft                   1.3.6            py38h417a72b_1  
mkl_random                1.2.2            py38h417a72b_1  
ml-collections            0.1.1                    pypi_0    pypi
mordredcommunity          2.0.2              pyhd8ed1ab_0    conda-forge
multipledispatch          0.6.0                    pypi_0    pypi
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
mypy-extensions           1.0.0                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
networkx                  1.8.1                    pypi_0    pypi
ngboost                   0.3.12                   pypi_0    pypi
numba                     0.57.0                   pypi_0    pypi
numexpr                   2.8.4            py38hc78ab66_1  
numpy                     1.24.3           py38hf6e8229_1  
numpy-base                1.24.3           py38h060ed82_1  
oauth2client              4.1.3                    pypi_0    pypi
oauthlib                  3.2.2                    pypi_0    pypi
ogb                       1.3.6                    pypi_0    pypi
opencv-python-headless    4.7.0.72                 pypi_0    pypi
openjpeg                  2.5.0                hfec8fc6_2    conda-forge
openssl                   1.1.1v               hd590300_0    conda-forge
opt-einsum                3.3.0                    pypi_0    pypi
outdated                  0.2.2                    pypi_0    pypi
packaging                 23.0             py38h06a4308_0  
pandas                    1.4.2            py38h295c915_0  
pandas-flavor             0.5.0                    pypi_0    pypi
partd                     1.4.0                    pypi_0    pypi
pcre2                     10.40                hc3806b6_0    conda-forge
pillow                    9.5.0            py38h885162f_1    conda-forge
pip                       23.0.1           py38h06a4308_0  
pixman                    0.40.0               h36c2ea0_0    conda-forge
pooch                     1.4.0              pyhd3eb1b0_0  
portalocker               2.7.0                    pypi_0    pypi
promise                   2.3                      pypi_0    pypi
protobuf                  3.19.6                   pypi_0    pypi
psutil                    5.9.0            py38h5eee18b_0  
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
py-cpuinfo                9.0.0                    pypi_0    pypi
py-xgboost                1.7.4           cpu_py38h66f0ec1_0    conda-forge
pyasn1                    0.5.0                    pypi_0    pypi
pyasn1-modules            0.3.0                    pypi_0    pypi
pycairo                   1.23.0           py38h190342e_0    conda-forge
pycocotools               2.0.6                    pypi_0    pypi
pycparser                 2.21               pyhd3eb1b0_0  
pyg                       2.3.0           py38_torch_1.13.0_cu117    pyg
pygcl                     0.1.2                    pypi_0    pypi
pynndescent               0.5.10                   pypi_0    pypi
pyopenssl                 23.0.0           py38h06a4308_0  
pyparsing                 3.0.9            py38h06a4308_0  
pysocks                   1.7.1            py38h06a4308_0  
pytdc                     0.4.1                    pypi_0    pypi
python                    3.8.16               h7a1cb2a_3  
python-dateutil           2.8.2              pyhd3eb1b0_0  
python-slugify            8.0.1                    pypi_0    pypi
python_abi                3.8                      2_cp38    conda-forge
pytorch                   1.13.1          py3.8_cuda11.7_cudnn8.5.0_0    pytorch
pytorch-cuda              11.7                 h778d358_5    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pytz                      2022.7           py38h06a4308_0  
pyyaml                    6.0                      pypi_0    pypi
rdkit                     2023.3.1                 pypi_0    pypi
rdkit-pypi                2022.9.5                 pypi_0    pypi
readline                  8.2                  h5eee18b_0  
regex                     2023.5.5                 pypi_0    pypi
reportlab                 3.6.13           py38h57c54bf_0    conda-forge
requests                  2.29.0           py38h06a4308_0  
requests-oauthlib         1.3.1                    pypi_0    pypi
rsa                       4.9                      pypi_0    pypi
sacrebleu                 2.3.1                    pypi_0    pypi
scikit-learn              1.2.2            py38h6a678d5_1  
scikit-multilearn         0.2.0                    pypi_0    pypi
scipy                     1.10.1           py38hf6e8229_1  
seaborn                   0.11.2                   pypi_0    pypi
sentencepiece             0.1.99                   pypi_0    pypi
seqeval                   1.2.2                    pypi_0    pypi
setuptools                66.0.0           py38h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
sonnet                    0.1.6                    pypi_0    pypi
sqlalchemy                1.4.46           py38h1de0b5d_0    conda-forge
sqlite                    3.41.2               h5eee18b_0  
tabulate                  0.9.0                    pypi_0    pypi
tbb                       2021.8.0             hdb19cb5_0  
tensorboard               2.9.1                    pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.1                    pypi_0    pypi
tensorboardx              2.2                pyhd3eb1b0_0  
tensorflow                2.9.0                    pypi_0    pypi
tensorflow-addons         0.20.0                   pypi_0    pypi
tensorflow-datasets       4.9.0                    pypi_0    pypi
tensorflow-estimator      2.9.0                    pypi_0    pypi
tensorflow-hub            0.13.0                   pypi_0    pypi
tensorflow-io-gcs-filesystem 0.32.0                   pypi_0    pypi
tensorflow-metadata       1.13.0                   pypi_0    pypi
tensorflow-model-optimization 0.7.4                    pypi_0    pypi
tensorflow-probability    0.17.0                   pypi_0    pypi
tensorflow-text           2.9.0                    pypi_0    pypi
termcolor                 2.3.0                    pypi_0    pypi
text-unidecode            1.3                      pypi_0    pypi
tf-models-official        2.7.1                    pypi_0    pypi
tf-slim                   1.1.0                    pypi_0    pypi
threadpoolctl             2.2.0              pyh0d69192_0  
tinycss2                  1.2.1                    pypi_0    pypi
tk                        8.6.12               h1ccaba5_0  
toml                      0.10.2                   pypi_0    pypi
toolz                     0.12.0                   pypi_0    pypi
torch-scatter             2.1.1+pt113cu117          pypi_0    pypi
torch-sparse              0.6.17+pt113cu117          pypi_0    pypi
torch-spline-conv         1.2.2+pt113cu117          pypi_0    pypi
tqdm                      4.65.0           py38hb070fc8_0  
typed-argument-parser     1.8.0                    pypi_0    pypi
typeguard                 2.13.3                   pypi_0    pypi
typing-inspect            0.9.0                    pypi_0    pypi
typing_extensions         4.5.0            py38h06a4308_0  
umap-learn                0.5.1                    pypi_0    pypi
unicodedata2              15.0.0           py38h0a891b7_0    conda-forge
uritemplate               4.1.1                    pypi_0    pypi
urllib3                   1.26.15          py38h06a4308_0  
webencodings              0.5.1                    pypi_0    pypi
werkzeug                  2.3.4                    pypi_0    pypi
wheel                     0.38.4           py38h06a4308_0  
wrapt                     1.15.0                   pypi_0    pypi
xarray                    2023.1.0                 pypi_0    pypi
xgboost                   1.7.4           cpu_py38h66f0ec1_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.8.4                h8ee46fc_1    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.4.2                h5eee18b_0  
zipp                      3.15.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h3eb15da_6    conda-forge

akihironitta commented 1 year ago

I ran the same script, and I see that 2.3.0 takes about 110% of the time of 2.2.0 with the versions of all other libraries fixed, not the 500% you originally posted in the description. I will still investigate the performance difference via #7795 to catch both past and future regressions, but I'm closing this issue since you mentioned that 2.3.0 runs as fast as 2.2.0.

vthost commented 1 year ago

Sorry for bothering you, but the discussion helped! It seems to be a problem with rdkit: I used the conda version instead of pip's rdkit-pypi. Installing the former replaces python with cpython, which may be the main problem, but I'll stop investigating here.
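For others chasing a similar "interfering package" problem, one approach is to diff the `conda list` output of a fast and a slow environment. Below is a minimal sketch; `parse_conda_list` and `diff_environments` are hypothetical helpers, and the two sample listings are illustrative only (the lines are taken from the listing above, but their split into a "fast" and a "slow" environment is made up for the example).

```python
def parse_conda_list(text):
    """Parse `conda list` output into {name: (version, build, channel)}."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        parts = line.split()
        if len(parts) < 2:
            continue  # not a package row
        build = parts[2] if len(parts) > 2 else ""
        channel = parts[3] if len(parts) > 3 else ""
        packages[parts[0]] = (parts[1], build, channel)
    return packages


def diff_environments(fast, slow):
    """Return (only in fast, only in slow, {name: (fast entry, slow entry)})."""
    only_fast = sorted(set(fast) - set(slow))
    only_slow = sorted(set(slow) - set(fast))
    changed = {
        name: (fast[name], slow[name])
        for name in sorted(set(fast) & set(slow))
        if fast[name] != slow[name]
    }
    return only_fast, only_slow, changed


# Illustrative input only; not the actual environments from this thread.
FAST_ENV = """
# Name                    Version                   Build  Channel
numpy                     1.24.3           py38hf6e8229_1
rdkit-pypi                2022.9.5                 pypi_0    pypi
"""
SLOW_ENV = """
# Name                    Version                   Build  Channel
numpy                     1.24.3           py38hf6e8229_1
rdkit                     2023.3.1                 pypi_0    pypi
"""

if __name__ == "__main__":
    only_fast, only_slow, changed = diff_environments(
        parse_conda_list(FAST_ENV), parse_conda_list(SLOW_ENV)
    )
    print("only in fast env:", only_fast)
    print("only in slow env:", only_slow)
    print("different version/build/channel:", changed)
```

Comparing build strings and channels as well as versions matters here, since the same package name can come from conda or from PyPI.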