snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Crash when loading ogbn_proteins #483

Open JonasDeSchouwer opened 3 months ago

JonasDeSchouwer commented 3 months ago

I am trying to run the following line:

ogb_dataset = NodePropPredDataset(name="ogbn-proteins", root="datasets/data/ogb")

This starts off doing what it is supposed to.

However, as soon as it reaches the line

torch.save({'graph': self.graph, 'labels': self.labels}, pre_processed_file_path, pickle_protocol=4)

in ogb/nodeproppred/dataset.py (line 135 in the version I am running), the program crashes without any error message, leaving only an empty file at datasets/data/ogb/ogbn_proteins/processed/data_processed.
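If the loader treats an existing processed file as a valid cache (an assumption, not checked against the ogb source here), the empty file left behind by the crash could interfere with later runs. A minimal sketch for clearing such a leftover before retrying; `remove_if_empty` is a hypothetical helper, not part of ogb, and the path is the one quoted above:

```python
import os


def remove_if_empty(path):
    """Delete `path` if it is a regular file of zero bytes.

    Returns True if a file was removed, False otherwise.
    """
    if os.path.isfile(path) and os.path.getsize(path) == 0:
        os.remove(path)
        return True
    return False


# Path taken from the report; adjust to your own `root`.
processed_file = "datasets/data/ogb/ogbn_proteins/processed/data_processed"
if remove_if_empty(processed_file):
    print(f"Removed empty leftover file: {processed_file}")
```

A non-empty file is left untouched, so this is safe to run even after a successful preprocessing pass.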

I have been able to reproduce this by just loading self.graph and self.labels in a notebook by executing the following code:

graph = read_csv_graph_raw(raw_dir, add_inverse_edge=True, additional_node_files=['node_species'], additional_edge_files=[])[0]
labels = pd.read_csv(osp.join(raw_dir, 'node-label.csv.gz'), compression='gzip', header=None).values

I can then save labels and graph["node_species"] to a file without problems, but as soon as I try to save anything containing graph["edge_index"] or graph["edge_feat"], the kernel crashes. Note that these arrays are large: graph["edge_index"] has shape (2, 79122504) and graph["edge_feat"] has shape (79122504, 8). All arrays look normal to me, so my guess is that torch.save cannot handle files this large (although the arrays are smaller than the maximum size reported in this issue). Either way, I thought it would be useful to let you know, and perhaps we can find a workaround.
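For a sense of scale, the in-memory size of the two arrays can be estimated from the shapes quoted above. The dtypes used here (int64 for edge_index, float32 for edge_feat) are assumptions, not values from the report; checking graph["edge_index"].dtype and graph["edge_feat"].dtype would give the real ones. `array_nbytes` is a hypothetical helper:

```python
import math

# Bytes per element for a few common dtypes.
ITEMSIZE = {"int64": 8, "float32": 4, "float64": 8}


def array_nbytes(shape, dtype):
    """Size in bytes of a dense array with the given shape and dtype name."""
    return math.prod(shape) * ITEMSIZE[dtype]


# Shapes are from the report; dtypes are ASSUMED, not confirmed.
assumed = {
    "edge_index": ((2, 79122504), "int64"),
    "edge_feat": ((79122504, 8), "float32"),
}

sizes = {name: array_nbytes(shape, dtype) for name, (shape, dtype) in assumed.items()}
for name, size in sizes.items():
    print(f"{name}: {size / 1e9:.2f} GB")
```

Under these assumptions the arrays take about 1.27 GB and 2.53 GB in memory, and serializing them may temporarily need more. A process killed by the operating system for running out of memory also exits without a Python traceback, so comparing these numbers against the available RAM may be worth doing before blaming torch.save itself.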

--- DETAILS ABOUT MY ENVIRONMENT ---

Output from conda:

_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
absl-py                   2.1.0                    pypi_0    pypi
aiohttp                   3.9.5                    pypi_0    pypi
aiosignal                 1.3.1                    pypi_0    pypi
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
attrs                     23.2.0                   pypi_0    pypi
bzip2                     1.0.8                h5eee18b_6  
ca-certificates           2024.3.11            h06a4308_0  
certifi                   2024.2.2                 pypi_0    pypi
charset-normalizer        3.3.2                    pypi_0    pypi
comm                      0.2.2              pyhd8ed1ab_0    conda-forge
debugpy                   1.6.7           py312h6a678d5_0  
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.0              pyhd8ed1ab_2    conda-forge
executing                 2.0.1              pyhd8ed1ab_0    conda-forge
expat                     2.6.2                h6a678d5_0  
filelock                  3.13.1                   pypi_0    pypi
frozenlist                1.4.1                    pypi_0    pypi
fsspec                    2024.2.0                 pypi_0    pypi
googledrivedownloader     0.4                      pypi_0    pypi
grpcio                    1.64.0                   pypi_0    pypi
idna                      3.7                      pypi_0    pypi
importlib-metadata        7.1.0              pyha770c72_0    conda-forge
importlib_metadata        7.1.0                hd8ed1ab_0    conda-forge
ipykernel                 6.29.3             pyhd33586a_0    conda-forge
ipython                   8.24.0             pyh707e725_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.3                    pypi_0    pypi
joblib                    1.4.2                    pypi_0    pypi
jupyter_client            8.6.2              pyhd8ed1ab_0    conda-forge
jupyter_core              5.5.0           py312h06a4308_0  
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.4.4                h6a678d5_1  
libgcc-ng                 13.2.0               h77fa898_7    conda-forge
libgomp                   13.2.0               h77fa898_7    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libstdcxx-ng              11.2.0               h1234567_1  
libuuid                   1.41.5               h5eee18b_0  
lightning-utilities       0.11.2                   pypi_0    pypi
littleutils               0.2.2                    pypi_0    pypi
markdown                  3.6                      pypi_0    pypi
markupsafe                2.1.5                    pypi_0    pypi
matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
mpmath                    1.3.0                    pypi_0    pypi
multidict                 6.0.5                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0  
nest-asyncio              1.6.0              pyhd8ed1ab_0    conda-forge
networkx                  3.2.1                    pypi_0    pypi
numpy                     1.26.3                   pypi_0    pypi
nvidia-cublas-cu12        12.1.3.1                 pypi_0    pypi
nvidia-cuda-cupti-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-nvrtc-cu12    12.1.105                 pypi_0    pypi
nvidia-cuda-runtime-cu12  12.1.105                 pypi_0    pypi
nvidia-cudnn-cu12         8.9.2.26                 pypi_0    pypi
nvidia-cufft-cu12         11.0.2.54                pypi_0    pypi
nvidia-curand-cu12        10.3.2.106               pypi_0    pypi
nvidia-cusolver-cu12      11.4.5.107               pypi_0    pypi
nvidia-cusparse-cu12      12.1.0.106               pypi_0    pypi
nvidia-nccl-cu12          2.19.3                   pypi_0    pypi
nvidia-nvjitlink-cu12     12.1.105                 pypi_0    pypi
nvidia-nvtx-cu12          12.1.105                 pypi_0    pypi
ogb                       1.3.6                    pypi_0    pypi
openssl                   3.3.0                h4ab18f5_3    conda-forge
outdated                  0.2.2                    pypi_0    pypi
packaging                 24.0               pyhd8ed1ab_0    conda-forge
pandas                    2.2.2                    pypi_0    pypi
parso                     0.8.4              pyhd8ed1ab_0    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    10.2.0                   pypi_0    pypi
pip                       24.0            py312h06a4308_0  
platformdirs              4.2.2              pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.42             pyha770c72_0    conda-forge
protobuf                  5.26.1                   pypi_0    pypi
psutil                    5.9.8                    pypi_0    pypi
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
pyg-lib                   0.4.0+pt22cu121          pypi_0    pypi
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.1.2                    pypi_0    pypi
python                    3.12.3               h996f2a0_1  
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
pytorch-lightning         2.2.0                    pypi_0    pypi
pytz                      2024.1                   pypi_0    pypi
pyyaml                    6.0.1                    pypi_0    pypi
pyzmq                     25.1.2          py312h6a678d5_0  
readline                  8.2                  h5eee18b_0  
requests                  2.32.2                   pypi_0    pypi
scikit-learn              1.5.0                    pypi_0    pypi
scipy                     1.13.1                   pypi_0    pypi
setuptools                69.5.1          py312h06a4308_0  
six                       1.16.0             pyh6c4a22f_0    conda-forge
sqlite                    3.45.3               h5eee18b_0  
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
sympy                     1.12                     pypi_0    pypi
tensorboard               2.16.2                   pypi_0    pypi
tensorboard-data-server   0.7.2                    pypi_0    pypi
tensorboard-reducer       0.3.1                    pypi_0    pypi
threadpoolctl             3.5.0                    pypi_0    pypi
tk                        8.6.14               h39e8969_0  
torch                     2.2.2+cu121              pypi_0    pypi
torch-cluster             1.6.3+pt22cu121          pypi_0    pypi
torch-geometric           2.5.3                    pypi_0    pypi
torch-scatter             2.1.2+pt22cu121          pypi_0    pypi
torch-sparse              0.6.18+pt22cu121          pypi_0    pypi
torch-spline-conv         1.2.2+pt22cu121          pypi_0    pypi
torch-tb-profiler         0.4.3                    pypi_0    pypi
torchaudio                2.2.2+cu121              pypi_0    pypi
torchmetrics              1.4.0.post0              pypi_0    pypi
torchvision               0.17.2+cu121             pypi_0    pypi
tornado                   6.3.3           py312h5eee18b_0  
tqdm                      4.66.4                   pypi_0    pypi
traitlets                 5.14.3             pyhd8ed1ab_0    conda-forge
typing-extensions         4.9.0                    pypi_0    pypi
typing_extensions         4.11.0             pyha770c72_0    conda-forge
tzdata                    2024.1                   pypi_0    pypi
urllib3                   2.2.1                    pypi_0    pypi
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
werkzeug                  3.0.3                    pypi_0    pypi
wheel                     0.43.0          py312h06a4308_0  
xz                        5.4.6                h5eee18b_1  
yarl                      1.9.4                    pypi_0    pypi
zeromq                    4.3.5                h6a678d5_0  
zipp                      3.17.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               h5eee18b_1

JonasDeSchouwer commented 3 months ago

To reproduce this issue:

In the terminal:

conda create -n test_save_env
conda activate test_save_env
conda install python=3.12
pip install ogb==1.3.6

Note that ogb has torch as a dependency, so in my case this installs torch 2.3.1, but I observed the same behaviour with torch 2.2.2+cu121.

Then run the following Python code:

from ogb.io.read_graph_raw import read_csv_graph_raw
import pandas as pd
import os.path as osp
import torch

raw_dir = "datasets/data/ogb/ogbn_proteins/raw"

graph = read_csv_graph_raw(raw_dir, add_inverse_edge=True, additional_node_files=['node_species'], additional_edge_files=[])[0]
labels = pd.read_csv(osp.join(raw_dir, 'node-label.csv.gz'), compression='gzip', header=None).values

In my case, this gives the following error (in a notebook):

The Kernel crashed while executing code in the current cell or a previous cell. 
Please review the code in the cell(s) to identify a possible cause of the failure. 
Click [here](https://aka.ms/vscodeJupyterKernelCrash) for more info. 
View Jupyter [log](command:jupyter.viewOutput) for further details.
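Since the kernel dies without a traceback, one way to narrow down which array triggers the crash is to write each array to its own file and print its name first. The sketch below uses numpy's np.save instead of torch.save, so it is a diagnostic and possible workaround rather than a fix for the loader; `save_arrays_separately` is a hypothetical helper, and it assumes the values in the graph dict are numpy arrays, None, or scalars:

```python
import os

import numpy as np


def save_arrays_separately(graph, out_dir):
    """Write every ndarray in `graph` to its own .npy file, one at a time.

    Returns the list of file paths written, in dict order.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for key, value in graph.items():
        if not isinstance(value, np.ndarray):
            continue  # skip None and scalar entries
        path = os.path.join(out_dir, f"{key}.npy")
        # Flush so the name is visible even if the process dies during the save.
        print(f"saving {key}: shape={value.shape}, dtype={value.dtype}", flush=True)
        np.save(path, value)
        saved.append(path)
    return saved
```

Calling save_arrays_separately(graph, some_output_dir) on the graph loaded above would show the last key printed before a crash, and the resulting files can be read back with np.load.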