snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

ogbn-papers100M Memory error #229

Closed adhithadias closed 3 years ago

adhithadias commented 3 years ago

I am getting the error below when I try to extract the ogbn-papers100M dataset. Here is the code I am executing:

# Load Node Property Prediction datasets in OGB
import numpy as np
from ogb.nodeproppred import NodePropPredDataset

def print_dataset_details(dataset):
    print(dataset.name)
    print('-------------------')
    print(dataset)
    print(dataset.graph.keys())
    for key in dataset.graph.keys():
        print(key, end=' ')
        if dataset.graph[key] is not None:
            # ndarray values: print the shape; scalar values (e.g. num_nodes): print the value
            if isinstance(dataset.graph[key], (np.ndarray, np.generic)):
                print(dataset.graph[key].shape)
            else:
                print(dataset.graph[key])

dataset_save_location = '/local/scratch/a/user/dataset/'
datasets = ['ogbn-arxiv', 'ogbn-proteins', 'ogbn-mag', 'ogbn-products', 'ogbn-papers100M']

for dataset_name in datasets:
    print("dataset", dataset_name)
    dataset = NodePropPredDataset(name=dataset_name, root=dataset_save_location)
    print_dataset_details(dataset)

Below is the error I am getting:

dataset ogbn-papers100M
This will download 56.17GB. Will you proceed? (y/N)
y
Using exist file papers100M-bin.zip
Extracting /local/scratch/a/kadhitha/dataset/papers100M-bin.zip
Loading necessary files...
This might take a while.
Processing graphs...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 16131.94it/s]
Saving...
Traceback (most recent call last):
  File "node_prop_pred_data.py", line 24, in <module>
    dataset = NodePropPredDataset(name=dataset_name, root = dataset_save_location)
  File "/home/min/a/user/data/.venv/lib/python3.6/site-packages/ogb/nodeproppred/dataset.py", line 63, in __init__
    self.pre_process()
  File "/home/min/a/user/data/.venv/lib/python3.6/site-packages/ogb/nodeproppred/dataset.py", line 139, in pre_process
    torch.save({'graph': self.graph, 'labels': self.labels}, pre_processed_file_path, pickle_protocol=4)
  File "/home/min/a/user/data/.venv/lib/python3.6/site-packages/torch/serialization.py", line 372, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/home/min/a/user/data/.venv/lib/python3.6/site-packages/torch/serialization.py", line 476, in _save
    pickler.dump(obj)
MemoryError

Can this be solved by setting any configuration in the code?
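For context on where it fails: the traceback shows the crash happens inside `torch.save`, which pickles the entire `{'graph': ..., 'labels': ...}` dict in one pass, so the peak memory cost is much larger than the arrays themselves. As a purely hypothetical workaround sketch (not an option the dataset class exposes), one could serialize the numpy arrays individually with `numpy.savez`, which writes one `.npy` entry per array instead of building a single giant pickle; the `graph` dict here is a small stand-in, not the real dataset:

```python
import numpy as np

# Stand-in for dataset.graph: a dict of numpy arrays (real arrays are huge).
graph = {
    'edge_index': np.arange(20, dtype=np.int64).reshape(2, 10),
    'node_feat': np.ones((5, 4), dtype=np.float32),
}

# np.savez writes one .npy entry per array into a single .npz archive,
# avoiding a single monolithic pickle of the whole dict.
np.savez('/tmp/graph_arrays.npz', **graph)

# Loading is per-array: each array is read from the archive on access.
loaded = np.load('/tmp/graph_arrays.npz')
assert np.array_equal(loaded['edge_index'], graph['edge_index'])
assert loaded['node_feat'].shape == (5, 4)
```

This is only a sketch of the idea; applying it would mean patching `pre_process` in `ogb/nodeproppred/dataset.py`, which the library does not support out of the box.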

weihua916 commented 3 years ago

This is due to the limited CPU memory. How much CPU memory do you have?

adhithadias commented 3 years ago

Here are the stats of the machine I am running this on. Below are the CPU stats.

$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              128
On-line CPU(s) list: 0-127
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD Ryzen Threadripper 3990X 64-Core Processor
Stepping:            0
CPU MHz:             2198.917
CPU max MHz:         2900.0000
CPU min MHz:         2200.0000
BogoMIPS:            5800.34
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-127
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca

Below are the RAM stats:

$ free -g
              total        used        free      shared  buff/cache   available
Mem:            251           2         239           0           9         247
Swap:            39           0          39

The machine has 256GB of RAM, with roughly 240GB free.
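The same numbers can also be read from inside Python, which is handy for logging memory right before the load. A minimal Linux-only sketch using just the standard library (no third-party packages assumed):

```python
import os

# Total physical memory in GiB (Linux/glibc sysconf names).
page_size = os.sysconf('SC_PAGE_SIZE')    # bytes per page
num_pages = os.sysconf('SC_PHYS_PAGES')   # total physical pages
total_gib = page_size * num_pages / 2**30
print(f"Total RAM: {total_gib:.0f} GiB")

# Currently available pages give a rough free-memory figure,
# comparable to the 'available' column of `free -g`.
avail_gib = page_size * os.sysconf('SC_AVPHYS_PAGES') / 2**30
print(f"Available: {avail_gib:.0f} GiB")
```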

Below are the stats for the storage disk I am using as the root directory to save the files:

Filesystem                        1G-blocks   Used Available Use% Mounted on
/dev/mapper/dvg-lsa                   1863G   128G     1735G   7% /local/scratch/a

weihua916 commented 3 years ago

Thanks for the info! Hmm, 240GB should be enough.

weihua916 commented 3 years ago

Follow-up on this: I have tried the following, and it worked fine on my end.

from ogb.nodeproppred import NodePropPredDataset
dataset = NodePropPredDataset(name='ogbn-papers100M')

My Python version is 3.7.4, and my torch version is 1.7.1.
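One detail worth comparing: the failing traceback shows a `python3.6` site-packages path, while the working setup above is Python 3.7.4 with torch 1.7.1. A quick sketch for printing your environment so the two runs can be compared directly (the imports are guarded since torch/ogb may not be installed everywhere):

```python
import sys

# Interpreter version: the failing run used Python 3.6, the working one 3.7.4.
print("Python:", sys.version.split()[0])

# Report package versions without crashing if a package is absent.
for mod_name in ("torch", "ogb"):
    try:
        mod = __import__(mod_name)
        print(mod_name + ":", mod.__version__)
    except ImportError:
        print(mod_name + ": not installed")
```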