pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

Cannot load the QM9 dataset #9377

Open zhukuixi opened 1 month ago

zhukuixi commented 1 month ago

🐛 Describe the bug

import torch_geometric.datasets as pyg

dataset = pyg.QM9('./qm9')

Error message:

Downloading https://deepchemdata.s3-us-west-1.amazonaws.com/datasets/molnet_publish/qm9.zip
Extracting qm9/raw/qm9.zip
Downloading https://ndownloader.figshare.com/files/3195404
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-24-8650a6b3c18b> in <cell line: 4>()
      2 
      3 # Load the QM9 dataset
----> 4 dataset = pyg.QM9('./qm9')

11 frames
/usr/local/lib/python3.10/dist-packages/torch_geometric/datasets/qm9.py in __init__(self, root, transform, pre_transform, pre_filter)
    117             value, indicating whether the data object should be included in the
    118             final dataset. (default: :obj:`None`)
--> 119         force_reload (bool, optional): Whether to re-process the dataset.
    120             (default: :obj:`False`)
    121 

/usr/local/lib/python3.10/dist-packages/torch_geometric/data/in_memory_dataset.py in __init__(self, root, transform, pre_transform, pre_filter)
     55             :class:`~torch_geometric.data.Data` or
     56             :class:`~torch_geometric.data.HeteroData` object and returns a
---> 57             boolean value, indicating whether the data object should be
     58             included in the final dataset. (default: :obj:`None`)
     59         log (bool, optional): Whether to print any console output while

/usr/local/lib/python3.10/dist-packages/torch_geometric/data/dataset.py in __init__(self, root, transform, pre_transform, pre_filter)
     83         raise NotImplementedError
     84 
---> 85     def get(self, idx: int) -> BaseData:
     86         r"""Gets the data object at index :obj:`idx`."""
     87         raise NotImplementedError

/usr/local/lib/python3.10/dist-packages/torch_geometric/data/dataset.py in _download(self)
    144         Alias for :py:attr:`~num_node_features`.
    145         """
--> 146         return self.num_node_features
    147 
    148     @property

/usr/local/lib/python3.10/dist-packages/torch_geometric/datasets/qm9.py in download(self)
    154                          force_reload=force_reload)
    155         self.load(self.processed_paths[0])
--> 156 
    157     def mean(self, target: int) -> float:
    158         y = torch.cat([self.get(i).y for i in range(len(self))], dim=0)

/usr/local/lib/python3.10/dist-packages/torch_geometric/data/download.py in download_url(url, folder, log)
     32         filename = filename if filename[0] == '?' else filename.split('?')[0]
     33 
---> 34     path = osp.join(folder, filename)
     35 
     36     if fs.exists(path):  # pragma: no cover

/usr/lib/python3.10/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    214     else:
    215         opener = _opener
--> 216     return opener.open(url, data, timeout)
    217 
    218 def install_opener(opener):

/usr/lib/python3.10/urllib/request.py in open(self, fullurl, data, timeout)
    523         for processor in self.process_response.get(protocol, []):
    524             meth = getattr(processor, meth_name)
--> 525             response = meth(req, response)
    526 
    527         return response

/usr/lib/python3.10/urllib/request.py in http_response(self, request, response)
    632         # request was successfully received, understood, and accepted.
    633         if not (200 <= code < 300):
--> 634             response = self.parent.error(
    635                 'http', request, response, code, msg, hdrs)
    636 

/usr/lib/python3.10/urllib/request.py in error(self, proto, *args)
    561         if http_err:
    562             args = (dict, 'default', 'http_error_default') + orig_args
--> 563             return self._call_chain(*args)
    564 
    565 # XXX probably also want an abstract factory that knows when it makes

/usr/lib/python3.10/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    494         for handler in handlers:
    495             func = getattr(handler, meth_name)
--> 496             result = func(*args)
    497             if result is not None:
    498                 return result

/usr/lib/python3.10/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    641 class HTTPDefaultErrorHandler(BaseHandler):
    642     def http_error_default(self, req, fp, code, msg, hdrs):
--> 643         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    644 
    645 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 403: Forbidden

Versions

Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: 14.0.0-1ubuntu1.1
CMake version: version 3.27.9
Libc version: glibc-2.35

Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.1.85+-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 12.2.140
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: Tesla T4
Nvidia driver version: 535.104.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.6
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU @ 2.00GHz
CPU family: 6
Model: 85
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
Stepping: 3
BogoMIPS: 4000.30
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32 KiB (1 instance)
L1i cache: 32 KiB (1 instance)
L2 cache: 1 MiB (1 instance)
L3 cache: 38.5 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0,1
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable; SMT Host state unknown
Vulnerability Meltdown: Vulnerable
Vulnerability Mmio stale data: Vulnerable
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Vulnerable
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2: Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Not affected; BHI: Vulnerable (Syscall hardening enabled)
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable

Versions of relevant libraries:
[pip3] numpy==1.25.2
[pip3] torch==2.3.0+cu121
[pip3] torch-geometric==2.6.0
[pip3] torch-scatter==2.1.2
[pip3] torch-sparse==0.6.18
[pip3] torchaudio==2.3.0+cu121
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.18.0
[pip3] torchvision==0.18.0+cu121
[pip3] triton==2.3.0
[conda] Could not collect

rusty1s commented 3 weeks ago

It looks like downloading the files failed in your case. Can you remove the dataset root directory (./qm9 in your example) and try again? It works fine for me.
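
For reference, the "remove the root directory and try again" step could look roughly like the sketch below. The helper names `reset_dataset_root` and `reload_qm9` are made up for illustration and are not part of PyG; the only PyG call used is the `QM9(root)` constructor from the report above. This only clears a partial or stale download. If the file host keeps answering with 403, deleting the directory will not change that.

```python
import shutil
from pathlib import Path


def reset_dataset_root(root: str) -> bool:
    """Delete the dataset root so the next load starts from a clean state.

    Hypothetical helper (not a PyG function). Returns True if a directory
    was removed and False if there was nothing to remove.
    """
    path = Path(root)
    if not path.is_dir():
        return False
    shutil.rmtree(path)  # removes raw/ and processed/ along with the root
    return True


def reload_qm9(root: str = "./qm9"):
    """Clear any leftover files under `root`, then construct QM9 again.

    Hypothetical helper. PyG is imported lazily so that
    reset_dataset_root() stays usable without torch_geometric installed.
    """
    from torch_geometric.datasets import QM9

    reset_dataset_root(root)
    return QM9(root)  # triggers a fresh download and processing run
```

Calling `dataset = reload_qm9('./qm9')` in the same notebook would then retry the download from scratch.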