snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Problem on loading "ogbg-moltox21" with virtual node #456

Open pricexu opened 11 months ago

pricexu commented 11 months ago

Hi,

Thanks for providing this very helpful dataset. I get an error when loading the "ogbg-moltox21" dataset with the `VirtualNode` transform. Could you take a look? The error can be reproduced with the following code:

```python
import torch_geometric.transforms as T
from ogb.graphproppred import PygGraphPropPredDataset
from torch_geometric.loader import DataLoader

name = 'ogbg-moltox21'
dataset = PygGraphPropPredDataset(root="data/"+name, name=name, transform=T.VirtualNode())
loader = DataLoader(dataset, batch_size=256, shuffle=True)
for data in loader:
    print(data.x.shape)
```

The error message is:

```
Traceback (most recent call last):
  File "/home/zhexu3/CausalGraphTransformer/test_loader.py", line 8, in <module>
    for data in loader:
  File "/home/zhexu3/ENTER/envs/torch39/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/zhexu3/ENTER/envs/torch39/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 671, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/zhexu3/ENTER/envs/torch39/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zhexu3/ENTER/envs/torch39/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/zhexu3/ENTER/envs/torch39/lib/python3.9/site-packages/torch_geometric/data/dataset.py", line 240, in __getitem__
    data = data if self.transform is None else self.transform(data)
  File "/home/zhexu3/ENTER/envs/torch39/lib/python3.9/site-packages/torch_geometric/transforms/virtual_node.py", line 37, in __call__
    new_type = edge_type.new_full((num_nodes, ), int(edge_type.max()) + 1)
RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
```

I am using python 3.9, pyg 2.2.0, and ogb 1.3.6. Let me know if you need more information to locate this error. I appreciate it very much!

pricexu commented 11 months ago

Some updates on this error: (1) The problem also occurs for ogbg-molpcba when the batch_size of the DataLoader is around 4000-5000. (2) However, for ogbg-molpcba with a batch_size of 256, the problem disappears.

weihua916 commented 10 months ago

Thanks for reporting. Seems like edge_type is empty. Maybe @rusty1s can take a look?
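
A minimal sketch of what an empty edge_type would look like, using plain PyTorch only. It assumes the failing graph has no edges (an edge_index of shape [2, 0]), which is a guess based on the traceback and not something confirmed in this thread. The helper `next_edge_type` and its `empty_value` fallback are hypothetical names for illustration, not part of PyG or OGB:

```python
import torch


def next_edge_type(edge_type: torch.Tensor, empty_value: int = 1) -> int:
    """Hypothetical helper: pick the type id for the virtual-node edges.

    Mirrors `int(edge_type.max()) + 1` from the traceback, but returns
    `empty_value` when there are no edges, since max() on an empty tensor
    raises. The fallback of 1 is an assumption: it matches the result for
    a graph whose existing edges all have type 0.
    """
    if edge_type.numel() == 0:
        return empty_value
    return int(edge_type.max()) + 1


# Assumed failing case: a graph without any edges.
edge_index = torch.empty((2, 0), dtype=torch.long)
row = edge_index[0]
edge_type = torch.zeros_like(row)  # empty tensor of shape [0]

# Calling max() without `dim` on an empty tensor raises a RuntimeError.
try:
    edge_type.max()
    raised = False
except RuntimeError:
    raised = True

print("max() raised on empty edge_type:", raised)
print("guarded value for empty graph:", next_edge_type(edge_type))
print("guarded value for types [0, 0, 1]:", next_edge_type(torch.tensor([0, 0, 1])))
```

If this is indeed the cause, checking `data.edge_index.size(1) == 0` on the raw dataset (without the transform) should show which graphs trigger the error.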