snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

Can't load ogbn-arxiv dataset #423

Closed: haoyuhan1 closed this issue 1 year ago

haoyuhan1 commented 1 year ago

Hi,

I reinstalled my environment and found I can't load the ogbn-arxiv dataset. Could you please help me solve this issue?

Python: 3.9
Torch: 2.0.0 with CUDA 11.8
PyTorch Geometric: 2.3.0
ogb: 1.3.5

dataset = PygNodePropPredDataset(name='ogbn-arxiv')
Processing...
AttributeError                            Traceback (most recent call last)
Cell In[2], line 1
----> 1 dataset = PygNodePropPredDataset(name='ogbn-arxiv')

File /lib/python3.9/site-packages/ogb/nodeproppred/dataset_pyg.py:68, in PygNodePropPredDataset.__init__(self, name, root, transform, pre_transform, meta_dict)
     65 self.is_hetero = self.meta_info['is hetero'] == 'True'
     66 self.binary = self.meta_info['binary'] == 'True'
---> 68 super(PygNodePropPredDataset, self).__init__(self.root, transform, pre_transform)
     69 self.data, self.slices = torch.load(self.processed_paths[0])

File /lib/python3.9/site-packages/torch_geometric/data/in_memory_dataset.py:57, in InMemoryDataset.__init__(self, root, transform, pre_transform, pre_filter, log)
     49 def __init__(
     50     self,
     51     root: Optional[str] = None,
   (...)
     55     log: bool = True,
     56 ):
---> 57     super().__init__(root, transform, pre_transform, pre_filter, log)
     58     self._data = None
     59     self.slices = None

File lib/python3.9/site-packages/torch_geometric/data/dataset.py:97, in Dataset.__init__(self, root, transform, pre_transform, pre_filter, log)
     94     self._download()
     96 if self.has_process:
---> 97     self._process()

File lib/python3.9/site-packages/torch_geometric/data/dataset.py:230, in Dataset._process(self)
    227     print('Processing...', file=sys.stderr)
    229 makedirs(self.processed_dir)
--> 230 self.process()
    232 path = osp.join(self.processed_dir, 'pre_transform.pt')
    233 torch.save(_repr(self.pre_transform), path)

File lib/python3.9/site-packages/ogb/nodeproppred/dataset_pyg.py:147, in PygNodePropPredDataset.process(self)
    145     additional_edge_files = []
    146 else:
--> 147     additional_edge_files = self.meta_info['additional edge files'].split(',')
    149 if self.is_hetero:
    150     data = read_heterograph_pyg(self.raw_dir, add_inverse_edge = add_inverse_edge, additional_node_files = additional_node_files, additional_edge_files = additional_edge_files, binary=self.binary)[0]

AttributeError: 'float' object has no attribute 'split'

weihua916 commented 1 year ago

Hi! Thanks for reporting. The issue was caused by the pandas 2.0 update. Can you update the ogb package to 1.3.6 and try again?
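
For readers hitting the same traceback: the error message suggests that the `additional edge files` metadata cell came back as a float (most likely NaN, which is how pandas represents a missing cell) rather than as a string. As far as I can tell, pandas 2.0 began treating the literal text `None` as a missing value in `read_csv`, which would explain this. The sketch below uses only the standard library to reproduce the failure and show a defensive parse. The helper `parse_file_list` is hypothetical and is not part of ogb; it only illustrates the idea.

```python
import math


def parse_file_list(value):
    """Turn a comma-separated metadata cell into a list of file names.

    Hypothetical helper, not part of ogb. It treats the string 'None',
    an empty string, Python None, and a float NaN (how pandas represents
    a missing cell) as "no files".
    """
    if isinstance(value, float) and math.isnan(value):
        return []
    if value is None or value in ("", "None"):
        return []
    return str(value).split(",")


# A NaN cell reproduces the error from the traceback:
try:
    float("nan").split(",")
except AttributeError as exc:
    print(exc)  # 'float' object has no attribute 'split'

# The guarded helper handles each variant without raising.
print(parse_file_list(float("nan")))  # []
print(parse_file_list("None"))        # []
print(parse_file_list("a,b"))         # ['a', 'b']
```

Upgrading ogb as suggested above remains the simplest fix. If you read such a CSV yourself, passing `keep_default_na=False` to `pandas.read_csv` is one way to keep the literal string, though this thread does not confirm that ogb 1.3.6 does exactly that.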

haoyuhan1 commented 1 year ago

It works! Thank you Weihua.

Best, Haoyu

annawu2504 commented 2 months ago

> Hi! Thanks for reporting. The issue was caused by the pandas 2.0 update. Can you update the ogb package to 1.3.6 and try again?

Thank you so, so much! You are saving my life, buddy!