snap-stanford / ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning
https://ogb.stanford.edu
MIT License

ogbn-arxiv too slow to download via PygNodePropPredDataset or directly through the browser #463

Closed gabitoju closed 7 months ago

gabitoju commented 7 months ago

Hi,

Thank you for providing these wonderful datasets and libraries for us to use in our GDL models.

I've been using ogbn-arxiv for a while now, but in the last couple of days I've been having problems downloading the zip file, both from Colab and directly from my browser.

When I run this:

dataset = PygNodePropPredDataset(name='ogbn-arxiv', transform=T.ToSparseTensor())

I get these messages:

Will you update the dataset now? (y/N)
N
Downloading http://snap.stanford.edu/ogb/data/nodeproppred/arxiv.zip

  0%|          | 0/81 [00:00<?, ?it/s]

And it just gets stuck there. The same thing happens when I try to download the zip file directly via the URL.

I'm using Python 3.10.12 (I believe it's Colab default), OGB 1.3.6 and PyG 2.4.0.

I've also noticed that the server at snap.stanford.edu is responding very slowly and takes ages to serve even the most basic requests.

Regards

Rhett-Ying commented 7 months ago

I also got stuck downloading with dataset = DglNodePropPredDataset("ogbn-arxiv"). This has been happening since this Friday. Sometimes the download succeeds, but most of the time it fails.

keitabroadwater commented 7 months ago

I'm having the same issue with 'products'. I cannot download the dataset either from within my environment or directly.

I am also getting a strange error below.

[Screenshot 2023-11-26 at 8:02:34 PM]

sobhanAhmadian commented 7 months ago

I have the same issue. I have to run the code several times before it maybe completes once.

Anamisu commented 7 months ago

I face the same issue.

weihua916 commented 7 months ago

Hi! Apologies for the trouble.

The download speed is normal in my environment (see below), but it may be slow in yours. If you stop the download before the zipped folder has been completely downloaded, you will encounter an issue like the one reported here: https://github.com/snap-stanford/ogb/issues/463#issuecomment-1827090522 If so, you will need to manually remove the half-downloaded zipped folder and rerun the code until the download finishes completely (please be patient here...). The error can also happen if you do not have enough disk space. If so, you will need to increase your disk space so that the file can be fully downloaded.

>>> dataset = PygNodePropPredDataset(name='ogbn-arxiv')
Downloading http://snap.stanford.edu/ogb/data/nodeproppred/arxiv.zip
Downloaded 0.08 GB: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 81/81 [00:12<00:00,  6.28it/s]
Extracting dataset/arxiv.zip
Processing...
Loading necessary files...
This might take a while.
Processing graphs...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 54471.48it/s]
Converting graphs into PyG objects...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 4116.10it/s]
Saving...
Done!
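
The cleanup described above (remove a half-downloaded zip, make sure there is enough disk space, then rerun) could be sketched roughly as follows. This is only an illustration, not part of OGB: the helper names remove_partial_zip and has_free_space are hypothetical, the path dataset/arxiv.zip is taken from the log above, and the sketch assumes an interrupted download leaves a file that Python's zipfile module does not recognize as a complete archive.

```python
import shutil
import zipfile
from pathlib import Path


def remove_partial_zip(zip_path):
    """Delete zip_path if it exists but is not a complete zip archive.

    Hypothetical helper (not an OGB function).
    Returns True if a file was removed, False otherwise.
    """
    path = Path(zip_path)
    if not path.exists():
        return False
    # A truncated download normally lacks the end-of-archive record,
    # so zipfile.is_zipfile() returns False for it.
    if zipfile.is_zipfile(path):
        return False
    path.unlink()
    return True


def has_free_space(directory, required_bytes):
    """Return True if the filesystem holding `directory` has at least
    `required_bytes` free. Hypothetical helper (not an OGB function)."""
    return shutil.disk_usage(directory).free >= required_bytes


if __name__ == "__main__":
    # Path taken from the "Extracting dataset/arxiv.zip" line in the log.
    if remove_partial_zip("dataset/arxiv.zip"):
        print("Removed a half-downloaded zip; rerun the dataset loader.")
    # The log reports a 0.08 GB download; extraction needs more, so the
    # 1 GB threshold here is only an assumed, generous margin.
    if not has_free_space(".", 1_000_000_000):
        print("Less than 1 GB free; the download or extraction may fail.")
```

After running this, constructing the dataset again should trigger a fresh download. A complete zip is left untouched, so the check is safe to run before every attempt.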

Rhett-Ying commented 7 months ago

Downloading works well for me now, even though I didn't manually remove the half-downloaded zipped folder.

Anamisu commented 7 months ago

Works for me as well. I had to remove the half-downloaded zipped files.

gabitoju commented 7 months ago

Now it's working fine. I did not remove the zip file and it still worked. Thank you @weihua916 for the tips!