openml / openml-data

For tracking issues related to OpenML datasets

Couldn't download dataset #64

Open uEternali opened 3 months ago

uEternali commented 3 months ago

I cannot connect to the dataset download site "openml1.win.tue.nl:9090".

In [6]: !curl -I api.openml.org
HTTP/1.1 301 Moved Permanently
Connection: close
Content-Type: text/html; charset=iso-8859-1
Date: Mon, 08 Apr 2024 07:24:41 GMT
Location: https://api.openml.org/
Server: Apache/2.4.29 (Ubuntu)

In [7]: dataset = openml.datasets.get_dataset(44313)
WARNING:root:Received uncompressed content from OpenML for https://api.openml.org/data/v1/download/22111025/Meta_Album_DOG_Micro.arff.
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f0a42408cd0>, 'Connection to openml1.win.tue.nl timed out. (connect timeout=300)')': /datasets/0004/44313/dataset_44313.pq
WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f0a424088e0>, 'Connection to openml1.win.tue.nl timed out. (connect timeout=300)')': /datasets/0004/44313/dataset_44313.pq
WARNING:urllib3.connectionpool:Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f0a42408e50>, 'Connection to openml1.win.tue.nl timed out. (connect timeout=300)')': /datasets/0004/44313/dataset_44313.pq
WARNING:urllib3.connectionpool:Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f0a42439d50>, 'Connection to openml1.win.tue.nl timed out. (connect timeout=300)')': /datasets/0004/44313/dataset_44313.pq

uEternali commented 3 months ago

In [2]: dataset = openml.datasets.get_dataset('44313', download_data=True, download_all_files=True)
<ipython-input-57-ed1b9b7180d4>:1: FutureWarning: Starting from Version 0.15 `download_data`, `download_qualities`, and `download_features_meta_data` will all be ``False`` instead of ``True`` by default to enable lazy loading. To disable this message until version 0.15 explicitly set `download_data`, `download_qualities`, and `download_features_meta_data` to a bool while calling `get_dataset`.
  dataset = openml.datasets.get_dataset('44313', download_data=True, download_all_files=True)
<ipython-input-57-ed1b9b7180d4>:1: FutureWarning: ``download_all_files`` is experimental and is likely to break with new releases.
  dataset = openml.datasets.get_dataset('44313', download_data=True, download_all_files=True)
WARNING:urllib3.connectionpool:Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f056bfa5780>, 'Connection to openml1.win.tue.nl timed out. (connect timeout=300)')': /datasets?delimiter=&encoding-type=url&list-type=2&max-keys=1000&prefix=0004%2F44313
WARNING:urllib3.connectionpool:Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f056bfa50f0>, 'Connection to openml1.win.tue.nl timed out. (connect timeout=300)')': /datasets?delimiter=&encoding-type=url&list-type=2&max-keys=1000&prefix=0004%2F44313

Downloading the auxiliary files (e.g., Meta-Album) from the OpenML server also fails.
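
To tell a blocked or unreachable host apart from a problem in the client library, the connection can be tested outside of `openml`. The following is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper written for this thread, not part of openml-python, and the 5-second default timeout is an arbitrary choice.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers timeouts, refused connections, and DNS failures.
        return False
```

Calling `can_connect("openml1.win.tue.nl", 9090)` with the host and port from the report above would return `False` if the connection times out in the same way as in the logs, which would point to a network or server-side cause rather than the `get_dataset` call itself.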

PGijsbers commented 2 months ago

Hi, sorry for the slow reply. I think something went wrong due to the issue being closed temporarily. Does the problem persist for you? I cannot reproduce it.

I do know that in the period you opened that issue we were experiencing server loads heavier than usual. If you can't reproduce it, my guess is that this was a symptom of that. We are looking into how we can improve the resilience of the server in the future.
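
If the timeouts were indeed a transient symptom of server load, one possible client-side workaround is to retry the download with increasing pauses between attempts. The sketch below is only an illustration under that assumption: `retry_with_backoff` and its parameters are hypothetical and not part of openml-python, and it catches every exception for brevity, where real code would likely narrow this to connection errors.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() up to `attempts` times, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

It could be used as `retry_with_backoff(lambda: openml.datasets.get_dataset(44313))`. The `sleep` parameter exists only so the waiting behaviour can be replaced in tests.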