openml / OpenML

Open Machine Learning
https://openml.org
BSD 3-Clause "New" or "Revised" License

OpenMLError "Dataset with data_id X not found" #1092

Open hildeweerts opened 3 years ago

hildeweerts commented 3 years ago

We are using OpenML datasets in the Fairlearn project, but are running into issues with CI/internal testing due to "Dataset with data_id X not found" OpenMLError errors (see also https://github.com/fairlearn/fairlearn/issues/703).

Although it does not happen every time, it seems to be pretty common. Rerunning usually helps but it is not ideal. What is the recommended way to deal with this? I'd like to keep using OpenML for various reasons, but these errors may be a reason to move to a different dataset repository.

joaquinvanschoren commented 3 years ago

The SSL certificate expired while we were waiting for a new one. Hopefully fixed tomorrow.

We’re upgrading the dataset hosting; once that is done, those transient errors should no longer occur. For now, please add a small delay between dataset downloads in CI testing.
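One way to add such a delay, and to automate the "rerun" that usually helps, is a small retry wrapper around the download call. The following is only a sketch under stated assumptions: `fetch_with_retry` and its parameters are hypothetical names, not part of scikit-learn or OpenML, and catching every `Exception` is a deliberate simplification that a real test suite would probably narrow.

```python
import time


def fetch_with_retry(fetch, *args, retries=3, delay=2.0, sleep=time.sleep, **kwargs):
    """Call fetch(*args, **kwargs), retrying when it raises.

    Hypothetical helper: waits `delay` seconds before the first retry and
    doubles the wait after each further failure. If every attempt fails,
    the last exception is re-raised unchanged.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    wait = delay
    for attempt in range(1, retries + 1):
        try:
            return fetch(*args, **kwargs)
        except Exception:  # assumption: any failure is treated as transient
            if attempt == retries:
                raise
            sleep(wait)  # pause before the next attempt
            wait *= 2


# Possible usage in a CI test (requires scikit-learn and network access):
#   from sklearn.datasets import fetch_openml
#   data = fetch_with_retry(fetch_openml, data_id=1590, as_frame=True)
```

The `sleep` argument is injectable so the helper can be tested without actually waiting.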

SharmaRupali commented 3 years ago

Are these two issues because of this?

Not being able to access the data from a script. (Screenshot from 2021-03-24 21-42-49)

And not being able to see the list of datasets, tasks, or any other element. (Screenshot from 2021-03-24 21-42-05)

joaquinvanschoren commented 3 years ago

Yes, exactly.

joaquinvanschoren commented 3 years ago

The SSL issue is fixed now. Again, apologies for the inconvenience!

Please close the issue if everything is working for you again.

hildeweerts commented 3 years ago

Awesome, thanks!

rth commented 3 years ago

I'm still occasionally seeing this issue in CI when downloading https://www.openml.org/d/61. It is also not systematic and only happens from time to time.

Is there any migration work in progress that would explain that?

hildeweerts commented 3 years ago

Hi again! Unfortunately we are again experiencing some issues with retrieving datasets in our CI (see https://github.com/fairlearn/fairlearn/pull/732#issuecomment-824583523).

joaquinvanschoren commented 3 years ago

Hi Hilde, today I worked on some API bottlenecks, and I hope this resolves the issue to a large extent. Later this week I will look at load balancing between our servers. Can you give a bit more detail on what exactly you are doing to trigger this error? Then we could do more stress-testing ourselves.

hildeweerts commented 3 years ago

Hi @joaquinvanschoren - that sounds great!

We are using OpenML datasets in a few places in Fairlearn, including some functions to fetch datasets and examples in the documentation (https://github.com/fairlearn/fairlearn/search?p=1&q=openml). If I recall correctly, we mostly have issues when building the documentation in a CircleCI workflow. It usually works again after we rerun it, but you can imagine it can be a bit discouraging to see some of the checks turn red (particularly for first-time contributors who don't know what went wrong).
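A possible mitigation for documentation builds is to download each dataset once, up front, into a directory that the CI system caches between runs, so the examples themselves rarely need to contact the server. The sketch below assumes the fetch function accepts `data_id` and `data_home` keyword arguments, as scikit-learn's `fetch_openml` does; `warm_openml_cache` is a hypothetical helper name, and whether every request is then served from the local cache depends on the scikit-learn version in use.

```python
from pathlib import Path


def warm_openml_cache(data_ids, data_home, fetch):
    """Fetch each dataset once into data_home and report failures.

    Hypothetical helper: `fetch` is any callable that accepts `data_id`
    and `data_home` keyword arguments. Returns the list of ids that
    could not be fetched so the caller can decide how to react.
    """
    data_home = Path(data_home).expanduser()
    data_home.mkdir(parents=True, exist_ok=True)
    failed = []
    for data_id in data_ids:
        try:
            fetch(data_id=data_id, data_home=str(data_home))
        except Exception as exc:  # assumption: report and continue
            print(f"could not fetch data_id {data_id}: {exc}")
            failed.append(data_id)
    return failed


# Possible usage as a pre-build CI step (requires scikit-learn and network):
#   from sklearn.datasets import fetch_openml
#   failed = warm_openml_cache([1590], "~/scikit_learn_data", fetch_openml)
```

Passing the fetch function in as an argument keeps the helper independent of any particular library and makes it easy to exercise with a stub.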

This is some of the output of a recent failure (originally from https://github.com/fairlearn/fairlearn/pull/808 - but it might be rerun before you read this.)


Extension error:
Here is a summary of the problems encountered when running the examples

Unexpected failing examples:
/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/examples/plot_make_derived_metric.py failed leaving traceback:
Traceback (most recent call last):
  File "/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/examples/plot_make_derived_metric.py", line 50, in <module>
    data = fetch_openml(data_id=1590, as_frame=True)
  File "/usr/local/lib/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 847, in fetch_openml
    data_description = _get_data_description_by_id(data_id, data_home)
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 438, in _get_data_description_by_id
    url, error_message, data_home=data_home
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 179, in _get_json_content_from_openml_api
    raise OpenMLError(error_message)
sklearn.datasets._openml.OpenMLError: Dataset with data_id 1590 not found.

/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/examples/plot_quickstart.py failed leaving traceback:
Traceback (most recent call last):
  File "/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/examples/plot_quickstart.py", line 17, in <module>
    data = fetch_openml(data_id=1590, as_frame=True)
  File "/usr/local/lib/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 847, in fetch_openml
    data_description = _get_data_description_by_id(data_id, data_home)
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 438, in _get_data_description_by_id
    url, error_message, data_home=data_home
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 179, in _get_json_content_from_openml_api
    raise OpenMLError(error_message)
sklearn.datasets._openml.OpenMLError: Dataset with data_id 1590 not found.

/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/examples/plot_quickstart_selection_rate.py failed leaving traceback:
Traceback (most recent call last):
  File "/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/examples/plot_quickstart_selection_rate.py", line 13, in <module>
    data = fetch_adult(as_frame=True)
  File "/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/docs/../fairlearn/datasets/_fetch_adult.py", line 84, in fetch_adult
    return_X_y=return_X_y,
  File "/usr/local/lib/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 847, in fetch_openml
    data_description = _get_data_description_by_id(data_id, data_home)
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 438, in _get_data_description_by_id
    url, error_message, data_home=data_home
  File "/usr/local/lib/python3.7/site-packages/sklearn/datasets/_openml.py", line 179, in _get_json_content_from_openml_api
    raise OpenMLError(error_message)
sklearn.datasets._openml.OpenMLError: Dataset with data_id 1590 not found.

-------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/bin/sphinx-multiversion", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/sphinx_multiversion/main.py", line 352, in main
    subprocess.check_call(cmd, cwd=current_cwd, env=env)
  File "/usr/local/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '('/usr/local/bin/python', '-R', '-m', 'sphinx', '-D', 'smv_metadata_path=/tmp/tmpuazo8wqt/versions.json', '-D', 'smv_current_version=main', '-c', '/home/circleci/tmp-fairlearn/docs', '/tmp/tmpuazo8wqt/675e1646724fc1bfb9758221bf6ec6cc288c1fd9/docs', '/home/circleci/tmp-fairlearn/docs/_build/html/main')' returned non-zero exit status 2.
Traceback (most recent call last):
  File "scripts/build_documentation.py", line 83, in <module>
    main(sys.argv[1:])
  File "scripts/build_documentation.py", line 75, in main
    args.output_path])
  File "/home/circleci/tmp-fairlearn/scripts/_utils.py", line 30, in __exit__
    raise value
  File "scripts/build_documentation.py", line 75, in main
    args.output_path])
  File "/usr/local/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sphinx-multiversion', 'docs', 'docs/_build/html']' returned non-zero exit status 1.

Exited with code exit status 1

hildeweerts commented 3 years ago

Unfortunately we've recently been having this issue again in our CI (see https://github.com/fairlearn/fairlearn/issues/703). Is there anything going on that could have caused this, or are we dealing again with the generic problem of failing to connect to the server?

joaquinvanschoren commented 3 years ago

Hi Hilde,

Apologies again - we're doing a lot of work in the backend to set up Kubernetes and load balancing. I hope these issues will soon be a thing of the past!