Ennosigaeon / xautoml

XAutoML: A Visual Analytics Tool for Understanding and Validating Automated Machine Learning
BSD 3-Clause "New" or "Revised" License
32 stars, 7 forks

Can't initialize openml_task #4

Closed: SeibertronSS closed this issue 1 year ago

SeibertronSS commented 1 year ago

I have a problem running auto-sklearn.ipynb from the examples: execution gets stuck when calling openml_task. The full traceback follows:

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Cell In [7], line 3
      1 from xautoml.util.datasets import openml_task
----> 3 X_train, y_train = openml_task(31, 0, train=True)
      4 X_train

File ~/miniconda3/envs/kf/lib/python3.9/site-packages/xautoml/util/datasets.py:15, in openml_task(task, fold, train, test)
     12     raise ValueError('Please set either train or test to True')
     14 # noinspection PyTypeChecker
---> 15 task: OpenMLClassificationTask = openml.tasks.get_task(task)
     16 train_indices, test_indices = task.get_train_test_split_indices(fold=fold)
     18 X, y = task.get_X_and_y(dataset_format='dataframe')

File ~/miniconda3/envs/kf/lib/python3.9/site-packages/openml/tasks/functions.py:361, in get_task(task_id, download_data, download_qualities)
    359 try:
    360     task = _get_task_description(task_id)
--> 361     dataset = get_dataset(task.dataset_id, download_data, download_qualities=download_qualities)
    362     # List of class labels availaible in dataset description
    363     # Including class labels as part of task meta data handles
    364     #   the case where data download was initially disabled
    365     if isinstance(task, (OpenMLClassificationTask, OpenMLLearningCurveTask)):

File ~/miniconda3/envs/kf/lib/python3.9/site-packages/openml/datasets/functions.py:432, in get_dataset(dataset_id, download_data, version, error_if_multiple, cache_format, download_qualities)
    430 if "oml:minio_url" in description and download_data:
...
--> 704         return self._sock.recv_into(b)
    705     except timeout:
    706         self._timeout_occurred = True