openml / openml-data

For tracking issues related to OpenML datasets

FOREX_eurrub-hour-Close fails to load #23

Open mitar opened 4 years ago

mitar commented 4 years ago

Trying to load this dataset using the Python API (openml==0.10.1), I get the following error:

  File ".../site-packages/openml/datasets/dataset.py", line 574, in get_data
    data, categorical, attribute_names = self._load_data()
  File ".../site-packages/openml/datasets/dataset.py", line 438, in _load_data
    self.data_pickle_file = self._create_pickle_in_cache(self.data_file)
  File ".../site-packages/openml/datasets/dataset.py", line 421, in _create_pickle_in_cache
    X, categorical, attribute_names = self._parse_data_from_arff(data_file)
  File ".../site-packages/openml/datasets/dataset.py", line 314, in _parse_data_from_arff
    data = self._get_arff(self.format)
  File ".../site-packages/openml/datasets/dataset.py", line 293, in _get_arff
    return decode_arff(fh)
  File ".../site-packages/openml/datasets/dataset.py", line 286, in decode_arff
    return_type=return_type)
  File ".../site-packages/arff.py", line 895, in decode
    raise e
  File ".../site-packages/arff.py", line 892, in decode
    matrix_type=return_type)
  File ".../site-packages/arff.py", line 822, in _decode
    attr = self._decode_attribute(row)
  File ".../site-packages/arff.py", line 764, in _decode_attribute
    raise BadAttributeType()
arff.BadAttributeType: Bad @ATTRIBUTE type, at line 2.

Python 3.6 on Linux.

mitar commented 4 years ago

I think this one is the same: https://www.openml.org/d/1414

I suspect it is because of the date column type. Does the Python API not provide a way to read such datasets?
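
One way to check this suspicion without going through the failing decoder is to scan the ARFF header for attributes declared with the `date` type. The following is a minimal standard-library sketch, not part of openml or liac-arff: `find_date_attributes` is a hypothetical helper, and the sample header is made up for illustration and is not the real content of either dataset.

```python
import re

# Matches "@ATTRIBUTE <name> date [<format>]" case-insensitively.
# The attribute name may be bare or quoted.
_DATE_ATTR = re.compile(
    r"""^@attribute\s+(?P<name>'[^']*'|"[^"]*"|\S+)\s+date\b\s*(?P<fmt>.*)$""",
    re.IGNORECASE,
)


def find_date_attributes(arff_text):
    """Return (line_number, name, format) for each date attribute in the header."""
    found = []
    for lineno, line in enumerate(arff_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.lower().startswith("@data"):
            break  # the header ends where the data section starts
        match = _DATE_ATTR.match(stripped)
        if match:
            name = match.group("name").strip("'\"")
            # The format is optional; report None when it is absent.
            fmt = match.group("fmt").strip().strip("'\"") or None
            found.append((lineno, name, fmt))
    return found


# Made-up header for illustration only (assumption, not the real file).
SAMPLE = """@RELATION example
@ATTRIBUTE Timestamp DATE "yyyy-MM-dd HH:mm:ss"
@ATTRIBUTE Close NUMERIC
@DATA
"2018-01-01 00:00:00",68.5
"""

print(find_date_attributes(SAMPLE))
```

If the helper reports a date attribute on the same line number that the `BadAttributeType` message names, that would support the date column being the cause.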

amueller commented 4 years ago

Indeed. I think we'd be happy to take a PR.

mitar commented 4 years ago

What should it be converted to in a pandas DataFrame? A string column, or something already parsed?
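
To make the two options concrete, here is a rough standard-library sketch of what "string column" versus "already parsed" could look like. It assumes the values follow the ARFF default date pattern (`yyyy-MM-dd'T'HH:mm:ss`) and that `?` marks a missing value. `parse_date_column` is a hypothetical helper and the values are made up. A dataset that declares its own format would need that pattern translated first.

```python
from datetime import datetime

# Assumption: values use the ARFF default date pattern yyyy-MM-dd'T'HH:mm:ss,
# written here as the equivalent strptime format.
ARFF_DEFAULT_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_date_column(values, fmt=ARFF_DEFAULT_FORMAT):
    """Parse raw ARFF date strings; '?' (ARFF missing value) becomes None."""
    return [None if v == "?" else datetime.strptime(v, fmt) for v in values]


# Made-up values for illustration only.
raw = ["2018-01-01T00:00:00", "?", "2018-01-01T01:00:00"]

as_strings = list(raw)                 # option 1: keep the raw strings
as_datetimes = parse_date_column(raw)  # option 2: parse up front

# Parsed values support date arithmetic directly; strings do not.
print(as_datetimes[2] - as_datetimes[0])
```

In pandas, the parsed option would presumably correspond to a datetime column (for example via `pd.to_datetime`), while the string option leaves conversion to the user.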

mfeurer commented 4 years ago

There's a stalled PR to add support for the date column type to liac-arff: https://github.com/renatopp/liac-arff/pull/67