time-series-machine-learning / tsml-repo

Discussion, problems and donations of data hosted at
http://www.timeseriesclassification.com
GNU General Public License v3.0

[ISSUE] CharacterTrajectories dataset failed to load #92

Closed hadifawaz1999 closed 12 months ago

hadifawaz1999 commented 1 year ago

Loading the CharacterTrajectories dataset from timeseriesclassification.com with the aeon toolkit produces this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/aeon/datasets/_data_loaders.py", line 1285, in load_classification
    return _load_tsc_dataset(
  File "/usr/local/lib/python3.8/dist-packages/aeon/datasets/_data_loaders.py", line 466, in _load_tsc_dataset
    return _load_saved_dataset(
  File "/usr/local/lib/python3.8/dist-packages/aeon/datasets/_data_loaders.py", line 296, in _load_saved_dataset
    X, y, meta_data = load_from_tsfile(abspath, return_meta_data=True)
  File "/usr/local/lib/python3.8/dist-packages/aeon/datasets/_data_loaders.py", line 228, in load_from_tsfile
    data, y, meta_data = _load_data(file, meta_data)
  File "/usr/local/lib/python3.8/dist-packages/aeon/datasets/_data_loaders.py", line 168, in _load_data
    raise IOError(
OSError: Unequal length series, in case 159 meta data specifies all equal 0 but saw 916

For more information about the cause, see the following issue on aeon

TonyBagnall commented 12 months ago

I have fixed this, although it required changing the data. The original data is, strangely, of unequal length but also padded at the start and the end. When formatting it, we decided to remove the padding. However, the padding is not the same on every channel for a single case, so unpadding created series of different lengths within a single instance. We do not support that format. I have therefore reformatted the dataset to keep the original data from UCI, which is now downloadable from timeseriesclassification.com like so:

from aeon.datasets import load_classification
X, y, meta = load_classification("CharacterTrajectories")
print(meta)
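
For anyone wondering why removing the padding caused the problem, here is a minimal sketch with NumPy. The values are made up for illustration and are not taken from CharacterTrajectories. It assumes the padding is zeros and that each channel is trimmed on its own, which is the situation described above.

```python
import numpy as np

# Hypothetical single case with 3 channels, zero-padded to a common length of 8.
# These numbers are invented for illustration only.
case = np.array([
    [0.0, 0.0, 1.2, 0.7, -0.3, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.9, 0.4, 0.1, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.8, 0.6, 0.0, 0.0, 0.0],
])

# Strip leading and trailing zeros from each channel independently.
unpadded = [np.trim_zeros(channel, trim="fb") for channel in case]

# Because the amount of padding differs per channel, the channels of this
# one case no longer share a length.
lengths = [len(channel) for channel in unpadded]
equal_within_case = len(set(lengths)) == 1

print(lengths)            # [3, 5, 2]
print(equal_within_case)  # False
```

Keeping the padded series as they are avoids this, since every channel of a case then has the same length.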