aeye-lab / pymovements

A python package for processing eye movement data
https://pymovements.readthedocs.io
MIT License

load() broken for gazebase in pymovements 0.16.0 #483

Closed SiQube closed 1 year ago

SiQube commented 1 year ago

Current Behavior

Using pymovements 0.16.0:

$ pip freeze | grep pymovements
pymovements==0.16.0
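The same version information can also be collected from within Python, which makes it easier to attach to a report. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and polars is included only because the tracebacks in this report pass through it.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the packages that show up in the tracebacks of this issue.
    for name in ("pymovements", "polars"):
        print(f"{name}: {installed_version(name)}")
```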

tl;dr: load() is broken for GazeBase, GazeBaseVR, and JuDo1000; the failure appears to be in the renaming of the columns.

Using a minimal script t.py to reproduce the issue:

import pymovements as pm
dataset = pm.Dataset('GazeBase', path='data/GazeBase')

dataset.download()
dataset.load()

I get the following error message:

$ python t.py 
No download necessary
Load
  0%|                                                                                                                                                                                                  | 0/12334 [00:00<?, ?it/s]
Traceback (most recent call last):
    dataset.load()
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/dataset/dataset.py", line 124, in load
    self.load_gaze_files(
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/dataset/dataset.py", line 189, in load_gaze_files
    self.gaze = dataset_files.load_gaze_files(
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/dataset/dataset_files.py", line 274, in load_gaze_files
    gaze_df = GazeDataFrame(
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/gaze/gaze_dataframe.py", line 173, in __init__
    self.frame = self.frame.rename({time_column: 'time'})
  File "/home/sq/.local/lib/python3.9/site-packages/polars/dataframe/frame.py", line 3703, in rename
    return self.lazy().rename(mapping).collect(no_optimization=True)
  File "/home/sq/.local/lib/python3.9/site-packages/polars/utils/deprecation.py", line 93, in wrapper
    return function(*args, **kwargs)
  File "/home/sq/.local/lib/python3.9/site-packages/polars/lazyframe/frame.py", line 1695, in collect
    return wrap_df(ldf.collect())
exceptions.SchemaFieldNotFoundError: n

Similarly, after adjusting the script t.py to

import pymovements as pm
dataset = pm.Dataset('GazeBaseVR', path='data/GazeBaseVR')

dataset.download()
dataset.load()

I get the same error:

$ python t.py 
Downloading https://figshare.com/ndownloader/files/38844024 to data/GazeBaseVR/downloads/gazebasevr.zip
gazebasevr.zip: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.30G/2.30G [00:27<00:00, 89.0MB/s]
Checking integrity of gazebasevr.zip
Extracting gazebasevr.zip to data/GazeBaseVR/raw
  0%|                                                                                                                                                                                                   | 0/5020 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/raid/projects/reich3/t.py", line 5, in <module>
    dataset.load()
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/dataset/dataset.py", line 124, in load
    self.load_gaze_files(
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/dataset/dataset.py", line 189, in load_gaze_files
    self.gaze = dataset_files.load_gaze_files(
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/dataset/dataset_files.py", line 274, in load_gaze_files
    gaze_df = GazeDataFrame(
  File "/home/sq/.local/lib/python3.9/site-packages/pymovements/gaze/gaze_dataframe.py", line 173, in __init__
    self.frame = self.frame.rename({time_column: 'time'})
  File "/home/sq/.local/lib/python3.9/site-packages/polars/dataframe/frame.py", line 3703, in rename
    return self.lazy().rename(mapping).collect(no_optimization=True)
  File "/home/sq/.local/lib/python3.9/site-packages/polars/utils/deprecation.py", line 93, in wrapper
    return function(*args, **kwargs)
  File "/home/sq/.local/lib/python3.9/site-packages/polars/lazyframe/frame.py", line 1695, in collect
    return wrap_df(ldf.collect())
exceptions.SchemaFieldNotFoundError: n

Note that this run downloads the dataset. For GazeBase I first got the error with an already downloaded copy ("No download necessary" above), but it also happens after a new download.

For JuDo1000 we get a different error.

Using t.py as follows:

import pymovements as pm
dataset = pm.Dataset('JuDo1000', path='data/JuDo1000')

dataset.download()
dataset.load()

We get:

$ python t.py 
Downloading https://osf.io/download/4wy7s/ to data/JuDo1000/downloads/JuDo1000.zip
JuDo1000.zip: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.75G/1.75G [00:15<00:00, 125MB/s]
Checking integrity of JuDo1000.zip
Extracting JuDo1000.zip to data/JuDo1000/raw
  0%|                                                                                                                                                                                                    | 0/600 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/raid/projects/reich3/t.py", line 5, in <module>
    dataset.load()
  File "/home/reich3/.local/lib/python3.9/site-packages/pymovements/dataset/dataset.py", line 124, in load
    self.load_gaze_files(
  File "/home/reich3/.local/lib/python3.9/site-packages/pymovements/dataset/dataset.py", line 189, in load_gaze_files
    self.gaze = dataset_files.load_gaze_files(
  File "/home/reich3/.local/lib/python3.9/site-packages/pymovements/dataset/dataset_files.py", line 274, in load_gaze_files
    gaze_df = GazeDataFrame(
  File "/home/reich3/.local/lib/python3.9/site-packages/pymovements/gaze/gaze_dataframe.py", line 177, in __init__
    _check_component_columns(
  File "/home/reich3/.local/lib/python3.9/site-packages/pymovements/gaze/gaze_dataframe.py", line 444, in _check_component_columns
    raise pl.exceptions.ColumnNotFoundError(
exceptions.ColumnNotFoundError: column x_left from pixel_columns is not available in dataframe

It seems that the column renaming does not work properly when pymovements is installed from pip.
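To narrow this down, it may help to compare the columns that are actually present with the columns a rename expects. The following is a minimal sketch in plain Python, not pymovements' or polars' own logic; `missing_rename_sources` is a hypothetical helper, the mapping `{'n': 'time'}` is inferred from the `rename({time_column: 'time'})` call and the `n` in the error message above, and the column list is invented for illustration.

```python
from typing import Dict, Iterable, List


def missing_rename_sources(columns: Iterable[str], mapping: Dict[str, str]) -> List[str]:
    """Return the keys of `mapping` (old names) that are not present in `columns`."""
    available = set(columns)
    return [old for old in mapping if old not in available]


if __name__ == "__main__":
    # Assumption: invented column names standing in for a frame whose
    # time column has already been renamed (or was never called 'n').
    columns = ["time", "x", "y"]
    mapping = {"n": "time"}
    print(missing_rename_sources(columns, mapping))  # prints ['n']
```

With a polars DataFrame, `frame.columns` could be passed as the first argument to inspect the frame right before the failing rename.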

After installing pymovements==0.15.0

$ pip install pymovements==0.15.0

everything works again, as demonstrated by this script s.py:

import pymovements as pm

for dataset_name in ['JuDo1000', 'GazeBase', 'GazeBaseVR']:
    dataset = pm.Dataset(dataset_name, path=f'data/{dataset_name}')

    dataset.download()
    dataset.load()

$ python s.py 
Using already downloaded and verified file: data/JuDo1000/downloads/JuDo1000.zip
Extracting JuDo1000.zip to data/JuDo1000/raw
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 600/600 [00:08<00:00, 73.05it/s]
Using already downloaded and verified file: data/GazeBase/downloads/GazeBase_v2_0.zip
Extracting GazeBase_v2_0.zip to data/GazeBase/raw
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12334/12334 [01:56<00:00, 105.82it/s]
Using already downloaded and verified file: data/GazeBaseVR/downloads/gazebasevr.zip
Extracting gazebasevr.zip to data/GazeBaseVR/raw
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5020/5020 [00:48<00:00, 103.29it/s]

The ToyDataset works fine in both versions.
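To compare versions without the run stopping at the first failing dataset, the loop from s.py can be wrapped so that each dataset's outcome is recorded. This is a sketch under stated assumptions: `check_loading` and `load_with_pymovements` are hypothetical helpers, and only the `pm.Dataset(...)`, `download()` and `load()` calls are taken from the scripts in this report.

```python
from typing import Callable, Dict, Iterable, Optional


def check_loading(
    names: Iterable[str],
    load: Callable[[str], object],
) -> Dict[str, Optional[str]]:
    """Call `load(name)` for each name; map it to None on success or to the error text."""
    results: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            load(name)
        except Exception as exc:  # broad on purpose: record any failure and continue
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            results[name] = None
    return results


def load_with_pymovements(name: str) -> object:
    """Download and load one dataset, mirroring s.py from this report."""
    import pymovements as pm  # imported lazily so check_loading works without it

    dataset = pm.Dataset(name, path=f"data/{name}")
    dataset.download()
    dataset.load()
    return dataset
```

Calling `check_loading(['JuDo1000', 'GazeBase', 'GazeBaseVR'], load_with_pymovements)` once per installed version would give a dictionary per version that can be compared directly.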

Expected Behavior

The dataset is available after loading.

Minimum acceptance criteria

Failure Information (for bugs)

see above

Steps to Reproduce

see above
