norahollenstein / zuco-benchmark

ZuCo Reading Task Classification Benchmark using EEG and Eye-Tracking Data

Run benchmark.py on windows / NR and TSR with test data #3

Closed tkgesis closed 1 month ago

tkgesis commented 1 year ago

Thank you for the recent commit. With the fixed versions, the script can be run partially. Windows file systems cannot handle file names that contain colons. To run the script on Windows machines, one can change the timestamp format to one without colons, e.g. datetime.now().strftime("%a, %d %b %Y %H%M%S"), in https://github.com/norahollenstein/zuco-benchmark/blob/33d03062fb4cf88e34dc7fd53cebf7c93d1cf86d/src/data_helpers.py#L20
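A minimal sketch of the colon-free timestamp, for illustration only. The names WINDOWS_SAFE_FORMAT and safe_timestamp are hypothetical and not part of the repository; only the format string comes from the suggestion above.

```python
from datetime import datetime

# Format string without colons, so the result can be used in a Windows file name.
WINDOWS_SAFE_FORMAT = "%a, %d %b %Y %H%M%S"


def safe_timestamp(now=None):
    """Return a timestamp string that contains no colons (hypothetical helper)."""
    now = now or datetime.now()
    return now.strftime(WINDOWS_SAFE_FORMAT)


# Example: build a log file name from the timestamp.
log_name = f"log_{safe_timestamp()}.txt"
print(log_name)
```

The day and month names depend on the locale, so the exact string may differ between machines, but it will not contain a colon.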

With the default config, the script extracts features for the training data, but fails on the items from the held-out set.

OSError: Unable to open file (unable to open file: name = '../data/test/resultsXBB_NR.mat', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

According to https://osf.io/d7frw/wiki/home/ , the test data contain NR and TSR merged and shuffled, but the code expects two different files, one for each condition.

This could probably be fixed around https://github.com/norahollenstein/zuco-benchmark/blob/33d03062fb4cf88e34dc7fd53cebf7c93d1cf86d/src/benchmark.py#L25
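One possible approach, sketched under assumptions: instead of hard-coding one file per condition, look up whatever result files actually exist for a subject. The helper names and the demo file name below are hypothetical; I do not know how the merged test files are actually named, so the pattern only assumes the "results<subject>" prefix seen in the error message.

```python
import tempfile
from pathlib import Path


def find_subject_files(data_dir, subject):
    """Return all .mat files in data_dir whose name starts with results<subject>."""
    return sorted(Path(data_dir).glob(f"results{subject}*.mat"))


def describe_subject_files(data_dir, subject):
    """Return the matching file names, or fail with a clear message if none exist."""
    files = find_subject_files(data_dir, subject)
    if not files:
        raise FileNotFoundError(
            f"No results file for subject {subject} in {data_dir}"
        )
    return [f.name for f in files]


# Demo with a temporary directory and a made-up file name.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "resultsXBB_demo.mat").touch()
    print(describe_subject_files(tmp, "XBB"))
```

This way the code works whether a subject has one merged file or separate NR and TSR files, and a missing file is reported before h5py tries to open it.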

tkgesis commented 1 year ago

When running benchmark_baseline.py with feature_sets = ["fixation_number", "omission_rate", "reading_speed", 'sent_gaze', "mean_sacc_dur", "max_sacc_velocity", "mean_sacc_velocity", "max_sacc_dur", "max_sacc_amp", "mean_sacc_amp", 'sent_saccade', 'sent_gaze_sacc', "theta_mean", "alpha_mean", "beta_mean", "gamma_mean", "eeg_means", "sent_gaze_eeg_means", "electrode_features_theta", "electrode_features_alpha", "electrode_features_beta", "electrode_features_gamma", "electrode_features_all", "electrode_gaze_sacc"]

the variable 'sent' is not assigned for XBB

  File "zuco-benchmark\src\benchmark_baseline.py", line 42, in get_or_extract_features
    fe.extract_sentence_features(subject, f, feature_set, features, "")
  File "zuco-benchmark\src\extract_features.py", line 221, in extract_sentence_features
    weighted_nFix = np.array(af['duration']).shape[0] / len(sent.split())
  UnboundLocalError: local variable 'sent' referenced before assignment
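For reference, this kind of error usually means the variable is only assigned inside a branch or loop that never ran. The toy functions below are hypothetical and are not the code from extract_features.py; they only reproduce the pattern and show one way to guard against it.

```python
def words_ratio_buggy(conditions, n_fixations):
    """Toy reproduction: sent is only assigned when a condition matches."""
    for cond in conditions:
        if cond == "NR":
            sent = "the quick brown fox"
    # If no condition matched, sent was never assigned and this line fails.
    return n_fixations / len(sent.split())


def words_ratio_guarded(conditions, n_fixations):
    """Same logic, but with an explicit default and a readable error."""
    sent = None
    for cond in conditions:
        if cond == "NR":
            sent = "the quick brown fox"
    if sent is None:
        raise ValueError(f"No sentence found for conditions {conditions}")
    return n_fixations / len(sent.split())


try:
    words_ratio_buggy(["merged"], 8)
except UnboundLocalError as err:
    print("buggy version:", err)

try:
    words_ratio_guarded(["merged"], 8)
except ValueError as err:
    print("guarded version:", err)
```

My guess is that the sentence lookup for XBB takes a path that does not match the held-out data, but I have not verified this in the code.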

While the training data [Y..] are being processed, the message "electrode_gaze_sacc IS NOT A VALID FEATURE SET." appears several times.

samuki commented 1 year ago

Hi, thank you for the notification! The paths for the heldout data in benchmark.py should be updated now.

samuki commented 11 months ago

Hello, I'm sorry for the long wait time. You should now have access to all the features outlined in the configuration file. Additionally, we've updated the code for automatically generating submission files for each feature set. These files can be directly submitted to the challenge.