ltgoslo / axolotl24_shared_task

AXOLOTL-24 (Ascertain and eXplain Overhauls of the Lexicon Over Time at LChange'24): a shared task
https://github.com/ltgoslo/axolotl24_shared_task
GNU General Public License v3.0

problems with converting the German test set #2

Open nvanva opened 1 month ago

nvanva commented 1 month ago

Hi! We are having problems with the script that converts the German test set. We couldn't reproduce our results and, after a lot of debugging, found that the output of the official conversion script surprise.py depends on the version of Pandas (even the number of examples is different!). To check, run this:

for version in 1.4.4 2.2.0; do pip install pandas==$version; python surprise.py --dwug_path dwug_de_sense/; mv axolotl.test.surprise.gold.tsv pandas-$version-axolotl.test.surprise.gold.tsv; done
wc *tsv

This may indicate that the script relies on bug-prone pandas calls. Of course, we can stick to the version from requirements.txt, but it would be better to avoid such constructs, or at least to check that they do not lead to other undesirable effects.
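To see where the two outputs diverge, not just that their line counts differ, something like the following could help. It is only a rough sketch: it assumes the generated files are plain tab-separated text with no quoted multi-line fields, and the names `read_rows` and `compare_tsv` are made up for this example (they are not part of the repository). It deliberately avoids pandas so that the comparison itself does not depend on the installed version.

```python
import csv
from collections import Counter


def read_rows(path):
    """Read a TSV file into a list of row tuples (header row included)."""
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE: treat quote characters as ordinary text (an assumption
        # about the file format, see the note above).
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [tuple(row) for row in reader]


def compare_tsv(path_a, path_b):
    """Return row counts and the rows that appear in only one of the files."""
    rows_a = Counter(read_rows(path_a))
    rows_b = Counter(read_rows(path_b))
    return {
        "rows_a": sum(rows_a.values()),
        "rows_b": sum(rows_b.values()),
        # Counter subtraction keeps duplicates, so repeated rows are reported too.
        "only_in_a": sorted((rows_a - rows_b).elements()),
        "only_in_b": sorted((rows_b - rows_a).elements()),
    }
```

Calling `compare_tsv("pandas-1.4.4-axolotl.test.surprise.gold.tsv", "pandas-2.2.0-axolotl.test.surprise.gold.tsv")` on the files produced by the loop above should list the examples that one Pandas version emits and the other does not, which may point to the call responsible.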

akutuzov commented 1 month ago

I have not looked deeply into the issue, but Pandas is known for data loading behavior that changes between versions (especially between major versions, like 1 and 2). That's why we specify the version we use in requirements.txt.
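If relying on the pin, one option would be for the script to fail loudly when the installed version differs from the one in requirements.txt. The sketch below is an illustration only, not something surprise.py currently does; `parse_pin` and `check_pin` are hypothetical helpers, and it assumes the pin is written as a simple `pandas==X.Y.Z` line.

```python
from importlib import metadata


def parse_pin(requirements_text, package):
    """Return the version pinned with '==' for `package`, or None if unpinned."""
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        if name.strip().lower() == package.lower():
            return version.strip()
    return None


def check_pin(requirements_text, package, installed=None):
    """Raise RuntimeError if the installed version differs from the pinned one."""
    pinned = parse_pin(requirements_text, package)
    if pinned is None:
        return  # nothing pinned, nothing to enforce
    if installed is None:
        # Look up the version of the package installed in this environment.
        installed = metadata.version(package)
    if installed != pinned:
        raise RuntimeError(
            f"{package} {installed} is installed, but requirements.txt pins {pinned}"
        )
```

A script could read requirements.txt at start-up and call `check_pin(text, "pandas")` before loading any data, so that a mismatched environment stops immediately instead of silently producing a different test set.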