Hi! We are experiencing problems with the script that converts the German test set. We couldn't reproduce our results, and after a lot of debugging we found that the output of the official conversion script surprise.py depends on the version of pandas (even the number of examples is different!). To verify, run this:
for version in 1.4.4 2.2.0; do pip install pandas==$version; python surprise.py --dwug_path dwug_de_sense/; mv axolotl.test.surprise.gold.tsv pandas-$version-axolotl.test.surprise.gold.tsv; done
wc *tsv
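To see which rows differ rather than only the line counts, something like the following could help. It is a minimal sketch, not part of the repository: `read_rows` and `diff_rows` are hypothetical helper names, and it assumes the outputs are plain tab-separated files without quoting. It deliberately avoids pandas so that the comparison itself does not depend on the installed version.

```python
import csv
from collections import Counter


def read_rows(path):
    """Read a TSV file into a list of row tuples using only the standard library."""
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE treats quote characters as ordinary text (assumption: no quoted fields).
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [tuple(row) for row in reader]


def diff_rows(path_a, path_b):
    """Return (rows only in path_a, rows only in path_b) as multisets, ignoring row order."""
    rows_a = Counter(read_rows(path_a))
    rows_b = Counter(read_rows(path_b))
    return rows_a - rows_b, rows_b - rows_a


# Example call with the file names produced by the loop above:
# only_old, only_new = diff_rows(
#     "pandas-1.4.4-axolotl.test.surprise.gold.tsv",
#     "pandas-2.2.0-axolotl.test.surprise.gold.tsv",
# )
# print(sum(only_old.values()), "rows only in 1.4.4 output")
# print(sum(only_new.values()), "rows only in 2.2.0 output")
```

Because the comparison uses multisets, a pure reordering of rows is not reported as a difference; only missing, extra, or changed rows show up.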
This may indicate that the script uses some bug-prone pandas calls. Of course, we can stick to the version from requirements.txt, but it would be better to avoid such constructions or to check that they do not lead to other undesirable effects.
I have not looked deeply into the issue, but pandas is known for changing its data-loading behavior between versions (especially between major versions, such as 1 and 2).
That's why we specify the version we use in requirements.txt.
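One way to make a version mismatch fail loudly instead of silently changing the output is a small guard at the start of the script. This is only a sketch under assumptions: `matches_pin` and `require_pandas` are hypothetical names that do not exist in the repository, and it assumes requirements.txt pins one exact version, which the caller passes in as a string.

```python
from importlib.metadata import version


def matches_pin(installed, pinned):
    """Return True if the installed version string equals the pinned one exactly."""
    return installed.strip() == pinned.strip()


def require_pandas(pinned):
    """Raise an error if the installed pandas version differs from the pinned one."""
    installed = version("pandas")  # reads the installed package metadata
    if not matches_pin(installed, pinned):
        raise RuntimeError(
            f"pandas {installed} is installed, but this script expects {pinned}; "
            "the output may differ"
        )


# Hypothetical usage at the top of the conversion script, with the
# version string copied from requirements.txt:
# require_pandas(pinned_version)
```

An exact-match check is stricter than necessary if several versions turn out to give identical output, but it keeps the guard simple.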