LSX-UniWue / SuperGLEBer

German Language Understanding Evaluation Benchmark @NAACL24
https://supergleber.professor-x.de/

Paws-x task shows lots of warnings #2

Closed chschroeder closed 3 months ago

chschroeder commented 3 months ago

Hi,

when executing a run on the Paws-x task, a lot of warnings are shown:

2024-07-13 19:26:03,190 Warning: An empty Sentence was created! Are there empty strings in your dataset?
2024-07-13 19:26:03,190 Warning: An empty Sentence was created! Are there empty strings in your dataset?
2024-07-13 19:26:03,248 Warning: An empty Sentence was created! Are there empty strings in your dataset?
[...]

I used the following command:

python src/train.py +model=mymodel +train_args=a100 +task=pawsx

Am I doing something wrong?

janpf commented 3 months ago

Hi,

No, you are doing everything correctly. We kept the original dataset as-is, and oddly it contains a lot of empty sentence pairs, e.g.: 32035 0.

But we remove those after reading anyway, via corpus.filter_empty_sentences(), so the warnings can be ignored.
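
To illustrate what dropping empty pairs after reading amounts to, here is a minimal, library-free sketch. It is not the code used in SuperGLEBer or in corpus.filter_empty_sentences(); the row layout (id, two sentences, label), the names PairRow and filter_empty_pairs, and the sample rows are all made up for illustration.

```python
from typing import List, NamedTuple


class PairRow(NamedTuple):
    """One sentence-pair row (assumed layout: id, two sentences, label)."""
    row_id: int
    sentence1: str
    sentence2: str
    label: int


def filter_empty_pairs(rows: List[PairRow]) -> List[PairRow]:
    """Keep only rows where both sentences contain non-whitespace text."""
    return [r for r in rows if r.sentence1.strip() and r.sentence2.strip()]


# Made-up sample rows, not taken from the real dataset.
rows = [
    PairRow(1, "Ein Satz.", "Noch ein Satz.", 1),
    PairRow(2, "", "", 0),         # fully empty pair
    PairRow(3, "Text", "   ", 0),  # whitespace-only is treated as empty here
]

kept = filter_empty_pairs(rows)
print([r.row_id for r in kept])  # prints [1]
```

Treating whitespace-only strings as empty is a choice made for this sketch; the actual filter may draw that line differently.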

Best, jan

chschroeder commented 3 months ago

Great, then I am relieved. Thanks for the quick response!