deepset-ai / FARM

:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
https://farm.deepset.ai
Apache License 2.0
1.73k stars 247 forks

Added assertion for questions with token length > self.max_query_length. #783

ftesser closed 2 years ago

ftesser commented 3 years ago

In Squad, I noticed that no log message is emitted and no error is raised when a question is too long. In my opinion this can cause wrong predictions, so it is better to assert the condition explicitly.

ftesser commented 3 years ago

Adding the assertion causes two tests to fail (test_question_answering.test_training, test_question_answering.test_save_load) because the question used in those tests is too long:

https://github.com/deepset-ai/FARM/pull/783/checks?check_run_id=2694268160#step:6:104

AssertionError: Question <In what country is Normandy located?> has a token length of 7 greater than self.max_query_length(6).
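For illustration, the kind of check that produces this message can be sketched as follows. This is a minimal sketch, not the code from the PR: `simple_tokenize` and `QueryLengthChecker` are hypothetical names, and the regex tokenizer is only a stand-in for the model tokenizer that happens to split the test question into 7 tokens.

```python
import re


def simple_tokenize(text):
    # Hypothetical stand-in tokenizer (assumption): each word and each
    # punctuation mark becomes its own token.
    return re.findall(r"\w+|[^\w\s]", text)


class QueryLengthChecker:
    # Hypothetical class (assumption), present only so that
    # self.max_query_length exists as in the message above.
    def __init__(self, max_query_length):
        self.max_query_length = max_query_length

    def check_question(self, question):
        # Count the question tokens and fail loudly if there are too many,
        # instead of silently continuing with an over-long question.
        n_tokens = len(simple_tokenize(question))
        assert n_tokens <= self.max_query_length, (
            f"Question <{question}> has a token length of {n_tokens} "
            f"greater than self.max_query_length({self.max_query_length})."
        )
        return n_tokens
```

With `max_query_length=6`, calling `check_question("In what country is Normandy located?")` raises an `AssertionError` in this sketch, while `max_query_length=7` lets the question through. Note that plain `assert` statements are skipped when Python runs with `-O`, so raising an explicit exception may be preferable if the check must always run.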

ftesser commented 3 years ago

Commit 5451d55 sets max_query_length=7 to successfully pass the tests.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 21 days if no further activity occurs.