allenai / unifiedqa

UnifiedQA: Crossing Format Boundaries With a Single QA System
https://arxiv.org/abs/2005.00700
Apache License 2.0

Is there a way to get SQuAD V2 style "No Answer"? #45

tingofurro closed this issue 2 years ago

tingofurro commented 2 years ago

As the title says: is the model likely to generate "no-ans"-type text on extractive tasks, or were the unanswerable samples removed during training?

Thank you, Philippe

danyaljj commented 2 years ago

Yes, we encoded "no-ans" as <No Answer> in the training data. See: https://github.com/allenai/unifiedqa/blob/b5439e7411b8bc8a264a9eca55c7508822cd6ae5/encode_datasets.py#L371 In the model output, it usually appears as ⁇ no answer>, for some reason related to the T5 decoder. The demo (https://unifiedqa.apps.allenai.org/) contains an example:

[Image: Screen Shot 2022-05-06 at 3 16 45 PM]

Unfortunately, we never got a chance to evaluate its ability to distinguish knowns from unknowns.
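
Since the marker can come back as either <No Answer> or ⁇ no answer>, one way to detect it downstream is to normalize the decoded string before comparing. The following is only a minimal sketch, not part of the UnifiedQA code: the helper name `is_no_answer` is hypothetical, and treating only an exact match on the normalized text as "no answer" is an assumption you may want to relax.

```python
import re


def is_no_answer(decoded: str) -> bool:
    """Return True if a decoded model output looks like the no-answer marker.

    Hypothetical helper: it assumes the marker is either "<No Answer>" or
    the mangled form "⁇ no answer>" mentioned in this thread.
    """
    # Lowercase, then replace everything except letters, digits, and spaces
    # with a space, so "<", ">", and "⁇" are all dropped.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", decoded.lower())
    # Collapse runs of whitespace and trim both ends.
    normalized = " ".join(cleaned.split())
    # Assumption: only an exact match counts; a longer answer that merely
    # contains the words "no answer" is treated as a real answer.
    return normalized == "no answer"
```

You would call this on the string returned by the tokenizer's decode step and treat a True result as an abstention, for example when scoring against SQuAD 2.0-style unanswerable questions.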