facebookresearch / dynabench

Dynamic Adversarial Benchmarking platform
MIT License

Validation interface not using all examples #545

Closed HannahKirk closed 3 years ago

HannahKirk commented 3 years ago

For the hate speech task on Dynabench, our annotators are now validating hateful content, and the ‘validation’ screen has frozen. The API returns the following error: (error: "No examples available (26)", status_code: 500)

image

The JSON from round 5 shows there are 1022 examples. 42 of these have validations, for example:

"validations":[["correct","user","9e10e4e3ac76057e76273f1dd068e7e97fd000ec",{}],["correct","user","0546e88e27cb797be4433f3b588cfc6c72e535a0",{"example_explanation":"Target- gayman","model_explanation":"Type- threatening language"}]]

but 980 of these have no validations ("validations":[]), even though Dynabench reports that it has run out of examples to validate.
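The counts above can be reproduced with a short script. This is only a minimal sketch: it assumes the round export is a JSON list of example objects, each carrying a "validations" list as in the snippet above. The function name `summarize_validations` and the sample data are made up for illustration.

```python
import json
from collections import Counter


def summarize_validations(examples):
    """Count how many examples have at least one validation and how many have none.

    Assumes each example is a dict with a "validations" list (assumption based
    on the snippet quoted in this issue, not on the Dynabench schema).
    """
    per_example = Counter(len(ex.get("validations") or []) for ex in examples)
    total = sum(per_example.values())
    unvalidated = per_example.get(0, 0)
    return {
        "total": total,
        "validated": total - unvalidated,
        "unvalidated": unvalidated,
    }


# Hypothetical two-example sample standing in for the real round 5 export.
sample = json.loads(
    '[{"validations": []},'
    ' {"validations": [["correct", "user", "abc", {}]]}]'
)
summary = summarize_validations(sample)
```

On the real export, `summary["validated"]` and `summary["unvalidated"]` should match the 42 and 980 reported above if the assumption about the export layout holds.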

The task owner console has the following settings:

image (3)

I have investigated some examples which were validated and some which had zero validations; there seem to be no clear differences between them.

image

Here is the metadata JSON field:

Validatable (i.e. .loc[7] in the above examples): {"model":"https://fhcxpbltv0.execute-api.us-west-1.amazonaws.com/predict?model=ts1621286698-DeBERTa","hate_type":"Derogation"}

Non-Validatable (i.e. .loc[105] in the above examples): {"model":"https://fhcxpbltv0.execute-api.us-west-1.amazonaws.com/predict?model=ts1621286698-DeBERTa","hate_type":"Threatening language"}

Max Bartolo was able to retrieve the examples with the API filter mode, i.e. https://api.dynabench.org/examples/5/5/filtered/0/5/0/4, but not with the get random example mode, i.e. https://api.dynabench.org/examples/5/5, which makes me suspect that some check in the example model's getRandom() might not be working as expected.
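A rough way to compare the two modes side by side is sketched below. It uses the two URLs exactly as quoted above. Whether these endpoints need authentication headers is not covered here, so treat that as an open assumption. The helper names `http_status` and `compare_modes` are hypothetical.

```python
import urllib.error
import urllib.request

# URLs copied verbatim from this issue.
RANDOM_URL = "https://api.dynabench.org/examples/5/5"
FILTERED_URL = "https://api.dynabench.org/examples/5/5/filtered/0/5/0/4"


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still available.
        return err.code


def compare_modes(get_status=http_status):
    """Query both modes and report the status code returned by each."""
    return {
        "random": get_status(RANDOM_URL),
        "filtered": get_status(FILTERED_URL),
    }
```

Calling `compare_modes()` performs the two requests. Passing a custom `get_status` callable allows the comparison to be exercised without network access.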

When in task owner mode on the validation interface, examples are shown.

image

I'd appreciate help resolving this issue, as it is causing a bottleneck for our annotators!

Thank you!

TristanThrush commented 3 years ago

#546