QA: getRandomContext() only seems to be returning from around 5 possible contexts #885

maxbartolo commented 2 years ago

Not sure what's happening exactly, but there only seem to be around 5 possible contexts for QA (round 3): switching to a new context or refreshing the page always seems to select one of the same 5.

This definitely wasn't the case in early Dec '21 (I've checked). There should be exactly 10,109 passages in the DB.

TristanThrush commented 2 years ago

Hey Max, it looks like the default Dynabench option is to return the least-used contexts in a random order. This helps all contexts converge on the same number of examples. It wouldn't be apparent at first, because right after you upload contexts they have all been used 0 times, so a query can return any of them. Luckily, this should be an easy option to change in your MTurk create interface. If there are issues, let me know.

The API call that the create interface uses is here: https://github.com/facebookresearch/dynabench/blob/main/frontends/web/src/common/ApiService.js#L296 You can see that the default option is "min".

Here is where that API call interfaces with the backend. There are other options besides "min" that you can choose from: https://github.com/facebookresearch/dynabench/blob/main/api/controllers/contexts.py
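
To make the "least used, in a random order" behaviour described above concrete, here is a minimal sketch of why it can look like only 5 contexts exist. It is not the actual Dynabench code: the function names, the toy usage counts, and the "uniform" label for plain random sampling are all assumptions for illustration, and it assumes a page refresh does not change the usage counts.

```python
import random

def least_used_pool(usage_counts):
    """Ids of the contexts that share the lowest usage count."""
    lowest = min(usage_counts.values())
    return [cid for cid, used in usage_counts.items() if used == lowest]

def draw_context(usage_counts, method, rng):
    """Pick one context id.

    "min" mimics the least-used selection described above.
    "uniform" is a hypothetical label for plain random sampling.
    """
    if method == "min":
        return rng.choice(least_used_pool(usage_counts))
    if method == "uniform":
        return rng.choice(list(usage_counts))
    raise ValueError(f"unknown method: {method}")

# Toy state (assumed): 10,109 contexts, all used once except five unused ones.
usage = {cid: 1 for cid in range(10109)}
for cid in range(5):
    usage[cid] = 0

rng = random.Random(0)
# Simulate 50 page refreshes; counts are assumed to stay fixed on a refresh.
seen_min = {draw_context(usage, "min", rng) for _ in range(50)}
seen_uniform = {draw_context(usage, "uniform", rng) for _ in range(50)}
print(len(seen_min), len(seen_uniform))
```

With "min", every draw comes from the five contexts that are still at the lowest count, which would match what the create interface shows once most passages have at least one example.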

maxbartolo commented 2 years ago

Amazing, thanks!
