huggingface / model-evaluator

Evaluate Transformers from the Hub 🔥
https://huggingface.co/spaces/autoevaluate/model-evaluator
Apache License 2.0

Extend zero-shot classification language task to support a generalized likelihood task #64

Open mathemakitten opened 2 years ago

mathemakitten commented 2 years ago

Add a follow-up version of the zero-shot classification task. Instead of comparing endings such as concat(PROMPT, ENDING1) vs. concat(PROMPT, ENDING2), it would be a generalized likelihood task in which you specify a prompt with an arbitrary number of placeholders, like so:

the woman walked into a {MASK} and started {MASK} afterward.

and there's a column which takes the substitutions in order, as a list of lists where each sublist contains one set of substitutions, e.g. [["cafe", "to order a drink"], ["forest", "frolicking"], ["tree", "hurting"]]

The task would then compare the likelihoods of:

the woman walked into a cafe and started to order a drink afterward.

the woman walked into a forest and started frolicking afterward.

...
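A minimal sketch of the proposed flow, to make the idea concrete. It is not the evaluator's implementation: `fill_template`, `rank_candidates`, and the `score_fn` parameter are hypothetical names introduced here for illustration. The sketch assumes each `{MASK}` is filled in order from one sublist, and that some scoring function returns a likelihood (for example a summed token log-probability from a language model) for a completed sentence.

```python
from typing import Callable, List, Sequence, Tuple

PLACEHOLDER = "{MASK}"  # placeholder token used in the issue's example


def fill_template(template: str, substitutions: Sequence[str]) -> str:
    """Replace each placeholder, left to right, with the matching substitution."""
    parts = template.split(PLACEHOLDER)
    n_slots = len(parts) - 1
    if n_slots != len(substitutions):
        raise ValueError(
            f"template has {n_slots} placeholders but got "
            f"{len(substitutions)} substitutions"
        )
    pieces = [parts[0]]
    for sub, tail in zip(substitutions, parts[1:]):
        pieces.append(sub)
        pieces.append(tail)
    return "".join(pieces)


def rank_candidates(
    template: str,
    candidates: Sequence[Sequence[str]],
    score_fn: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Fill the template once per candidate set and sort by score, highest first.

    score_fn is an assumption: any callable mapping a sentence to a
    likelihood-like number where larger means more likely.
    """
    scored = [
        (text, score_fn(text))
        for text in (fill_template(template, subs) for subs in candidates)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Keeping the scorer as a parameter separates the template expansion, which is the new part of this proposal, from the likelihood computation that the existing zero-shot task already needs. The highest-scoring filled sentence would be the predicted answer.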

This extension would enable many more tasks to be done zero-shot (e.g. any coreference task, or Winograd schema-type tasks).