Closed jxmorris12 closed 2 years ago
`datasets` via this GH Issue (which I saw you had some comments on from 18 months ago, haha). `datasets` is structured similarly to `quora`/`('glue','qqp')`. The `ParaphraseDatasetElmo` framework should just require me to add a few additional lines to allow for MRPC in retrofitting.

I think the things we added are sufficient at this point
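
For reference, here is a rough sketch of why MRPC should slot in next to `('glue','qqp')`: both GLUE subsets are sentence pairs with a binary `label`, and only the text column names differ. `paraphrase_pairs` and `load_glue_paraphrases` are hypothetical helpers made up for illustration, not part of `ParaphraseDatasetElmo`, and the assumption that label `1` marks a paraphrase follows the GLUE convention for these two subsets.

```python
from typing import Dict, Iterable, List, Tuple

# Text column names for each GLUE subset as exposed by Hugging Face `datasets`.
GLUE_TEXT_KEYS: Dict[str, Tuple[str, str]] = {
    "qqp": ("question1", "question2"),
    "mrpc": ("sentence1", "sentence2"),
}


def paraphrase_pairs(rows: Iterable[dict], subset: str) -> List[Tuple[str, str]]:
    """Hypothetical helper: keep only pairs labelled as paraphrases (label == 1)."""
    key_a, key_b = GLUE_TEXT_KEYS[subset]
    return [(row[key_a], row[key_b]) for row in rows if row["label"] == 1]


def load_glue_paraphrases(subset: str, split: str = "train") -> List[Tuple[str, str]]:
    """Hypothetical helper: download a GLUE subset and return its paraphrase pairs."""
    # Imported lazily so the filtering logic above can be used without `datasets` installed.
    from datasets import load_dataset

    return paraphrase_pairs(load_dataset("glue", subset, split=split), subset)
```

Under these assumptions, `load_glue_paraphrases("mrpc")` and `load_glue_paraphrases("qqp")` would return the same kind of pair list, so the retrofitting code downstream would not need to care which subset it came from.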
This is Table 2 from Retrofitting Contextualized Word Embeddings with Paraphrases.
We're missing a couple of the row dimensions (new settings for retrofitting):
And most of the column dimensions, various fine-tuning tasks:
We certainly don't need to support every scenario, but more would be nice. They claim retrofitting with Sampled Quora gives a ~4% boost in accuracy on MPQA, so that could be a good place to start.