bigscience-workshop / xmtf

Crosslingual Generalization through Multitask Finetuning
https://arxiv.org/abs/2211.01786
Apache License 2.0

Controlled generation #15

Open sh0tcall3r opened 1 year ago

sh0tcall3r commented 1 year ago

Hi! Thanks for the amazing job!

I have a couple of quick questions. I'm trying to use mT0-xxl-mt for QA. When I provide a context and ask a question whose subject is not present in the context, the model still produces something from the context, even if it's totally wrong. The ideal scenario here would be for the model to output something like "I cannot answer this question given this context."

  1. Is that possible without heavy training on additional data?
  2. A bias question: if I do train the model on additional data, would it still give good answers when the subject of the question is in the context?

Muennighoff commented 1 year ago

Interesting, indeed all of our training data has answers afaik. I don't think we trained on any example where the label is e.g. "Cannot answer given the context". Hence by default the model always tries to answer.

  1. Possibly, e.g. if you tell the model in the prompt "Please answer with 'Cannot answer given the context' if the answer is not included" (see the prompting sketch below).
  2. If you fine-tune it on a few samples with the "Cannot answer given the context" label and also include some samples where it can answer, I think it would work well and still give good answers (see the data-mixing sketch below).
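A minimal sketch of the prompt-only approach (1), assuming the standard `transformers` seq2seq API for `bigscience/mt0-xxl-mt`; the exact abstention wording and the example context/question are illustrative and will likely need tuning:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "bigscience/mt0-xxl-mt"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# ~13B parameters; device_map="auto" requires `accelerate` to shard/offload.
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint, torch_dtype="auto", device_map="auto")

context = "The Eiffel Tower is located in Paris and was completed in 1889."
question = "Who painted the Mona Lisa?"  # not answerable from this context

# Instruct the model explicitly how to abstain when the answer is missing.
prompt = (
    f"Context: {context}\n"
    f"Question: {question}\n"
    "Answer the question using only the context. "
    "If the answer is not included in the context, answer exactly with "
    "'Cannot answer given the context'."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```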
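And a rough sketch of how the fine-tuning data for (2) could be put together; the field names, phrasing, and mixing ratio are assumptions for illustration, not something prescribed by the paper:

```python
# Hypothetical fine-tuning examples: mix abstention targets with ordinary
# answerable QA so the model keeps giving good answers when it can.
abstention_label = "Cannot answer given the context"

answerable = [
    {
        "input": "Context: The Nile flows through Egypt.\nQuestion: Which country does the Nile flow through?",
        "target": "Egypt",
    },
]
unanswerable = [
    {
        "input": "Context: The Nile flows through Egypt.\nQuestion: Who wrote Hamlet?",
        "target": abstention_label,
    },
]

# Keep both kinds in the training mix; fine-tuning only on abstention
# examples would likely teach the model to refuse everything.
train_data = answerable + unanswerable
```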