huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
134.67k stars, 26.93k forks

Add MistralForQuestionAnswering #28908

Closed: nakranivaibhav closed this issue 3 weeks ago

nakranivaibhav commented 9 months ago

Feature request

Add a MistralForQuestionAnswering class to modeling_mistral.py so that Mistral models have AutoModelForQuestionAnswering support. This also requires adding Mistral models to MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES in the modeling_auto.py file.
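To illustrate the registration step, here is a minimal, dependency-free sketch of what the mapping change could look like. It is a stand-in, not a copy of modeling_auto.py: it assumes the mapping is an ordered collection of (model type, class name) pairs, the listed neighbours are only a small sample, the "mistral" entry is the one proposed in this request, and `qa_class_name_for` is a hypothetical helper added purely for demonstration.

```python
from collections import OrderedDict

# Illustrative stand-in for the mapping in modeling_auto.py (not the real file).
# Each entry pairs a model type with the name of its question-answering class.
MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES = OrderedDict(
    [
        ("bloom", "BloomForQuestionAnswering"),
        ("falcon", "FalconForQuestionAnswering"),
        ("gpt2", "GPT2ForQuestionAnswering"),
        ("llama", "LlamaForQuestionAnswering"),
        # Entry proposed by this feature request:
        ("mistral", "MistralForQuestionAnswering"),
    ]
)


def qa_class_name_for(model_type: str) -> str:
    """Hypothetical helper: look up the QA class name for a model type."""
    try:
        return MODEL_FOR_QUESTION_ANSWERING_MAPPING_NAMES[model_type]
    except KeyError:
        raise ValueError(
            f"No question-answering class registered for model type {model_type!r}"
        ) from None


print(qa_class_name_for("mistral"))
```

Without an entry like this, an auto class has nothing to resolve a Mistral checkpoint to, which is the gap this request describes.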

Motivation

1 - Evaluation benchmarks like SQuAD or FaQUAD are commonly used to evaluate language models.
2 - Many decoder-only transformers (BLOOM, Falcon, OpenAI GPT-2, GPT Neo, GPT NeoX, GPT-J, etc.) already support AutoModelForQuestionAnswering.
3 - Creating a fine-tuning/evaluation procedure with AutoModelForQuestionAnswering and evaluate.load('squad') is very simple, which makes these features helpful and desirable.
4 - Conversely, when AutoModelForQuestionAnswering cannot be used, as with the Llama-style models, everything becomes more difficult.

Hence, I would like to request the addition of a MistralForQuestionAnswering class to the modeling_mistral.py file, so that we could all easily run experiments with Mistral models on SQuAD-style Q&A benchmarks.
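As a rough sketch of the evaluation step such a class would enable: extractive QA heads produce one start logit and one end logit per token, and the predicted answer is the highest-scoring span. The example below is pure Python with made-up logits and whitespace tokens, so no model is loaded. `best_span` and `simple_exact_match` are hypothetical helpers, and the exact-match check is a simplified stand-in for what evaluate.load('squad') computes (the real metric applies additional text normalization). The prediction and reference dictionaries follow the layout that metric expects.

```python
from typing import Dict, List, Sequence, Tuple


def best_span(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    max_answer_len: int = 30,
) -> Tuple[int, int]:
    """Return (start, end) token indices maximizing start_logit + end_logit."""
    best, best_score = (0, 0), float("-inf")
    for start, start_score in enumerate(start_logits):
        # Only consider spans that end at or after the start token.
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_score + end_logits[end]
            if score > best_score:
                best, best_score = (start, end), score
    return best


def simple_exact_match(predictions: List[Dict], references: List[Dict]) -> float:
    """Simplified exact match (lowercase + strip only), as a percentage."""
    gold = {ref["id"]: ref["answers"]["text"] for ref in references}
    hits = 0
    for pred in predictions:
        answers = [a.strip().lower() for a in gold[pred["id"]]]
        hits += pred["prediction_text"].strip().lower() in answers
    return 100.0 * hits / len(predictions)


# Toy context and made-up logits standing in for a model's output.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
start_logits = [0.1, 0.2, 0.0, 2.0, 0.5, 0.3]
end_logits = [0.0, 0.1, 0.2, 0.1, 0.3, 2.5]

start, end = best_span(start_logits, end_logits)
prediction_text = " ".join(tokens[start : end + 1])

# Same dictionary layout as the inputs to the SQuAD metric in `evaluate`.
predictions = [{"id": "q1", "prediction_text": prediction_text}]
references = [
    {"id": "q1", "answers": {"text": ["on the mat"], "answer_start": [12]}}
]
exact_match = simple_exact_match(predictions, references)
print(prediction_text, exact_match)
```

With a real checkpoint, the logits would come from the model's forward pass and the token indices would be mapped back to character offsets in the context, but the span selection and scoring follow the same shape.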

Your contribution

I recently added the LlamaForQuestionAnswering class to the modeling_llama.py file. I can do the same for Mistral.

janpf commented 3 months ago

https://github.com/huggingface/transformers/pull/29168#issuecomment-1965857953

"either community pushes for this"

push ;) Would be very nice to have an easy way to create a MistralForQuestionAnswering for our benchmark paper.

sriram-1 commented 1 month ago

Thank you for the support, janpf. I will be glad if the PR is accepted.

LysandreJik commented 1 month ago

Thanks for your request! We'd happily welcome it to the library, feel free to open a PR adding it if you have the bandwidth to do so!

janpf commented 1 month ago

They already did, right? 🤔 See #29168. But the PR went stale.