deepset-ai / FARM

Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
https://farm.deepset.ai
Apache License 2.0

GermanBert for Question Answering #475

Closed luke4u closed 4 years ago

luke4u commented 4 years ago

Question: Is there a pre-trained QA model for German? I am not sure whether a machine-translated SQuAD exists. If not, could you share guidelines for fine-tuning GermanBert on TyDi, XQuAD, or MLQA?

Additional context: I know there are a few multilingual BERT models, but I believe a BERT trained on German would perform better. There seems to be no pre-trained German QA BERT model available, so it would be really good to have one.

Timoeller commented 4 years ago

Hey @luke4u, we have experimented with auto-translating QA datasets to German, without much success.

But we have trained an XLM-R large model on SQuAD v2 and evaluated it on the German parts of XQuAD and MLQA. I believe TyDi doesn't include German, right? We uploaded the model to the HF model hub and added a model card (PR). Maybe you can use this model for your use case? We evaluated it on some other German domain QA data and found it to be OK (but not great).
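
For readers who want to try such a model, here is a minimal sketch of querying a SQuAD-style QA model from the HF model hub with the `transformers` question-answering pipeline. The thread does not name the model, so the ID `deepset/xlm-roberta-large-squad2` is an assumption, as are the helper names `load_qa_pipeline`, `ask`, and `demo` and the `min_score` threshold; treat it as an illustration rather than the setup used above.

```python
from typing import Any, Callable, Dict, Optional

# Assumption: the thread does not state the model ID; replace it with the
# model you actually want to use.
MODEL_NAME = "deepset/xlm-roberta-large-squad2"


def load_qa_pipeline(model_name: str = MODEL_NAME):
    """Build an extractive QA pipeline (downloads the model on first use)."""
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import pipeline

    return pipeline("question-answering", model=model_name, tokenizer=model_name)


def ask(
    qa: Callable[..., Dict[str, Any]],
    question: str,
    context: str,
    min_score: float = 0.0,
) -> Optional[str]:
    """Return the predicted answer span, or None if its score is below min_score."""
    result = qa(question=question, context=context)
    if result["score"] < min_score:
        return None
    return result["answer"]


def demo() -> None:
    """Ask one German question against a short German context."""
    qa = load_qa_pipeline()
    context = "Berlin ist die Hauptstadt von Deutschland und liegt an der Spree."
    print(ask(qa, "An welchem Fluss liegt Berlin?", context))
```

Calling `demo()` downloads the model and prints the predicted span. The `min_score` cut-off is only a crude, hypothetical way to drop low-confidence answers; a suitable value would have to be tuned on your own German data.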

We are also working on a proper German QA dataset, though this will take some more time. Please stay tuned.

luke4u commented 4 years ago

Thank you, @Timoeller. That was very helpful for my questions.

Sadly, I can confirm there is no German in TyDi.

I have checked the model card on HF. Great work! I will try the model and, in the meantime, look forward to your progress.

Thanks again.