zolekode opened this issue 4 years ago
Thank you @zolekode.
For training in other languages we'll need a pre-trained model and dataset in that language. I'm not sure if there are pre-trained seq-to-seq models available for other languages right now.
Without a pre-trained model the quality of the questions won't be as good as it is now.
@patil-suraj makes sense. Thanks a lot. I will look into that.
@patil-suraj We do have KoGPT2 (https://github.com/SKT-AI/KoGPT2) and KorQuAD (https://korquad.github.io/). In this case, could you help generate QAs for Korean?
Hi @hunkim, I do want to add support for GPT-2 in a week or two. Do you think you can upload this model to the HF model hub so that it can be easily integrated here?
As GPT-2 is not a seq-to-seq model, we'll need a different way to do QG with GPT-2. One way we can do answer-aware QG with GPT-2 is to prepare our input like this.

Say our context is `42 is the answer to life, the universe and everything`, the answer is `42`, and the target question is `What is the answer to life, universe and everything ?`. Then the input text becomes

`context: <hl> 42 <hl> is the answer to life, the universe and everything. question: What is the answer to life, universe and everything ?`

and we prepare the attention mask so that there is no attention from the `question: ...` part, so the model won't look into future tokens, and compute the loss only on the `question: ...` tokens. At inference time we feed only the context part and ask the model to generate the question.
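If it helps, here is a rough, untested sketch of what that preprocessing could look like with the `transformers` GPT-2 tokenizer. The `<hl>` highlight token and the `context:` / `question:` prefixes are just the convention described above, not anything built into the library, so treat the exact strings as placeholders:

```python
# Rough sketch (untested): answer-aware QG data prep for GPT-2 with transformers.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.add_special_tokens({"additional_special_tokens": ["<hl>"], "pad_token": "<pad>"})

model = GPT2LMHeadModel.from_pretrained("gpt2")
model.resize_token_embeddings(len(tokenizer))  # account for the added tokens

context = "42 is the answer to life, the universe and everything."
answer = "42"
question = "What is the answer to life, universe and everything ?"

# Highlight the answer span inside the context and append the target question.
highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
prompt = f"context: {highlighted} question: "
full_text = prompt + question + tokenizer.eos_token

enc = tokenizer(full_text, return_tensors="pt")
prompt_len = len(tokenizer(prompt).input_ids)

# The causal (left-to-right) attention mask already keeps the context from looking at
# future question tokens; the extra step is masking the labels so the loss is computed
# only on the question part.
labels = enc["input_ids"].clone()
labels[:, :prompt_len] = -100

loss = model(input_ids=enc["input_ids"], attention_mask=enc["attention_mask"], labels=labels).loss

# At inference time, feed only the prompt and let the model generate the question, e.g.:
# generated = model.generate(**tokenizer(prompt, return_tensors="pt"), max_new_tokens=32)
```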
Feel free to take a stab.
Thanks for the library. If I have a dataset (Wikipedia/news articles) in another language, how can I go about fine-tuning T5 on it? Thanks!
Hi @patil-suraj, thanks a lot for sharing this code. It's awesome. As mT5 was released a few weeks ago by Google and Hugging Face is working on adding it to their library, do you plan to upgrade your repo with multilingual support? Cheers, Philippe
@hunkim, @ghost, @zolekode, @epetros, @overnightJustifier
mT5 has just been added to transformers, so now you can use it to fine-tune on your own language, provided mT5 supports it. See https://discuss.huggingface.co/t/mt5-t5v1-1-fine-tuning-results/2098. I haven't tested this codebase against transformers master, so things might break.
It would be awesome if you guys could send a PR to add support for mT5 :)
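Roughly, swapping in mT5 would look something like the untested sketch below. `google/mt5-small` is the smallest checkpoint on the hub, and the `generate question:` prefix and `<hl>` tokens mirror the answer-aware T5 format used here, so double-check them against the actual preprocessing script:

```python
# Minimal sketch (untested): loading mT5 and running one seq-to-seq training step
# on a highlight-formatted example. Requires a transformers version that includes mT5.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

input_text = "generate question: <hl> 42 <hl> is the answer to life, the universe and everything."
target_text = "What is the answer to life, the universe and everything ?"

inputs = tokenizer(input_text, return_tensors="pt")
labels = tokenizer(target_text, return_tensors="pt").input_ids

# One training step; in practice this would plug into the existing fine-tuning script,
# with the English strings replaced by examples in the target language.
loss = model(**inputs, labels=labels).loss
loss.backward()
```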
Hi @patil-suraj, I managed to fine-tune mT5 on Chinese based on your script and it gives valid output, but I'm still wondering where the performance ceiling is. I suppose mT5 won't surpass T5; my loss is around 0.2. What is the best performance you got with the fine-tuned T5 model?
@patil-suraj wonderful news!
@zhoudoufu Could you share the Chinese version? I will modify it for a Korean version. Thanks!
@zhoudoufu Can you tell me how to prepare the fine-tuning data? Thanks
@patil-suraj Is support for generating questions in Hindi available?
I have tested mT5; the loss is 0.4 and it is still training. Do you mind if we discuss it in Chinese?
Hi @zhoudoufu, were you able to completely fine-tune the mT5 model on Chinese? Can you please share your training pipeline so that I can try it out for other languages as well?
Hi @patil-suraj, your library is really amazing and I would like to contribute. Any tips on how to train in other languages, for example?