zolekode opened this issue 4 years ago
Thank you @zolekode.
For training in other languages we'll need a pre-trained model and dataset in that language. I'm not sure if there are pre-trained seq-to-seq models available for other languages right now.
Without a pre-trained model the quality of the questions won't be as good as it is now.
@patil-suraj makes sense. Thanks a lot. I will look into that.
@patil-suraj We do have KoGPT2 (https://github.com/SKT-AI/KoGPT2) and KorQuAD (https://korquad.github.io/). In this case, could you help generate QAs for Korean?
Hi @hunkim, I do want to add support for GPT-2 in a week or two. Do you think you can upload this model to the HF model hub so that it can be easily integrated here?
As GPT-2 is not a seq-to-seq model, we'll need a different way to do QG with GPT-2. One way we can do answer-aware QG with GPT-2 is to prepare our input like this.

Say our context is `42 is the answer to life, the universe and everything`, the answer is `42`, and the target question is `What is the answer to life, universe and everything ?`. Then the input text becomes

`context: <hl> 42 <hl> is the answer to life, the universe and everything. question: What is the answer to life, universe and everything ?`

and we prepare the attention mask so that there is no attention from the `question: ...` part, so the model won't look into future tokens, and compute the loss only on the `question: ...` tokens. At inference time we feed only the context part and ask the model to generate the question.
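If it helps, here is a rough, untested sketch of what that preprocessing could look like with the `transformers` GPT-2 tokenizer. The `<hl>` highlight token and the `context:` / `question:` prefixes are just the convention described above, not anything built into the library, so treat the exact strings as placeholders:

```python
# Rough sketch (untested): answer-aware QG data prep for GPT-2 with transformers.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.add_special_tokens({"additional_special_tokens": ["<hl>"], "pad_token": "<pad>"})

model = GPT2LMHeadModel.from_pretrained("gpt2")
model.resize_token_embeddings(len(tokenizer))  # account for the added tokens

context = "42 is the answer to life, the universe and everything."
answer = "42"
question = "What is the answer to life, universe and everything ?"

# Highlight the answer span inside the context and append the target question.
highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
prompt = f"context: {highlighted} question: "
full_text = prompt + question + tokenizer.eos_token

enc = tokenizer(full_text, return_tensors="pt")
prompt_len = len(tokenizer(prompt).input_ids)

# The causal (left-to-right) attention mask already keeps the context from looking at
# future question tokens; the extra step is masking the labels so the loss is computed
# only on the question part.
labels = enc["input_ids"].clone()
labels[:, :prompt_len] = -100

loss = model(input_ids=enc["input_ids"], attention_mask=enc["attention_mask"], labels=labels).loss

# At inference time, feed only the prompt and let the model generate the question, e.g.:
# generated = model.generate(**tokenizer(prompt, return_tensors="pt"), max_new_tokens=32)
```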
Feel free to take a stab.
Thanks for the library. If I have a dataset (Wikipedia/news articles) in another language, how can I go about fine-tuning T5 on it? Thanks!
Hi @patil-suraj, thanks a lot for sharing this code. It's awesome. As mT5 was released a few weeks ago by Google and Hugging Face is working on adding it to their library, do you plan to upgrade your repo with multilingual support? Cheers, Philippe
@hunkim, @ghost, @zolekode, @epetros, @overnightJustifier
mT5 has just been added to transformers, so now you can use it to fine-tune on your own language, provided mT5 supports it. See https://discuss.huggingface.co/t/mt5-t5v1-1-fine-tuning-results/2098. I haven't tested this codebase against transformers master, so things might break.
It would be awesome if you guys could send a PR to add support for mT5 :)
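Roughly, swapping in mT5 would look something like the untested sketch below. `google/mt5-small` is the smallest checkpoint on the hub, and the `generate question:` prefix and `<hl>` tokens mirror the answer-aware T5 format used here, so double-check them against the actual preprocessing script:

```python
# Minimal sketch (untested): loading mT5 and running one seq-to-seq training step
# on a highlight-formatted example. Requires a transformers version that includes mT5.
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

input_text = "generate question: <hl> 42 <hl> is the answer to life, the universe and everything."
target_text = "What is the answer to life, the universe and everything ?"

inputs = tokenizer(input_text, return_tensors="pt")
labels = tokenizer(target_text, return_tensors="pt").input_ids

# One training step; in practice this would plug into the existing fine-tuning script,
# with the English strings replaced by examples in the target language.
loss = model(**inputs, labels=labels).loss
loss.backward()
```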
Hi @patil-suraj, I managed to fine-tune mT5 on Chinese based on your script and it gives valid output, but I'm still wondering where the performance ceiling is. I suppose mT5 won't surpass T5; my loss is around 0.2. What is the best performance you got with the fine-tuned T5 model?
@patil-suraj wonderful news!
@zhoudoufu Could you share the Chinese version? I will modify it for a Korean version. Thanks!
@zhoudoufu Can you tell me how to prepare the fine-tuning data? Thanks
@patil-suraj Is support for generating questions in Hindi available?
I have tested mT5; the loss is 0.4 and it is still training. Do you mind if we discuss it in Chinese?
Hi @zhoudoufu, were you able to completely fine-tune the mT5 model on Chinese? Can you please share your training pipeline so that I can try it out for other languages as well?
Hi @patil-suraj, your library is really amazing and I would like to contribute. Any tips on how to train in other languages, for example?