Closed twotwoiscute closed 1 year ago
obviously not
Thanks for the reply. Can you tell me the pipeline if I would like to train the model on a Chinese dataset (suppose the data is already labeled)?
I think just modifying the tokenizer and the instructions should be OK, but my concern is that I can't reproduce the results. Did you run into this issue?
I think the reason you can't reproduce the results is that the pretrained model you used was trained on English only; there's another pretrained model trained on multiple languages.
But I didn't notice that in the paper 😂. I'm also interested in this model, would you mind sharing your WeChat? I look forward to further communication with you. ❤️ my wechat: SupritYoung email: suprit@foxmail.com
Sure, emm... just so you know, I used to study computer vision and just started working on NLP.
tk-instruct is finetuned on instructions based on the T5 model, which is English-only. However, you can use the one @SupritYoung posted. There are two variants of mTk, 3B and 11B, but both would require significant compute. You can also try finetuning from mT5-base instead, though that would require doing the instruction tuning yourself on top of it.
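If you go the instruction-tuning route with Chinese data, the main extra work is converting each labeled example into a "Definition / Input / Output" style prompt like the ones tk-instruct was trained on. A minimal sketch of that conversion is below; the exact template wording, the task definition, and the example sentences are my own illustrative assumptions, so check them against the actual Natural Instructions format before training:

```python
# Sketch: turning a labeled Chinese example into a seq2seq training pair
# in a tk-instruct-style prompt format. The template text here is an
# approximation of the Natural Instructions layout, not the exact one.

def build_prompt(definition: str, text: str) -> str:
    """Assemble the encoder-side source string from a task definition
    and one input instance."""
    return (
        f"Definition: {definition}\n\n"
        f"Now complete the following example -\n"
        f"Input: {text}\n"
        f"Output:"
    )

# Hypothetical Chinese sentiment task (all strings are made up):
definition = "判断下面的评论是正面还是负面。"  # "decide if the review is positive or negative"
example = "这家餐厅的菜非常好吃。"              # input instance
target = "正面"                                  # gold label, used as the decoder target

prompt = build_prompt(definition, example)
print(prompt)
```

The (prompt, target) pairs can then be fed to any seq2seq finetuning loop (e.g. mT5-base with a standard cross-entropy objective), with the tokenizer left unchanged since mT5's vocabulary already covers Chinese.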
Thanks for the great work! I wonder, did you use any non-English datasets during training?