pavaris-pm opened 6 months ago
Yes, I agree.
Typhoon-7b is a bilingual LLM, so I think if someone trains it to follow instructions in English, it should work with Thai too!
I would welcome it if the model doesn't use data from ChatGPT (for example, ShareGPT, or self-instruct datasets built from ChatGPT output).
It's already part of the goal! You can wait for my upcoming PR krub.
A couple of days ago, I saw that we also have chat/generate features that use WangChanGLM as the current LLM for text generation. Moreover, there is an upcoming LLM called Typhoon-7b from scb10x, which brings a wow factor to Thai LLMs with its evaluation results on Thai examination tasks. Given this new wave of Thai LLMs, should we add Typhoon-7b as an optional engine in PyThaiNLP? What do you think?
P.S. I'm not sure whether it will produce inappropriate words, since they state that it has no moderation mechanism. Maybe I could fine-tune it with some samples (e.g., 1k text samples) to adjust its mood and tone for more appropriate generation. Suggestions are welcome.
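To make the "optional engine" idea concrete, here is a minimal sketch of how Typhoon-7b could be offered alongside the existing WangChanGLM backend behind one `generate()` entry point. This is an assumption about the design, not PyThaiNLP's actual API; the function names (`resolve_engine`, `generate`) and the Hugging Face model ids in the mapping are placeholders that should be checked against the hub before use.

```python
# Hypothetical sketch of an engine registry for text generation.
# Engine names and model ids below are assumptions, not PyThaiNLP's real API.
_ENGINES = {
    "wangchanglm": "pythainlp/wangchanglm-7.5B-sft-en-sharded",  # assumed id
    "typhoon": "scb10x/typhoon-7b",  # assumed id, verify on the HF hub
}

def resolve_engine(name: str) -> str:
    """Map a user-facing engine name to a Hugging Face model id."""
    try:
        return _ENGINES[name.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown engine: {name!r}; choose from {sorted(_ENGINES)}"
        )

def generate(prompt: str, engine: str = "wangchanglm",
             max_new_tokens: int = 128) -> str:
    """Generate a completion with the chosen engine (sketch only)."""
    # Heavy imports kept lazy so the registry stays cheap to load.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = resolve_engine(engine)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

With a registry like this, adding another Thai LLM later is a one-line change to `_ENGINES`, and callers just pass `engine="typhoon"` without touching the generation code.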