PyThaiNLP / pythainlp

Thai Natural Language Processing in Python.
https://pythainlp.org/
Apache License 2.0

Retraining Machine Translation model for Thai-English and English-Thai #899

Open wannaphong opened 5 months ago

wannaphong commented 5 months ago

Hello! I am working on training new machine translation models for Thai-English and English-Thai. They may not be done by the v5.0.0 deadline, but I hope the new models will be included in the next release of PyThaiNLP (v5.0.1 or another).

The new models are not Generative Pre-trained Transformer (GPT) models, and they will work with Hugging Face transformers.
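
For illustration only: if the retrained models end up published as standard encoder-decoder (seq2seq) checkpoints, using one with Hugging Face transformers might look roughly like the sketch below. The checkpoint name is a placeholder, not a released model, and the `batched` and `translate` helpers are hypothetical names introduced here. Only `AutoTokenizer`, `AutoModelForSeq2SeqLM`, `generate`, and `batch_decode` are existing transformers calls.

```python
from typing import Iterator, List


def batched(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts` with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def translate(texts: List[str], model_name: str, batch_size: int = 8) -> List[str]:
    """Translate `texts` with a seq2seq checkpoint (needs transformers and PyTorch)."""
    # Imported here so the batching helper stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    results: List[str] = []
    for batch in batched(texts, batch_size):
        # Pad to the longest sentence in the batch and truncate over-long inputs.
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        output_ids = model.generate(**inputs, max_new_tokens=128)
        results.extend(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
    return results


# Example call; "your-org/th-en-model" is a placeholder, not a real checkpoint:
# print(translate(["สวัสดีครับ"], "your-org/th-en-model"))
```

Loading the tokenizer and model on every call keeps the sketch short; a real integration would likely load them once and reuse them.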

Dataset: https://github.com/vistec-AI/thai2nmt/releases/tag/scb-mt-en-th-2020%2Bmt-opus_v1.0
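
As a rough sketch of how sentence pairs from such a corpus could be read for training, the snippet below assumes the data is available as CSV with `en_text` and `th_text` columns. That layout is an assumption and should be checked against the files in the release. `read_pairs` is a hypothetical helper, and the inline sample stands in for a real file.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple


def read_pairs(
    handle: TextIO,
    src_col: str = "en_text",  # assumed column name, verify against the release
    tgt_col: str = "th_text",  # assumed column name, verify against the release
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs, skipping rows with an empty side."""
    for row in csv.DictReader(handle):
        src = (row.get(src_col) or "").strip()
        tgt = (row.get(tgt_col) or "").strip()
        if src and tgt:
            yield src, tgt


# Tiny in-memory sample in place of a real CSV file; the second row has no
# English side and is therefore dropped.
sample = io.StringIO("en_text,th_text\nHello,สวัสดี\n,ว่าง\n")
pairs = list(read_pairs(sample))
```

For the Thai-English direction, the same helper can be called with `src_col="th_text"` and `tgt_col="en_text"`. With a real file, open it with `encoding="utf-8"` and `newline=""` before passing the handle in.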

pavaris-pm commented 5 months ago

@wannaphong This is an excellent project to work on! Could I ask a quick question about the current model that we use for the translation task, so that I can do some further research on different methods?

wannaphong commented 5 months ago

It supports various domains, such as product reviews, laws, reports, news, spoken dialogues, and SMS messages.

For details, you can read the paper "scb-mt-en-th-2020: A Large English-Thai Parallel Corpus".

bact commented 4 months ago

May be relevant to #903.

wannaphong commented 4 months ago

I am having computing problems, so this issue will be left for the future.