andrewyng / translation-agent

MIT License

Related study on agentic translation being used to improve traditional MT systems #9

Open enismaxim1 opened 5 months ago

enismaxim1 commented 5 months ago

If agentic translations can generate better results than traditional architectures (such as an end-to-end transformer that takes a text as input and directly outputs a translation), which are often faster and cheaper to run than our approach here, this also provides a mechanism to automatically generate training data (parallel text corpora) that can be used to further train and improve traditional algorithms. (See also this article in The Batch, https://www.deeplearning.ai/the-batch/building-models-that-learn-from-themselves/, on using LLMs to generate training data.)

For those interested in this idea, a collaborator and I wrote a paper in April called "From LLM to NMT" (https://arxiv.org/pdf/2404.13813) demonstrating the viability of this approach. It turns out Claude 3 Opus is already a state-of-the-art machine translation system for a number of languages. We then used the LLM to generate training data for Yoruba-English translation and create a state-of-the-art translation system.
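To make the data-generation idea concrete, here is a minimal sketch of turning monolingual sentences into a synthetic parallel corpus. It is an illustration under stated assumptions, not the pipeline from the paper: `translate` is a hypothetical callable standing in for whatever slow, high-quality translator is used (an agentic workflow or a plain LLM call), and the length-ratio filter and its threshold are an assumed sanity check, not something taken from the paper or this repository.

```python
import json
from typing import Callable, Dict, Iterable, Iterator


def build_parallel_corpus(
    source_sentences: Iterable[str],
    translate: Callable[[str], str],
    max_length_ratio: float = 3.0,
) -> Iterator[Dict[str, str]]:
    """Yield {"source", "target"} pairs produced by `translate`.

    `translate` is a placeholder for any high-quality translator;
    it is a hypothetical argument, not a real library function.
    """
    seen = set()
    for sentence in source_sentences:
        sentence = sentence.strip()
        if not sentence or sentence in seen:
            continue  # skip blank lines and exact duplicates
        seen.add(sentence)

        target = translate(sentence).strip()
        if not target:
            continue  # the translator returned nothing usable

        # Assumed sanity filter: drop pairs whose character lengths
        # differ by more than `max_length_ratio`.
        longer = max(len(sentence), len(target))
        shorter = min(len(sentence), len(target))
        if longer / shorter > max_length_ratio:
            continue

        yield {"source": sentence, "target": target}


def to_jsonl(pairs: Iterable[Dict[str, str]]) -> str:
    """Serialize pairs as JSON Lines, keeping non-ASCII text readable."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


if __name__ == "__main__":
    # Toy stand-in translator so the sketch runs without any API access.
    def toy_translate(text: str) -> str:
        return text.upper()

    sentences = ["hello world", "", "hello world", "good morning"]
    print(to_jsonl(build_parallel_corpus(sentences, toy_translate)))
```

The resulting JSON Lines output could then serve as training data for a smaller, faster model. Real pipelines would likely need stronger quality filtering than the single length check shown here.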

siddhantx0 commented 5 months ago

bet Sir.

On Tue, Jun 11, 2024 at 3:47 PM Maxim Enis @.***> wrote:

If agentic translations can generate better results than traditional architectures (such as an end-to-end transformer that inputs a text and directly outputs a translation) -- which are often faster/cheaper to run than our approach here -- this also provides a mechanism to automatically generate training data (parallel text corpora) that can be used to further train and improve traditional algorithms. (See also this article in The Batch https://www.deeplearning.ai/the-batch/building-models-that-learn-from-themselves/ on using LLMs to generate training data.)

For those interested in this idea, a collaborator and I wrote a paper https://arxiv.org/pdf/2404.13813 in April called "From LLM to NMT" demonstrating the viability of this approach. It turns out Claude 3 Opus is already a state-of-the-art LLM agent in machine translation in various languages. We then use the LLM to generate train-data for Yoruba-English translation and create a state-of-the-art translation system.


sharpHL commented 5 months ago

good job!