andrewyng / translation-agent


Related study on agentic translation being used to improve traditional MT systems #9

Open · enismaxim1 opened this issue 2 weeks ago

enismaxim1 commented 2 weeks ago

If agentic translations can generate better results than traditional architectures (such as an end-to-end transformer that inputs a text and directly outputs a translation) -- which are often faster/cheaper to run than our approach here -- this also provides a mechanism to automatically generate training data (parallel text corpora) that can be used to further train and improve traditional algorithms. (See also this article in The Batch on using LLMs to generate training data: https://www.deeplearning.ai/the-batch/building-models-that-learn-from-themselves/.)
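
For concreteness, the data-generation step might look roughly like the sketch below. It assumes the `translate()` entry point shown in this repo's README; the monolingual input file, output path, and language pair are illustrative placeholders, not part of the repo.

```python
# Sketch: build a synthetic parallel corpus from agentic translations.
# Assumes the `translate()` entry point from this repo's README; file names
# and the English->Spanish pair are placeholders.
import json

import translation_agent as ta

SOURCE_LANG, TARGET_LANG, COUNTRY = "English", "Spanish", "Mexico"

with open("monolingual_english.txt", encoding="utf-8") as src, \
        open("parallel_corpus.jsonl", "w", encoding="utf-8") as out:
    for line in src:
        source_text = line.strip()
        if not source_text:
            continue
        # The agentic workflow (initial translation, reflection, improvement)
        # produces the target side of each synthetic parallel pair.
        target_text = ta.translate(SOURCE_LANG, TARGET_LANG, source_text, COUNTRY)
        out.write(json.dumps({"source": source_text, "target": target_text},
                             ensure_ascii=False) + "\n")
```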

For those interested in this idea, a collaborator and I wrote a paper in April called "From LLM to NMT" (https://arxiv.org/pdf/2404.13813) demonstrating the viability of this approach. It turns out Claude 3 Opus already delivers state-of-the-art machine translation across various languages. We then used the LLM to generate training data for Yoruba-English translation and trained a state-of-the-art translation system.
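
As a rough illustration of the second step (distilling the LLM-generated pairs into a conventional NMT model), a minimal Hugging Face fine-tuning sketch could look like the following. The base model, hyperparameters, and file names are assumptions for illustration, not the exact setup from the paper.

```python
# Sketch: fine-tune a conventional seq2seq NMT model on the synthetic
# parallel data produced above. Model choice and hyperparameters are
# illustrative only.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

MODEL_NAME = "Helsinki-NLP/opus-mt-en-es"  # any encoder-decoder MT model works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Each JSONL record has "source" and "target" fields, as written by the
# generation sketch above.
dataset = load_dataset("json", data_files="parallel_corpus.jsonl", split="train")

def preprocess(batch):
    # Tokenize source texts and target texts for seq2seq training.
    return tokenizer(
        batch["source"],
        text_target=batch["target"],
        max_length=256,
        truncation=True,
    )

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="nmt-distilled",
        per_device_train_batch_size=16,
        num_train_epochs=3,
        learning_rate=3e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

For a pair like Yoruba-English, one would swap in a base model and synthetic data covering that pair; the overall distillation loop stays the same.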

siddhantx0 commented 2 weeks ago

bet Sir.


sharpHL commented 1 week ago


good job!