andrewyng / translation-agent

MIT License

Translation Agent: Agentic translation using reflection workflow

This is a Python demonstration of a reflection agentic workflow for machine translation. The main steps are:

  1. Prompt an LLM to translate a text from source_language to target_language;
  2. Have the LLM reflect on the translation to come up with constructive suggestions for improving it;
  3. Use the suggestions to improve the translation.
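
The three steps above can be sketched as a small pipeline. This is a minimal illustration, not the repository's actual implementation: `llm` is a hypothetical callable that takes a prompt string and returns the model's reply, and the function names and prompt wording are assumptions made for the example.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt to the model's text reply.
LLM = Callable[[str], str]


def initial_translation(llm: LLM, source_lang: str, target_lang: str, text: str) -> str:
    """Step 1: ask the model for a first-pass translation."""
    prompt = (
        f"Translate the following text from {source_lang} to {target_lang}. "
        f"Return only the translation.\n\n{text}"
    )
    return llm(prompt)


def reflect(llm: LLM, source_lang: str, target_lang: str, text: str, draft: str) -> str:
    """Step 2: ask the model to critique the draft and suggest improvements."""
    prompt = (
        f"Here is a {source_lang} text and its {target_lang} translation.\n\n"
        f"Source:\n{text}\n\nTranslation:\n{draft}\n\n"
        "List constructive suggestions for improving the translation."
    )
    return llm(prompt)


def improve(llm: LLM, source_lang: str, target_lang: str, text: str,
            draft: str, suggestions: str) -> str:
    """Step 3: ask the model to rewrite the draft using the suggestions."""
    prompt = (
        f"Improve this {source_lang} to {target_lang} translation using the suggestions.\n\n"
        f"Source:\n{text}\n\nTranslation:\n{draft}\n\n"
        f"Suggestions:\n{suggestions}\n\nReturn only the improved translation."
    )
    return llm(prompt)


def translate_with_reflection(llm: LLM, source_lang: str, target_lang: str, text: str) -> str:
    """Run the translate, reflect, improve sequence and return the final translation."""
    draft = initial_translation(llm, source_lang, target_lang, text)
    suggestions = reflect(llm, source_lang, target_lang, text, draft)
    return improve(llm, source_lang, target_lang, text, draft, suggestions)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider and makes it easy to try with a stub function.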

Customizability

By using an LLM as the heart of the translation engine, this system is highly steerable. For example, by changing the prompts, it is easier with this workflow than with a traditional machine translation (MT) system to:

This is not mature software. It is the result of Andrew playing around with translations on weekends over the past few months, with collaborators (Joaquin Dominguez, Nedelina Teneva, John Santerre) helping refactor the code.

According to our evaluations using BLEU score on traditional translation datasets, this workflow is sometimes competitive with, but also sometimes worse than, leading commercial offerings. However, we’ve also occasionally gotten fantastic results (superior to commercial offerings) with this approach. We think this is just a starting point for agentic translation, and a promising direction with significant headroom for further improvement. That is why we’re releasing this demonstration: to encourage more discussion, experimentation, research, and open-source contributions.
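
For readers unfamiliar with BLEU, the sketch below computes a simplified sentence-level score: the geometric mean of clipped n-gram precisions, multiplied by a brevity penalty. It assumes a single reference, whitespace tokenization, and no smoothing, and the function names are made up for the example. It is not the evaluation code behind the results mentioned above; real evaluations normally rely on an established BLEU implementation.

```python
import math
from collections import Counter


def ngram_counts(tokens: list[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU against one reference (no smoothing)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        # Without smoothing, any zero precision makes the whole score zero.
        # This also applies to candidates shorter than max_n tokens.
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates that are not longer than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    return brevity_penalty * math.exp(sum(log_precisions) / max_n)
```

A perfect match scores 1.0 and a candidate sharing no words with the reference scores 0.0; partial matches fall in between.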

Traditional architectures (such as an end-to-end transformer that inputs a text and directly outputs a translation) are often faster and cheaper to run than our approach here. If agentic translations can generate better results than these architectures, this also provides a mechanism to automatically generate training data (parallel text corpora) that can be used to further train and improve traditional algorithms. (See also this article in The Batch on using LLMs to generate training data.)
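
One way to picture this data-generation loop is to run each source sentence through a translation function and save the resulting pairs as a parallel corpus. The sketch below assumes a hypothetical `translate_fn` callable (for instance, a wrapper around an agentic workflow) and an arbitrary JSON Lines layout; neither the function names nor the file format come from this project.

```python
import json
from typing import Callable, Iterable


def build_parallel_corpus(
    source_texts: Iterable[str],
    translate_fn: Callable[[str], str],
    output_path: str,
) -> int:
    """Write one {"source": ..., "target": ...} JSON object per line.

    Returns the number of sentence pairs written.
    """
    count = 0
    with open(output_path, "w", encoding="utf-8") as f:
        for text in source_texts:
            pair = {"source": text, "target": translate_fn(text)}
            # ensure_ascii=False keeps non-ASCII characters readable in the file.
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")
            count += 1
    return count
```

In practice the generated pairs would likely need quality filtering before being used as training data; that step is left out here.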

Comments and suggestions for how to improve this are very welcome!

Getting Started

To get started with translation-agent, follow these steps:

Installation:

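pip install poetry

Usage:

import translation_agent as ta

# Text to translate; replace this placeholder with your own input.
source_text = "Hello, world!"
source_lang, target_lang, country = "English", "Spanish", "Mexico"
translation = ta.translate(source_lang, target_lang, source_text, country)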

See examples/example_script.py for an example script to try out.

License

Translation Agent is released under the MIT License. You are free to use, modify, and distribute the code for both commercial and non-commercial purposes.

Ideas for extensions

Here are ideas we haven’t had time to experiment with but that we hope the open-source community will:

Related work

A few academic research groups are also starting to look at LLM-based and agentic translation. We think it’s early days for this field!