Open leolivier opened 3 months ago
I may have a way to partly address this: sending more translation entries per request. That would also avoid re-sending the same system prompt with every call. I'm planning to work on this in the next version.
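To make the batching idea concrete, here is a minimal Python sketch (the project itself is TypeScript, so this is only an illustration). The function names, the batch size of 20, and the JSON-array reply format are all assumptions, not the tool's actual behaviour. The point is that one request carries many entries, so the system prompt is sent once per batch rather than once per entry.

```python
import json

def batch_entries(msgids, batch_size=20):
    """Split a list of msgids into batches of at most batch_size items."""
    return [msgids[i:i + batch_size] for i in range(0, len(msgids), batch_size)]

def build_batch_prompt(batch, target_language):
    """Build one user message that carries every entry of the batch.

    The JSON-array convention is an assumption made for this sketch.
    """
    payload = json.dumps(batch, ensure_ascii=False)
    return (
        f"Translate each string in this JSON array into {target_language}. "
        "Reply with a JSON array of the same length, in the same order.\n"
        f"{payload}"
    )

def parse_batch_reply(reply, expected_count):
    """Parse the model reply and check it lines up with the batch."""
    translations = json.loads(reply)
    if not isinstance(translations, list) or len(translations) != expected_count:
        raise ValueError("reply does not match the batch size")
    return translations
```

The length check matters: if the model drops or merges an entry, the translations would otherwise be written to the wrong msgids.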
Hi @ryanhex53 This is not an issue, just me sharing my thoughts on efficiently translating po files with LLMs...
I stumbled upon your project (and others) while looking for a way to translate po files using an existing translation as context, to help disambiguate the entries. For example, if you provide a po file with
No tool, no matter how intelligent, is ever going to know if it's talking about a financial company or the banks of a river. But if you provide a French translation:
then any translation-capable LLM should be able to translate that po file entry into Spanish, German, or any other language it knows.
Have you ever thought about this?
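A minimal sketch of what such a context-aware prompt could look like, in Python. The function name, the prompt wording, and the sample entry (`Bank` with the French `Banque`) are hypothetical and chosen only to illustrate the idea; a real prompt would likely need more instructions.

```python
def build_disambiguation_prompt(msgid, reference_text, reference_language, target_language):
    """Build a prompt that shows the model an existing translation as context.

    The wording is an assumption for this sketch, not a tested prompt.
    """
    return (
        f"Translate the English text into {target_language}.\n"
        f"Use the existing {reference_language} translation to resolve any ambiguity.\n"
        f"English: {msgid}\n"
        f"{reference_language}: {reference_text}\n"
        f"{target_language}:"
    )

# Hypothetical entry: the French text tells the model which sense of "Bank" is meant.
prompt = build_disambiguation_prompt("Bank", "Banque", "French", "Spanish")
print(prompt)
```

With `Rive` as the reference instead of `Banque`, the same msgid would point the model at the river sense.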
Actually, with the help of Claude and ChatGPT (free versions ;) I tried it myself and ended up with this pretty simple piece of Python code (I know yours is TypeScript, it's just for the example) that
(You'll need to `pip install` several libraries before running this code; I've done several tests, so I'm not sure they're all still needed.)
Unfortunately, this does not work very well. I think I first have to deal with the placeholders included in the po files (e.g. {some_variable} or %s or %(some_variables)s ...) and probably provide a much better prompt that explains to the model how to use the context for translation...
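One possible way to handle the placeholders, sketched in Python: swap them for opaque tokens before sending the text to the model, then restore them afterwards. The `<phN>` token format and the function names are assumptions, and the regex is deliberately simplified (it covers `{name}`, `%s`-style, and `%(name)s`-style placeholders, but not every printf format).

```python
import re

# Simplified pattern: %(name)s style, single-letter % formats, and {name} style.
PLACEHOLDER_RE = re.compile(r"%\([^)]+\)[a-zA-Z]|%[a-zA-Z]|\{[^{}]*\}")

def mask_placeholders(text):
    """Replace each placeholder with a numbered token and return both."""
    found = []

    def repl(match):
        found.append(match.group(0))
        return f"<ph{len(found) - 1}>"

    return PLACEHOLDER_RE.sub(repl, text), found

def unmask_placeholders(text, found):
    """Put the original placeholders back in place of the numbered tokens."""
    for i, placeholder in enumerate(found):
        text = text.replace(f"<ph{i}>", placeholder)
    return text

masked, found = mask_placeholders("Hello {name}, you have %(count)s new %s")
print(masked)   # Hello <ph0>, you have <ph1> new <ph2>
print(unmask_placeholders(masked, found))
```

After translation, it is worth checking that every token in `found` still appears exactly once in the model output before unmasking, since models sometimes drop or duplicate them.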
So, I had a look on GitHub, where there are a lot of "gpt for po" projects like yours, but I didn't find any that use an existing translation as disambiguation context, although I think this is absolutely key for po files, where strings are very short (sometimes just one word) and thus don't give the translator enough context to work properly...
Maybe the start of a PhD thesis :D