hendricius / the-sourdough-framework

Open source book dedicated to helping you to make the best possible sourdough bread at home.
https://breadco.de/book
Creative Commons Attribution Share Alike 4.0 International
2.57k stars, 128 forks

[WIP] Translation with OpenAI #215

Closed pacoccino closed 6 months ago

pacoccino commented 10 months ago

This is a work in progress, not ready to merge. There are still many things to think about.

I started writing a small Python script that automatically translates the book with GPT-4. This is not a perfect solution, and it doesn't translate images and charts, but it can be a good starting point for having many translations ready, open to manual review later on. GPT is instructed to translate LaTeX files, so it only translates what needs to be translated and does a good job of keeping the syntax intact and the files compiling.
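The script itself is not reproduced in this thread, so the following is only a minimal sketch of how such a translation pass could be structured. The prompt wording, the function names `split_into_chunks` and `translate_tex`, and the `translate` callable (a stand-in for the actual GPT-4 request) are all assumptions made for illustration, not the PR's code.

```python
import re
from typing import Callable, List

# Hypothetical prompt; the wording actually used in the PR is not shown.
SYSTEM_PROMPT = (
    "Translate the following LaTeX source into {language}. "
    "Translate only human-readable text. Keep every command, environment, "
    "label, and reference unchanged so the file still compiles."
)


def split_into_chunks(tex: str, max_chars: int = 4000) -> List[str]:
    """Pack blank-line-separated paragraphs into chunks of roughly max_chars.

    Paragraphs are never split, so a chunk can exceed max_chars when a
    single paragraph is longer than the limit.
    """
    chunks: List[str] = []
    current = ""
    for para in re.split(r"\n\s*\n", tex):
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # The chunk is full: close it and start a new one.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def translate_tex(
    tex: str,
    language: str,
    translate: Callable[[str, str], str],
    max_chars: int = 4000,
) -> str:
    """Translate chunk by chunk; translate(prompt, chunk) wraps the model call."""
    prompt = SYSTEM_PROMPT.format(language=language)
    parts = [translate(prompt, chunk) for chunk in split_into_chunks(tex, max_chars)]
    return "\n\n".join(parts)
```

Passing the model call in as a function keeps the chunking logic testable without credentials. Splitting on blank lines is a simplification: it can cut through environments that contain blank lines, such as verbatim blocks.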

I don't know how to handle updates to the original book yet though.

So here is the code for an MVP script that generates translations from .tex files. The workflow is very basic and needs to be improved:

I have already uploaded roughly half of the book in French at https://github.com/pacoccino/the-sourdough-framework/pull/2. Translating the full book into one language would take roughly 3 hours and cost about $20.

What needs to be decided now:

hendricius commented 10 months ago

That's pretty impressive work! @cedounet - this is also a good reason to put the images containing text into the repo without text, so that we can translate them much more easily.

@pacoccino thanks a lot! This would be amazing if we could somehow automatically create the translated version on every release 😎

pacoccino commented 10 months ago

> @pacoccino thanks a lot! This would be amazing if we could somehow automatically create the translated version on every release 😎

Running the translation on every release may not be a good idea, as it is very slow and expensive. Maybe GPT could analyse the updates from git and modify the translations accordingly, but that looks like too much work. I was thinking of this more as a starting point for creating different languages, with the translations then updated and corrected manually.

But the automation script should release the book in every language already processed.
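One cheaper alternative to retranslating everything on each release is to retranslate only the source files that changed. The sketch below is an assumption rather than anything in the PR: instead of analysing git history, it compares SHA-256 content hashes against a JSON manifest written after the previous translation run. The manifest format and the function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_manifest(manifest_path: Path) -> Dict[str, str]:
    """Load the {relative path: digest} map, or an empty map on first run."""
    if manifest_path.exists():
        return json.loads(manifest_path.read_text(encoding="utf-8"))
    return {}


def find_stale_sources(source_dir: Path, manifest_path: Path) -> List[Path]:
    """List .tex files that are new or changed since they were last recorded."""
    manifest = load_manifest(manifest_path)
    stale = []
    for tex in sorted(source_dir.rglob("*.tex")):
        key = tex.relative_to(source_dir).as_posix()
        if manifest.get(key) != file_digest(tex):
            stale.append(tex)
    return stale


def record_translated(
    source_dir: Path, manifest_path: Path, translated: List[Path]
) -> None:
    """Store the current digest of every file that was just translated."""
    manifest = load_manifest(manifest_path)
    for tex in translated:
        manifest[tex.relative_to(source_dir).as_posix()] = file_digest(tex)
    manifest_path.write_text(
        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
    )
```

This works at file granularity, so a one-word fix in a chapter would still retranslate the whole file and overwrite any manual corrections made to it. A separate manifest per target language would be needed if languages are translated at different times.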

cedounet commented 10 months ago

Hello.

This is great. A couple of comments (without looking too much at the code):

1) Make it clear that this is automatically translated. I had a quick look at your French version, which, like all things ChatGPT, goes from impressive to hallucinated in some sentences :)

2) Use babel so that LaTeX does the right thing.

3) Make it fit inside the Makefile. It is probably not something we want to run every time, but it should be feasible from the Makefile if needed (pending credentials and all that).

great job though.

c
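Points 1 and 2 above could be handled by the translation script itself as a post-processing step. The following is a minimal sketch under stated assumptions: the notice wording and the function name `localize_preamble` are hypothetical, the source is assumed to keep `\documentclass` on a single line, and it is assumed not to load babel already. The only LaTeX feature relied on is loading babel with a language option, for example `\usepackage[french]{babel}`.

```python
# Hypothetical wording for the machine-translation notice.
NOTICE = "This translation was generated automatically and may contain errors."


def localize_preamble(tex: str, babel_language: str, notice: str = NOTICE) -> str:
    """Load babel after the documentclass line and add a notice after begin{document}.

    Assumes the documentclass command fits on one line and that the
    source does not load babel itself.
    """
    babel_line = f"\\usepackage[{babel_language}]{{babel}}"
    out = []
    seen_class = False
    seen_begin = False
    for line in tex.splitlines():
        out.append(line)
        stripped = line.lstrip()
        if not seen_class and stripped.startswith("\\documentclass"):
            # Load babel right after the class so later packages see the language.
            out.append(babel_line)
            seen_class = True
        elif not seen_begin and stripped.startswith("\\begin{document}"):
            # Make the machine-translated status visible at the top of the document.
            out.append(f"\\emph{{{notice}}}")
            seen_begin = True
    if not (seen_class and seen_begin):
        raise ValueError("expected \\documentclass and \\begin{document}")
    return "\n".join(out) + "\n"
```

For a French build this would be called as `localize_preamble(source, "french")`. Where the notice should really appear (title page, footer, or cover) is a layout decision this sketch does not settle.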

hendricius commented 6 months ago

I will be closing this PR for now. Keep us posted if you want to work on some updates in the near future 😎