Open benmccann opened 5 months ago
Handling the whole document as a single string would need to redo the whole translation when a single sentence changes. Smaller units also make it easier to progress with the translation than translating a full document at once.
> Handling the whole document as a single string would need to redo the whole translation when a single sentence changes.
You wouldn't need to redo the entire document, only update the part that changed. This is why I suggested integrating a diff tool to show what changed.
> Smaller units also make it easier to progress with the translation than translating a full document at once.
True. It probably depends on how large the markdown files are. It might be reasonable enough if each one has a page or two of content. It would be more difficult if you've got a document that's ten, fifty, or a hundred pages long, though that's not the common case for a website.
Another idea might be to split the document on a heading level configurable by the user, e.g. split on `##`. This would break up the document, but in a simpler way that doesn't require a parser and that could more easily be reconstituted.
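To make the heading-split idea concrete, here is a minimal sketch in Python. It is not how Weblate works today; the function names `split_on_heading` and `join_chunks` and the `level` parameter are hypothetical, chosen only for illustration. It assumes ATX-style headings (`## Title`) and does not skip headings that appear inside fenced code blocks, which a real implementation would have to handle.

```python
import re


def split_on_heading(markdown: str, level: int = 2) -> list[str]:
    """Split a document into chunks that each start at a heading of exactly `level`.

    Any text before the first matching heading becomes its own leading chunk.
    Hypothetical helper for illustration; not part of Weblate.
    """
    # Match exactly `level` hash characters at the start of a line, then a space.
    pattern = re.compile(rf"^#{{{level}}} ", re.MULTILINE)
    starts = [m.start() for m in pattern.finditer(markdown)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(markdown)]
    return [markdown[a:b] for a, b in zip(bounds, bounds[1:])]


def join_chunks(chunks: list[str]) -> str:
    """Reconstitute the document; chunks keep their original whitespace."""
    return "".join(chunks)
```

Because each chunk keeps its own trailing whitespace, joining the chunks gives back the original text byte for byte, which is what makes this split easy to reconstitute without a parser.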
This issue has been put aside. It is currently unclear if it will ever be implemented as it seems to cover too narrow of a use case or doesn't seem to fit into Weblate.
Please try to clarify the use case or consider proposing something more generic to make it useful to more users.
Most users are happy with how Markdown is currently handled, so that is not going to change.
I'm not opposed to having another option for Markdown translating in Weblate, but I don't intend to push that myself.
Describe the problem
Most translation tools have historically been geared towards applications such as inserting strings into a programming language. However, markdown is a fairly different use case because the idea behind it is that it's a human-readable document format.
While I'm not overly familiar with the internals, I believe that the current markdown implementation tries to split the document up into smaller po strings, translate them, and then output a markdown document from that. This process can be error-prone, makes it harder to see context, and is harder to implement than handling the entire document as a single string. Most importantly, it has a huge caveat highlighted in the docs:
Describe the solution you would like
Let translators handle the document in its entirety. Store versions of the document and provide a diff tool so that when a document is updated, translators can more easily see which portions changed.
This could either replace the current implementation, or users could have the option of choosing between the two.
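As a rough illustration of the diff idea, here is a minimal sketch using Python's standard `difflib` module. The function name `source_changes` and the file labels are hypothetical. It assumes the previous and current versions of the source document are available as plain strings, and says nothing about how Weblate would store them.

```python
import difflib


def source_changes(old: str, new: str, context: int = 1) -> str:
    """Return a unified diff between two versions of a source document.

    Hypothetical helper for illustration; an empty string means no changes.
    """
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="source (previous)",
        tofile="source (current)",
        n=context,  # number of unchanged context lines shown around each change
    )
    return "".join(diff)
```

A translator would then only need to revisit the lines marked `-` and `+` instead of rereading the whole document.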
Describe alternatives you have considered
Possibly improve the current implementation. However, the current approach seems to have fundamental limitations that would make some of the difficulties impossible to solve.
Additional context
See also issues like https://github.com/WeblateOrg/weblate/issues/10008 and https://github.com/WeblateOrg/weblate/issues/9786