WeblateOrg / weblate

Web based localization tool with tight version control integration.
https://weblate.org/
GNU General Public License v3.0

Per translator review status #2699

Open comradekingu opened 5 years ago

comradekingu commented 5 years ago

The single most important thing for ensuring quality is knowing which strings have been seen and which haven't and, more importantly, by whom. I think this would improve quality quite a bit, because it makes it easier to focus without spending a lot of time keeping track of things, and it allows people who only want to review to be part of the process.

comradekingu commented 5 years ago

The alternative to per-translator review status is #1203.

A binary self-review flag falls short of a confidence level that one could attach to a suggestion, translation, or review according to one's own ability. Right now "needs review" is just someone's opinion, possibly that of even more unknown parties, and it can mean a bad source string, a badly translated string, a possible mistake of one's own or of others, or that research is needed. Per-reviewer review serves to avoid mistakes of one's own making, to the extent that one is aware of them.

A second pass is an improvement for the same reason the first pass is, and so are the third and the fourth, albeit with diminishing returns. Beyond establishing that nobody can find any flaws, additional passes increase the chance that someone disagrees, and I, for one, really want to know who disagrees and for what reason.

This presents an opportunity to challenge a lot of people at once. One of them will likely take the time to check the change and, as their certainty increases, revert it, and maybe even explain why. Or, at best, they will all learn at once.

Everyone who trusts themselves to translate or review should know that if their work does not add value, not only will someone see it, it will be fixed. For contributions that add value without being great, there is currently no real way to say that a string or project is about 85% right, but 85% right sure is a lot better than not translated. Maybe in the future string complexity could be rated too, but I digress.

Per-translator review status is where to start; it is the only way to improve quality short of miracles, time nobody has, or perfect auto-translation. On top of that, even knowing whose reviews are overturned, per account doing the overturning and/or on the whole, adds some level of confidence.

Messing up the single "needs work" flag or the single "reviewed" status is a disaster, especially if it isn't malicious, because right now someone has to notice it right away and remember why it was applied. I don't think there is an e-mail notification for checking or unchecking "needs work".

What first-party-only review does is put strings in the category of "no second-party review". That is a good place for anyone else to start reviewing, after the "no review" strings are done.
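As a rough sketch of that categorisation, assuming each string records its translator and the set of accounts that have reviewed it. `TranslatedString` and `review_category` are hypothetical names made up for illustration, not anything Weblate provides:

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TranslatedString:
    """Hypothetical record of one translated string; not a Weblate model."""
    source: str
    translator: str
    # Accounts that have reviewed this string (may include the translator).
    reviewers: set[str] = field(default_factory=set)


def review_category(string: TranslatedString) -> str:
    """Classify a string by who has reviewed it so far."""
    if not string.reviewers:
        return "no review"
    if string.reviewers == {string.translator}:
        return "no second-party review"
    return "second-party review"
```

A reviewer would then work through the "no review" strings first and the "no second-party review" strings after that.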

Reviewing one's own strings, with a system that remembers which strings have been reviewed, is also better, for the same reason that it makes it possible to know whom to trust. The premise of making mistakes and trying to avoid them is the same all the way down, so trust should start with no one, then at best one person, and (only) by extension many such persons and automated systems.

Relying primarily on roles to forgo this is a broken system if quality is the issue. Sometimes projects require review for everything, when the (only) practical alternative is superior.

comradekingu commented 4 years ago

The rationale is this: can a language administrator alone ensure that quality is good? No. Trying to make this situation "better" only masks the problem and ends up in some sort of Transifex mayhem. No amount of non-AI tooling will change this.

Fundamentally, the best measure of quality we have is one translator looking at one string to make sure it is good. Knowing which other translators to trust makes sense here, because one can use that to decide to review the others first. If the quality status weren't just a binary "needs work", and were also per user, one could start with unknown users who have made few changes and, among their strings, with those judged to need the most work.

The result is that, instead of testing how far the skill of any one translator scales in keeping up with all changes, quality compounds. That is because projects will be fully reviewed, beyond any one person's ability to cheat the system by making two accounts. It will finally be clear which strings are seen less often and, as a third and final measure, one can get more consistency by starting with projects that are not fully reviewed by any one translator.

- Untranslated strings, by age of string
- Own translated strings, by certainty of translation
- Other translators' translated strings, deemed to need more work
- Other translators' translated strings, starting from unknown translators having few edits
- Projects not fully translated
- Projects not fully reviewed
- Languages not fully reviewed
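The string-level items of the list above could be turned into a per-reviewer queue roughly like this. It is only a sketch under assumptions: `QueueItem`, `review_order`, the 0.0 to 1.0 certainty scale, the integer "needs work" score, and putting the oldest untranslated strings first are all invented for illustration and do not exist in Weblate. The project-level and language-level items are left out.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueItem:
    """Hypothetical view of one string as seen by one reviewer."""
    source: str
    translator: Optional[str]  # None means the string is untranslated
    age_days: int              # age of the string
    certainty: float = 1.0     # translator's own confidence, 0.0 to 1.0
    needs_work: int = 0        # how much work others judged it to need


def review_order(items, me, edit_counts):
    """Order strings for reviewer `me`.

    `edit_counts` maps an account name to its number of edits; accounts
    with few edits count as less known and are reviewed earlier.
    """
    def key(item):
        if item.translator is None:
            return (0, -item.age_days)    # assumption: oldest first
        if item.translator == me:
            return (1, item.certainty)    # least certain first
        if item.needs_work > 0:
            return (2, -item.needs_work)  # most work needed first
        # Remaining strings by others: fewest edits (least known) first.
        return (3, edit_counts.get(item.translator, 0))

    return sorted(items, key=key)
```

Because both the certainty and the "needs work" score are stored per account rather than as one shared flag, each reviewer gets their own ordering.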