Closed by burner1024 3 years ago
AFAIK this will break later with gettext, e.g. msgmerge will remove contexts not present in the POT. Also, how would the application choose which string to use when translators can add their own contexts? I think this is really best handled in the code, with gender-specific messages there...
That's why I mentioned that that will require fixing po/pot simultaneously.
In this particular use case, the translation will result in two separate packages (and it's not PO files, it's getting converted to and from PO just for the purposes of translation).
Handling it in the code requires going through all the entries and deciding which ones should be contexted and which shouldn't, beforehand. Which programmers can't do. And having translators mess with code is exactly what Weblate is built to avoid, isn't it?
This really sounds like quite a specific use case. I'm not sure if having this in Weblate is useful, as in most cases people don't want to edit context.
When doing this myself (with the intention of providing 2 separate packages), I'd probably just define separate locales for each (e.g. es@male and es@female).
Doesn't seem so specific to me. Any translation from a gender-neutral to a non gender-neutral language must face it. Maybe people don't want to edit context because they never had that option? Anyway, I see that this idea doesn't seem particularly attractive to you, so I'm closing the issue. Thank you for your answer.
The problem is that if you use gettext, it makes no sense for a translator to edit context: it's used as an identifier in the code, so if you add a context, you have to change the code as well to actually use it.
If gettext is used directly, yes. But if it's just an intermediate format, that is not the case. And seeing the number of "po2xx" converters, I believe it's not the case often enough.
BTW: What file format do you actually use and convert to po?
In this particular use case, it's MSG.
So you still would end up generating separate translation files for each gender?
Correct.
Coming to think of it again, it's probably not a very good idea to allow translators to add arbitrary contexts. Might lead to confusion.
Instead, if a set (or sets) of possible contexts were predefined by the project admin, and translators could then either "translate pristine" or "translate with context", that would allow for flexible translation while keeping it formalized.
That's what I meant in https://github.com/WeblateOrg/weblate/issues/1507#issuecomment-306259507.
Having this configurable is probably an option, but it's still quite a big feature which IMHO will not find many users...
Yes, I agree that it may not find many users. If Weblate had a pluggable architecture, it might have been easier to add... When I have time, I'll try to see for myself whether that's possible, but I'm not sure if my skills are good enough yet.
For what it's worth ICU MessageFormat supports this nicely. Here is an example message:
{gender, select,
male {He}
female {She}
other {They}
} will respond shortly.
I agree with @nijel that it makes little sense to support context creation in gettext formats; all gettext tools will just discard any newly added contexts.
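For illustration, the select logic of the ICU message above can be sketched in plain Python. This is not an ICU library and not something Weblate provides; the names `select` and `respond` are made up for this sketch, and it only mimics a single select argument with an `other` fallback.

```python
def select(value, branches):
    """Pick the branch matching value, falling back to 'other' as ICU select does."""
    return branches.get(value, branches["other"])


def respond(gender):
    # Mirrors: {gender, select, male {He} female {She} other {They}} will respond shortly.
    pronoun = select(gender, {"male": "He", "female": "She", "other": "They"})
    return f"{pronoun} will respond shortly."


message_for_female = respond("female")
message_for_unknown = respond("unspecified")
```

The point of the format is that the translator, not the programmer, decides which branches a given language needs.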
But Weblate doesn't support that fancy format, does it? So it's irrelevant. I'm not sure what you mean by discarding context. I don't think gettext will discard it.
@nijel I disagree that such a function won't find many use cases. More importantly, the lack of such gender-specific distinction is a blocker for translation of more than 1000 Infinity Engine mods via Weblate. But it depends on how you define 'many'.
What sort of support could convince you to implement such a feature in the next 3 months?
After reflecting on this for some time, I've come to the conclusion that 1) The original description is overly generic. Really, it's about allowing gender-specific translations, not arbitrary contexts. 2) There's no good way to implement this as long as Weblate only supports translate-toolkit formats, none of which allow gender distinction. The reason is, there's no place to store female-specific strings: POT is generated automatically, and POs derive from POT.
The best thing Weblate could do (maybe with the new plugin system?) is to make it easy to hook into the PO file save function and the web UI translate form. Then the plugin, or hook, could save the extra strings in an extra file (say, french.po_female). Then strings from that file would be automatically picked up by the po2xx converter. I'll try to implement something like this, and report the results.
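A rough sketch of the overlay idea described above, in Python. It assumes the regular PO file and the extra female-specific file have already been loaded into plain msgid-to-msgstr dicts; the function name `build_female_catalog` and the sample strings are hypothetical, and a real hook would read and write the actual files.

```python
def build_female_catalog(base, female_overrides):
    """Return the catalog for the female package.

    base: msgid -> msgstr from the regular PO file.
    female_overrides: msgid -> msgstr from the extra file.
    Empty overrides and overrides for unknown msgids are ignored.
    """
    merged = dict(base)  # copy, so the regular catalog stays untouched
    for msgid, msgstr in female_overrides.items():
        if msgstr and msgid in base:
            merged[msgid] = msgstr
    return merged


# Illustrative data only.
base = {"Welcome, friend": "Bienvenue, ami", "Goodbye": "Au revoir"}
female = {"Welcome, friend": "Bienvenue, amie", "Goodbye": ""}
female_catalog = build_female_catalog(base, female)
```

Strings without a female-specific translation fall through to the regular one, so only the strings that actually differ need to be stored in the extra file.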
Sorry for not following up for some time. Yes, complex formats like ICU MessageFormat or L20n do not have any special support in Weblate. On the other hand, it has no problem showing such strings for translators to edit, so that should work without problems (but without at least a syntax checker, translators will produce many non-working expressions).
Generally I don't like these, as they turn translations into a programming language. This is something that translators usually do not handle well. The ICU one seems at least a bit limited in its expressions, but L20n has IMHO gone too far (see their complex example).
I'm planning to have several kinds of addons possible for the next release, so if you're able to agree on a way of handling this, it might be a good way to implement it.
@ALIENQuake How do you currently store these translations?
Also, if somebody wants to financially motivate me (or somebody else) to solve this issue, you can use Bountysource. It used to be integrated in GitHub, but it has been broken for several months (see https://github.com/bountysource/core/issues/1096).
Here's a test implementation.
For the reference, Infinity Engine mods translations are stored in TRA files (example). In our system, they are converted to and from PO by hooks, using helper tools.
New strings in bilingual formats can be added starting with the 4.5 release: https://docs.weblate.org/en/latest/admin/projects.html#manage-strings
Sorry, I don't see how that is supposed to help. (But anyway, I've pretty much given up hope on getting it implemented, and added a hack, so it's just as well to me.)
You can now add variants of a string in Weblate, including custom context. I thought that would work here as well.
@nijel Hi, can you be more specific? Can we separate the male and female versions of strings? How does it look in the GUI? How is the alternative variant stored?
There is no specific feature for male/female version of the strings. Weblate 4.5 comes with features that you can probably utilize to achieve this though:
It is a generic solution aimed at other use cases as well.
@nijel Thank you for your work! Donation sent.
OK, not to be a downer, but just fyi - I revisited the docs, and I think that unfortunately it still does not cover the initial case, which is:
Sorry, but I don't see a solution for that besides managing strings in Weblate or using rich localization formats such as Fluent, which allow you to define arbitrary conditions inside the translation.
@nijel when using variants, though, where do they go in the PO file? Does the Weblate "Context" from the Tools menu get saved in msgctxt? Is it possible to link them to the original string by looking at the resulting PO file? (For a given string in PO, find its variants in the PO).
And in Weblate, do they get associated to the same source file/string as the original string (source context file:string)?
The variants are for grouping existing strings, see https://docs.weblate.org/en/latest/devel/translations.html#variants
I mean
The additional variant for a string can also be added using the Tools while translating (when Manage strings is turned on):
Yes, that adds a string with the given msgctxt.
And do they get associated to the same source context file:string? ("Occurrence" in PO terms) If not, what do they get for occurrence?
The strings will be associated via the variant:ORIGINAL STRING flag, see https://docs.weblate.org/en/latest/devel/translations.html#manual-variants
Yes, I got that, but what does it mean in PO terms? I mean, suppose that original PO string is this:
#: ascension.tra:1014
msgid "Focus"
msgstr "Concentración"
When a manual variant is added, how will the resulting PO look?
The variant information is not stored in the PO file.
But you said that it adds a string with the given Context/msgctxt.
Yes, whatever you enter in "Context" will end up in msgctxt:
But this has nothing to do with linking these two strings together; that is currently done purely via flags in Weblate.
Sorry, I'm trying to make myself clear, but apparently not succeeding. So let's say I have the original string translated in PO:
# original string
#: ascension.tra:1014
msgid "Focus"
msgstr "Concentración"
Its PO occurrence is ascension.tra:1014.
Now I go and add a manual variant in Weblate:
The question is, after the variant is added, how does it look in the PO?
This is what I expect, but some things are not clear, see the comments:
# original string
#: ascension.tra:1014
msgid "Focus"
msgstr "Concentración"
# ADDED VARIANT. IS THIS HOW IT WILL LOOK?
#: WHAT IS HERE? (occurence)
msgid "Focus"
msgstr "Concentración-alt"
msgctxt "alt-context"
It will have no additional information:
# original string
#: ascension.tra:1014
msgid "Focus"
msgstr "Concentración"

msgctxt "alt-context"
msgid "Focus"
msgstr "Concentración-alt"
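Since the PO file carries no variant information, variants can only be recovered by looking for entries that share a msgid. A minimal Python sketch of that lookup follows. It is a deliberately simplified reader written for this example (single-line strings only, no unescaping, comments ignored), not a full PO parser, and the function names are made up here.

```python
import re
from collections import defaultdict

# Matches single-line msgctxt/msgid/msgstr fields only.
FIELD = re.compile(r'^(msgctxt|msgid|msgstr)\s+"(.*)"\s*$')


def parse_po(text):
    """Return a list of dicts with msgid, msgstr and optionally msgctxt."""
    entries, current = [], {}
    for raw in text.splitlines():
        match = FIELD.match(raw.strip())
        if not match:
            continue  # comments, blank lines, anything unsupported
        key, value = match.groups()
        # A msgctxt, or a second msgid, starts a new entry.
        if key == "msgctxt" or (key == "msgid" and "msgid" in current):
            if "msgid" in current:
                entries.append(current)
            current = {}
        current[key] = value
    if "msgid" in current:
        entries.append(current)
    return entries


def find_variants(entries):
    """Map each msgid that occurs more than once to its contexts (None = no context)."""
    by_source = defaultdict(list)
    for entry in entries:
        by_source[entry["msgid"]].append(entry.get("msgctxt"))
    return {src: ctxs for src, ctxs in by_source.items() if len(ctxs) > 1}


SAMPLE = '''# original string
#: ascension.tra:1014
msgid "Focus"
msgstr "Concentración"
msgctxt "alt-context"
msgid "Focus"
msgstr "Concentración-alt"
'''

variants = find_variants(parse_po(SAMPLE))
```

For real files, a dedicated PO library would be a safer choice than this hand-rolled reader.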
OK, I got it, thank you for the explanation. I might be able to work with that.
@nijel one more question, if I may. Supposing that I have strings and their variants already in PO as described above, but variants are not marked as such, how can I bulk mark the variants so that Weblate knows that they are in fact variants?
I assume some kind of script such as for string1 in (select * from strings); if count(select string2 from strings where string1.msgid == string2.msgid)>1; then insert into variants (string.id, string2.id); done. Should I use SQL? API? CLI?
Obviously not asking for a complete solution, just some pointers. Can't find docs on SQL schema.
Using API should work for this:
Add the flag variant:"string" to all matching units using https://docs.weblate.org/en/latest/api.html#patch--api-units-(int-id)-
The SQL schema is generated by Django ORM; there is currently no documentation on that (and there probably won't ever be any, as using SQL directly is not recommended: there might be logic constraints that are not applied at the SQL level).
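As a hedged sketch of the API route, the following Python builds (but does not send) a PATCH request for one unit. Several things here are assumptions rather than confirmed details: that the flag is set through the `extra_flags` field of the unit, that this field is replaced rather than appended to, and that the source string contains no double quotes. The host name, token, and unit id are placeholders, and `build_variant_patch` is a name invented for this example.

```python
import json
import urllib.request


def build_variant_patch(base_url, token, unit_id, original_source):
    """Build a PATCH request that sets a variant flag on one unit.

    Assumption: the unit's `extra_flags` field accepts the flag string and
    is overwritten by this request, so existing flags would need merging first.
    """
    flag = f'variant:"{original_source}"'  # assumes no double quotes in the source
    body = json.dumps({"extra_flags": flag}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/units/{unit_id}/",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; nothing is sent until urllib.request.urlopen(request) is called.
request = build_variant_patch("https://weblate.example.org", "SECRET", 42, "Focus")
```

Looping this over the unit ids whose source strings occur more than once would do the bulk marking; it is worth trying on a single unit first and checking the result in the web UI.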
Edit: see comment for clarification.
Hi. Is it possible to add an option for translators to manage msgctxts, including adding new ones?
I'm not sure if that's something needed or even viable, so let me describe the situation first:
So... that looks like a long shot, but I was thinking of a feature that would allow the translator to "split" the source string into several contexted ones and translate them individually? That will require adding newly created entries into po and pot, though.
Or maybe I'm missing an obvious way to handle such a situation? Please advise.