unicode-org / message-format-wg

Developing a standard for localizable message strings

Incomplete variant set translation #52

Closed by zbraniecki 7 months ago

zbraniecki commented 4 years ago

One of the features we have floated for Fluent for quite a while now is the idea of allowing users to translate only a subset of an infinite set of terms in some category.

For example, city names. City names in Fluent could be translated as such:

-city-name = { $name ->
    [New York] Nowy Jork
    [London] Londyn
    [Beijing] Pekin
    [Paris] Paryż
   *[other] { $name }
}

location-info = You are in { -city-name($name) }

Or, incomplete user name declensions:

-user-name = { $name ->
    [Jan] { $inflection ->
       *[nominative] Jan
        [genitive] Jana
        [dative] Janowi
        [accusative] Jana
        [instrumental] Janem
        [locative] Janie
        [vocative] Janie
    }
    [Anna] { $inflection ->
       *[nominative] Anna
        [genitive] Anny
        [dative] Annie
        [accusative] Annę
        [instrumental] Anną
        [locative] Annie
        [vocative] Anno
    }
   *[other] { $name }
}

welcome-msg = Witaj { -user-name($name, inflection: "vocative") }

This is not a great message, and I certainly would prefer to use flattened lists; in fact, I'd even prefer a separate UI in CAT tools for building such a message.

At the same time, the ability to somehow denote an incomplete list of elements that get an "improved" translation, with a fallback to the original untranslated variable, turns out to be useful for incrementally improving the UX without requiring a whole new ecosystem of, say, declined Polish names.

I'd like to suggest that, in some way or form, we build in the ability to accept a variable and present it as it was passed in unless an "improved" version has been added for the given term.
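
To make the intended semantics concrete, here is a minimal Python sketch of the two examples above. It is not Fluent's API or any proposed MessageFormat syntax; the table and function names (IMPROVED_CITY_NAMES, city_name, user_name, welcome_msg) are made up for illustration, and the choice to fall back to the nominative form when an inflection is missing is my assumption, mirroring the default variant in the snippets.

```python
# Hypothetical sketch: "improved" translations with fallback to the raw input.
# None of these names come from Fluent; they only illustrate the lookup rule.

IMPROVED_CITY_NAMES = {
    "New York": "Nowy Jork",
    "London": "Londyn",
    "Beijing": "Pekin",
    "Paris": "Paryż",
}

IMPROVED_USER_NAMES = {
    "Jan": {
        "nominative": "Jan", "genitive": "Jana", "dative": "Janowi",
        "accusative": "Jana", "instrumental": "Janem",
        "locative": "Janie", "vocative": "Janie",
    },
    "Anna": {
        "nominative": "Anna", "genitive": "Anny", "dative": "Annie",
        "accusative": "Annę", "instrumental": "Anną",
        "locative": "Annie", "vocative": "Anno",
    },
}


def city_name(name: str) -> str:
    """Return the improved translation, or the input unchanged."""
    return IMPROVED_CITY_NAMES.get(name, name)


def user_name(name: str, inflection: str = "nominative") -> str:
    """Return the declined form if the name is known, else the input."""
    forms = IMPROVED_USER_NAMES.get(name)
    if forms is None:
        return name  # unknown name: degrade to the untranslated variable
    # Assumption: a missing inflection falls back to the nominative form.
    return forms.get(inflection, forms["nominative"])


def welcome_msg(name: str) -> str:
    return f"Witaj {user_name(name, 'vocative')}"
```

With these tables, welcome_msg("Anna") uses the vocative "Anno", while a name with no entry is shown exactly as it was passed in.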

Fleker commented 4 years ago

So having a settings page that surfaces translations to each user? Would that be useful for tying into a crowdsourcing system?

mihnita commented 4 years ago

I was also thinking about something like this. Probably less about huge lists like city names, but for shorter ones (unclear where the limit between "short" and "huge" is :-)

For example (ignore the syntax)

"You deleted the last {item}"
item in ["file", "picture", "folder", ...]

The bigger problem is when some items require changes in the sentence itself (because of gender, or number, or because they start with a vowel, whatever the case). Then what?

zbraniecki commented 4 years ago

> Then what?

In my mental model for this feature, then you degrade.

The entry point is that you receive something from outside that you know nothing about: a word like "New York" or "File". You can't reason about it; you don't know its gender, plural form, etc. All you know is that it has to be displayed in the translation.

Without any ability to do anything better with it, you'll have to go for some semi-okayish translation.

With this model, a localizer can provide "improved" variants for the most common scenarios, which will benefit from that additional knowledge. If there is no such additional info for a given term, the worst-case scenario is exactly what you started with.

Your case, in my approach (pardon my pseudo-syntax), could benefit from something like:

-item = { $itemName ->
        [file] Plik
        [picture] Obraz
        [folder] Katalog
       *[other] { $itemName }
    }
    .gender = { $itemName -> 
        [file] masculine
        [picture] feminine
        [folder] neuter
       *[other] neuter
    }

msg-one = { -item.gender($itemName) ->
    [masculine] Otwórz swój { -item($itemName) }
    [feminine] Otwórz swoją { -item($itemName) }
   *[neuter] Otwórz swojego { -item($itemName) }
}
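
The same degradation can be sketched in plain Python. This is only an illustration under assumptions: Term, ITEMS, PATTERNS, DEFAULT_GENDER, resolve_item and msg_one are hypothetical names, and the strings and gender assignments are copied as-is from the pseudo-syntax above rather than checked against a real Polish lexicon.

```python
# Hypothetical sketch: a term carries optional metadata (here, gender);
# unknown terms degrade to the raw input plus a default gender.
from typing import NamedTuple


class Term(NamedTuple):
    text: str
    gender: str


# "Improved" entries supplied by a localizer (values taken from the example).
ITEMS = {
    "file": Term("Plik", "masculine"),
    "picture": Term("Obraz", "feminine"),
    "folder": Term("Katalog", "neuter"),
}

# Assumption: mirrors the *[other] / *[neuter] defaults in the pseudo-syntax.
DEFAULT_GENDER = "neuter"

PATTERNS = {
    "masculine": "Otwórz swój {item}",
    "feminine": "Otwórz swoją {item}",
    "neuter": "Otwórz swojego {item}",
}


def resolve_item(item_name: str) -> Term:
    """Known items get text and gender; unknown ones pass through unchanged."""
    return ITEMS.get(item_name, Term(item_name, DEFAULT_GENDER))


def msg_one(item_name: str) -> str:
    term = resolve_item(item_name)
    return PATTERNS[term.gender].format(item=term.text)
```

An item without an entry, such as "spreadsheet", ends up in the default pattern with its name untouched, which is the "worst case is what you started with" behavior described above.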

nbouvrette commented 4 years ago

Interesting idea - I'm having a hard time picturing this in English. Would you require all keys to be defined as well?

Also, knowing that most TMSes like to keep source and target files with the same number of keys - would translators need to write the "rules/syntax" in their target languages?

aphillips commented 10 months ago

Re-consider post-2.0?