Language localization and internationalization

shepazu commented 1 year ago

Dan Gardner, via email on 4 August 2023:

I'm wondering from the big picture if this allows also for language localization. For example, if a text object is defined, can it have multiple language options? I'm thinking of

Different text languages for titles and labels to make graphics globally accessible.

Different braille languages, even contracted vs uncontracted (grade 1 vs grade 2).

I could see where the text and braille are hidden for display on some devices or routed to a braille display or screen reader, whereas on others they could be shown. Option 2 is probably not an issue except in embossing, since the text-to-braille conversion would be handled by the screen reader to braille display function, unless someone wants to force a specific braille result. I guess I'm wondering about what is in the scope of metadata. Is it just the information used to create a specific rendering in the graphic or does it apply to all objects in the graphic? I'm not sure I even understand what I'm asking, but it is really about the importance of creating objects, so they can be modified and moved around to fit accessibility requirements, methodologies or best practices rather than using more primitive glyphs, lines and flattened graphics. Or would that be covered in a companion best practices document and examples?

Currently in IVEO and Adobe Illustrator, we use layer groups to control what is shown or not depending on the capabilities we are targeting. It sounds like this is something that would be possible from this specification using the CSS for the device. In many cases though, we have to recreate the groups in our tools when they could be inherited from the original design.

shepazu commented 1 year ago

(Original reply via email)

Regarding localization (l10n) / internationalization (i18n), I think it does belong in the metadata, possibly in the provenance section. There are two equally valid approaches:

Option 1: Have a single document with multiple translations switched by some trigger, usually the BCP-47 language tag as configured by the user in their OS or browser (or, more often, set automatically by the OS or inferred from the IP address). This is very clever, but has a few downsides:
1. It increases file size and complexity.
2. It’s not very well supported by browsers, specifically in switching between different language versions, so the end user might not have the kind of control we’d want (especially multilingual speakers who might want to see the document in a preferred language).
3. Sometimes, localization requires changes to the graphics for clarity, not just the words.

Option 2: Have multiple versions of the same document, one for each language, and document the relationship between them in the metadata and/or the distribution medium. This also has downsides:
1. Poor discoverability unless the user agent (the viewer) or CMS/marketplace exposes this metadata.
2. A user might only have access to a single-language version that is not their preferred language (i.e. it’s not quite as portable).
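
To make the single-document approach concrete, here is a minimal Python sketch of how a viewer might pick one translation from several using BCP-47 tags. Everything in it is an assumption for illustration: the `TITLE_TRANSLATIONS` mapping, the `lookup_label` name, and the fallback to `en` are hypothetical and not taken from any draft of the spec. The matching is a simplified form of the "lookup" scheme in RFC 4647 (drop trailing subtags until something matches) and ignores that scheme's special handling of single-character subtags.

```python
# Hypothetical per-language titles embedded in a single document.
TITLE_TRANSLATIONS = {
    "en": "Rainfall by month",
    "de": "Niederschlag nach Monat",
    "pt-BR": "Chuva por mês",
}


def lookup_label(translations, preferred_tags, default_tag="en"):
    """Return (matched_tag, text) for the first preferred tag that matches.

    Each preferred tag is tried as-is, then with trailing subtags removed
    ("de-AT" -> "de"). Tags are compared case-insensitively.
    """
    normalized = {tag.lower(): text for tag, text in translations.items()}
    for tag in preferred_tags:
        parts = tag.lower().split("-")
        while parts:
            candidate = "-".join(parts)
            if candidate in normalized:
                return candidate, normalized[candidate]
            parts.pop()  # drop the last subtag and try again
    # Nothing matched: fall back to the default language (text may be None).
    return default_tag.lower(), normalized.get(default_tag.lower())


if __name__ == "__main__":
    print(lookup_label(TITLE_TRANSLATIONS, ["de-AT", "en"]))
    print(lookup_label(TITLE_TRANSLATIONS, ["fr"]))
```

Taking an ordered list of preferred tags is one way to address the concern about multilingual readers: the preference order can come from the user rather than from a single OS setting.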

Overall, I prefer option 2, a single language per file. Its downsides are more easily ameliorated through things the user agent (reader) or distribution medium (marketplace) can control, rather than through requirements on the author (which they may not know or understand).

BTW, author/publisher provenance is also a way to exert soft control over access rights management, which would allow a marketplace to provide multiple language versions for the same or a discounted price.

I’ll draw up an option for this in the next draft.

Regarding the scope of the metadata / spec: in my conception, both how braille is handled and the ability to target specific groups or elements are in scope.

The whole reason to have selectors and conditional media queries is to allow for different ways of displaying the graphics or labels based on our needs.
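
As a rough illustration of that idea, the Python sketch below filters elements by the output they are meant for, similar in spirit to the layer groups Dan describes. The `Element` class, its `targets` field, and the output names (`screen`, `embosser`, `braille-display`) are all hypothetical; the actual selector and media-query syntax is still to be defined in the draft.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    id: str
    kind: str  # e.g. "label", "braille-label", "shape"
    # Hypothetical: outputs this element is intended for; empty means "all".
    targets: set = field(default_factory=set)


# Hypothetical graphic with a print title, a braille title, and a shared axis.
ELEMENTS = [
    Element("title-text", "label", {"screen"}),
    Element("title-braille", "braille-label", {"embosser", "braille-display"}),
    Element("axis", "shape"),
]


def visible_elements(elements, output):
    """Return the ids of elements that should be rendered for `output`."""
    return [e.id for e in elements if not e.targets or output in e.targets]


if __name__ == "__main__":
    print(visible_elements(ELEMENTS, "screen"))
    print(visible_elements(ELEMENTS, "embosser"))
```

Because the targeting information travels with the elements, a tool would not need to recreate the groups by hand for each device.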

That too will be spelled out in the next draft, and hopefully will be clearer, to allow us to discuss different options.

shepazu commented 1 year ago

John Gardner, via email on 4 August 2023:

I see the raw data in the spec Dan, since we have discussed it several times. I personally do not like the second model for language. Separate but equal?? Never is. I do not quite understand how the graphic can be language-dependent. Of course it may well be dependent on location and culture, but that is just the way it is. And good thing too!

shepazu commented 1 year ago

Dan and I had a great chat that covered this and more.

A couple points:

On language: Dan and I agreed that we would need both localization models; we need to support however authors want to do this, and there are valid use cases for both.

On localizing graphics: One of the more obvious ways graphics may have to change is the layout shifting to accommodate longer labels (such as in prolix orthographies like German), but there are also considerations of adapting symbols or colors for clarity for a particular culture, and even legal things (China doesn’t allow maps that show Hong Kong or Tibet as autonomous regions, IIRC). Arguably, those are different graphics documents, but that’s one reason we need to enable both a multilanguage and multidocument model.

We also talked about providing "lookup tables" for localization, and my next draft will take a stab at that.
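
One possible shape for such a lookup table, sketched in Python purely for discussion: labels in the graphic refer to keys, and a separate table maps each key to its text per language. The key names, the `LOOKUP` table, and the `localize` function are assumptions made up for this example, not part of any draft.

```python
# Hypothetical lookup table: label key -> {language tag: text}.
LOOKUP = {
    "title": {"en": "Rainfall by month", "de": "Niederschlag nach Monat"},
    "axis.x": {"en": "Month", "de": "Monat"},
    "axis.y": {"en": "Rainfall (mm)"},  # no German entry yet
}


def localize(keys, table, lang, fallback="en"):
    """Resolve label keys for `lang`.

    Returns (resolved, missing): `resolved` maps each key to its text, using
    the fallback language (or the key itself) when `lang` has no entry;
    `missing` lists the keys that had no entry for `lang`.
    """
    resolved, missing = {}, []
    for key in keys:
        entry = table.get(key, {})
        if lang in entry:
            resolved[key] = entry[lang]
        else:
            missing.append(key)
            resolved[key] = entry.get(fallback, key)
    return resolved, missing


if __name__ == "__main__":
    labels, untranslated = localize(["title", "axis.x", "axis.y"], LOOKUP, "de")
    print(labels)
    print(untranslated)
```

Returning the list of missing keys lets a tool or marketplace flag incomplete translations instead of silently mixing languages. A table like this could serve either model: embedded in a multilanguage document, or used to generate one document per language.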

Inclusio-Community / json-image-metadata

Language localization and internationalization #2