PixarAnimationStudios / OpenUSD-proposals

Share and collaborate on proposals for the advancement of USD

Language tags for text proposal #49

Closed: meshula closed this issue 1 month ago

meshula commented 1 month ago

@PierreWang wrote:

Q: A language attribute may be required so that the implementation knows which language the script is in. A: We can determine the script from the Unicode code point of each character. Because a font covers only a limited set of character sets, a given character may not be supported by the current font. In that case, we can either display a default character (such as a blank rectangle) or use font substitution to find another font that can display the character. So I don't think we need to add a language attribute.

In https://github.com/PixarAnimationStudios/OpenUSD-proposals/pull/48 @dgovil wrote:

You have various issues like Han unification that prevent a script being truly inferable, but outside of that it's generally not an issue.
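To make the Han unification point concrete, here is a rough Python sketch. It assumes the first word of a character's Unicode name is a good enough stand-in for its script; that is a heuristic for illustration, not the real Unicode Script property, and `rough_script` is a hypothetical helper name.

```python
import unicodedata


def rough_script(ch: str) -> str:
    """Guess a script label from the first word of the Unicode character name.

    Heuristic only: this is not the Unicode Script property.
    """
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Some code points (for example control characters) have no name.
        return "UNKNOWN"


# Latin, Cyrillic, and Hiragana characters map to distinct labels,
# but a unified Han ideograph only reports a generic "CJK" label.
for ch in ["a", "я", "あ", "直"]:
    print(repr(ch), rough_script(ch))
```

The last character is a unified Han ideograph used in both Chinese and Japanese text. The code point alone identifies the script but not the language, which is the gap a language tag would fill.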

A language consideration for the text proposal would make sense.

It is usefully separate from accessibility as well since it could be shared across multiple domains. Imagine a scenario where you have text in your scene but want to present different text per language as well.

In effect, what I'd propose is that the USD file have stage-level metadata (and possibly prim-level metadata, with Spiff's new proposal) to specify the language. I'd also propose that the language be allowed as a purpose on attributes, for when an attribute's language differs from the primary language.

Something like

#usda 1.0
(
    language = "en_ca"
)

def "foo"
{
    string text = "Colours are awesome"
    string text:en_us = "Colors are awesome, but the letter U is not"
    string text:fr = "La couleur est géniale"
}

The benefit here is that you can also specify translations in separate USD files and layer them in.
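As a rough illustration of how a consumer might read such attributes, the sketch below models layers as plain Python dictionaries rather than using the USD API. The fallback order (full tag, then bare language, then the untagged attribute) and the helper name `resolve_text` are assumptions made for this example; the thread does not specify a lookup rule.

```python
from typing import Dict, Optional


def resolve_text(attrs: Dict[str, str], base: str, language: Optional[str]) -> str:
    """Return the best value of `base` for `language`.

    Assumed order: "base:en_us", then "base:en", then the untagged "base".
    """
    candidates = []
    if language:
        tag = language.lower()
        candidates.append(f"{base}:{tag}")
        primary = tag.split("_")[0]
        if primary != tag:
            candidates.append(f"{base}:{primary}")
    candidates.append(base)
    for name in candidates:
        if name in attrs:
            return attrs[name]
    raise KeyError(base)


# The base "layer" holds the primary-language text.
base_layer = {"text": "Colours are awesome"}
# A separate "layer" holds only translations.
translations = {
    "text:en_us": "Colors are awesome, but the letter U is not",
    "text:fr": "La couleur est géniale",
}
# Merging the dictionaries stands in for layering the translation file over the base.
composed = {**base_layer, **translations}

for lang in ["en_us", "fr_ca", "de", None]:
    print(lang, "->", resolve_text(composed, "text", lang))
```

Because the translations live in their own dictionary, they can be added or swapped without touching the base text, which mirrors the benefit of keeping them in a separate USD file.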

Another reason I'd advocate for language tags is that a theoretical screen reader could enunciate words differently based on the language specified.

meshula commented 1 month ago

@dgovil

On accessibility, would there be a need for a suffix that says "tts" (text to speech) or other affordance? Or are those concerns strictly orthogonal?

Would you imagine that USD would normatively list "fr", "en_us", "braille" etc., or would it normatively reference a master list somewhere?

meshula commented 1 month ago

@spiffmon

Capturing your feedback here:

I really like the multiple language encoding idea, @dgovil. Perhaps not as germane for labels on parts, but for anything more conversational or longer, it'll be useful to be able to intentionally provide non-auto-generated translations.

And I'd advocate skipping layer metadata entirely for the "default encoding" tag, and going directly to an API schema; in multi-user environments, you'll likely see avatars being referenced in from many different parts of the world, I'd imagine?

meshula commented 1 month ago

Further, on alt-text:

Q: What is the "alt-text" solution if we cannot support a character? A: Sure. If a character is not supported, we need to display a default character, or use font substitution to find another font that supports the character.
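The fallback described in this answer can be sketched in a few lines of Python. Everything here is hypothetical: the `Font` class, the font names, and their coverage sets are made up for illustration, and a real text renderer would query actual font tables instead.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple

# WHITE SQUARE, used here as a stand-in for the default "blank rectangle" glyph.
DEFAULT_GLYPH = "\u25a1"


@dataclass(frozen=True)
class Font:
    name: str
    coverage: FrozenSet[str]  # characters this hypothetical font can draw


def choose_glyph(ch: str, primary: Font, fallbacks: Sequence[Font]) -> Tuple[str, str]:
    """Return (font name, character to draw) for one character."""
    # Try the current font first, then each substitute font in order.
    for font in (primary, *fallbacks):
        if ch in font.coverage:
            return font.name, ch
    # No font covers the character: draw the default glyph with the primary font.
    return primary.name, DEFAULT_GLYPH


latin = Font("LatinSans", frozenset("abcdefghijklmnopqrstuvwxyz"))
cjk = Font("CJKSans", frozenset("直色"))

for ch in "a直я":
    print(ch, choose_glyph(ch, latin, [cjk]))
```

Note that this picks a font by coverage alone. It does not know which language the text is in, so it cannot choose between regional variants of the same code point.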

dgovil commented 1 month ago

Thanks, Nick. I'll put together a more formal proposal for this and share it as soon as I get a chance.

dgovil commented 1 month ago

I put up a proposal here: https://github.com/PixarAnimationStudios/OpenUSD-proposals/pull/55. I feel it is quite generic, so it can be applied across multiple uses if need be.

meshula commented 1 month ago

Thanks, Dhruv. To minimize redundancy, I am closing this issue so the conversation can continue on PR #55.