whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Support language of parts for portions of the title tag contents #8279

Open ChasBelov opened 2 years ago

ChasBelov commented 2 years ago

If a title contains two or more languages, it is not currently possible to mark it for the WCAG-required language of parts (Success Criterion 3.1.2).

For example, if a page is titled "Este es el titulo de esta pagina (This is the title of this page)" on a page with the overall language indication lang="es", then the following tag examples both violate language of parts:

<title>Este es el titulo de esta pagina (This is the title of this page)</title>
<title lang="en">Este es el titulo de esta pagina (This is the title of this page)</title>

The following would meet language of parts, but currently violates the HTML spec:

<title>Este es el titulo de esta pagina <span lang="en">(This is the title of this page)</span></title>

I am requesting that this be made permitted under the spec.

brennanyoung commented 2 years ago

This proposal (as is) requires first that span be permitted inside title. Right now, title may contain only a flat string (no tags).
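
To illustrate the point, here is a minimal sketch of what happens today if the markup from the request is used as-is. The surrounding page is hypothetical. Because the contents of title are parsed as text, the span tags are not turned into an element; they end up as literal characters in the document title.

```html
<!DOCTYPE html>
<html lang="es">
<head>
  <meta charset="utf-8">
  <!-- The parser treats everything up to </title> as text, so no span
       element is created and the lang="en" attribute has no effect. -->
  <title>Este es el titulo de esta pagina <span lang="en">(This is the title of this page)</span></title>
</head>
<body>
  <!-- Reading document.title here would return a string that still
       contains the literal characters <span lang="en"> and </span>. -->
</body>
</html>
```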

The use case (multilingual document title) is genuine enough, though, and might conceivably be addressed in some other way.

I was just browsing RFC 5646, aka BCP 47 (which specifies the format for lang attribute values), and it mentions "mul" (i.e. "multiple languages") as a possible value.

Apparently it is also permitted to provide a list of language values, but I don't know how well this gets handled today.

Even so, if the spec already permits something like lang="en, es", there is a shorter road ahead for this issue.

r12a commented 2 years ago

> Even so, if the spec already permits something like lang="en, es",

It doesn't. The lang attribute identifies the language of a range of content so that spellcheckers, special line break rules, hyphenation, voice browsers, etc. can know what they are dealing with (see https://www.w3.org/International/questions/qa-text-processing-vs-metadata.en.html) and apply appropriate algorithms. So it needs to be a single language.

In a similar way, using mul doesn't provide the browser with very helpful information: it still needs to know which part is Spanish and which is English so that it can apply the appropriate heuristics.

So yes, some inline markup is needed, but the Internationalization WG tried to argue that case before and didn't get far. I still think it's a good idea, but I'm not holding my breath.

And btw inline markup would help us manage text direction in the same way as in other parts of the HTML page, too, which would be nice. At least it's possible to fall back to invisible Unicode formatting characters for that, though they aren't ideal. No such fallback for language, though.
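
As a rough sketch of the direction fallback mentioned above, a title can wrap a right-to-left phrase in invisible Unicode isolate characters, written here as character references. The page and its title text are made up for illustration, and how a given browser displays the title in its UI may vary.

```html
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
  <meta charset="utf-8">
  <!-- &#x2067; is U+2067 RIGHT-TO-LEFT ISOLATE and &#x2069; is U+2069
       POP DIRECTIONAL ISOLATE. Character references are decoded inside
       title, so the Hebrew phrase and its exclamation mark are wrapped
       in an invisible right-to-left isolate. -->
  <title>Greeting: &#x2067;שלום עולם!&#x2069; (Hello world!)</title>
</head>
<body></body>
</html>
```

Nothing comparable can be done for language: these characters only affect direction, so the Hebrew phrase above still inherits lang="en" from the root element.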

hope that helps