mozilla / readability

A standalone version of the readability lib

Ruby-annotated characters are invisible in reader view #758

Open kbrosnan opened 2 years ago

kbrosnan commented 2 years ago

Steps to reproduce

  1. Go to https://ja.wikisource.org/wiki/%E7%AB%B9%E5%8F%96%E7%89%A9%E8%AA%9E_(%E5%9C%8B%E6%B0%91%E6%96%87%E5%BA%AB)
  2. Switch on reader view.

Expected behaviour

Characters with ruby annotation in reader view should be visible.

In non-reader view, ruby text and characters are rendered correctly:

Non Reader View

Actual behaviour

Characters with ruby annotation in reader view are not visible.

Reader View

Device name

Redmi K30i 5G

Android version

Android 11

Firefox release type

Firefox

Firefox version

97.2.0

Device logs

No response

Additional information

No response


kbrosnan commented 2 years ago

According to the Fenix bug this is reproducible on desktop.

gijsk commented 2 years ago

This is because the <rb> element on that page has aria-hidden="true", so its content gets dropped.

I'm afraid I don't speak Japanese and know effectively nothing about ruby tags. Does anyone know why Wikisource would make the ruby-annotated characters aria-hidden, i.e. hide them from assistive technology?
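To make the failure mode concrete, here is a minimal sketch in Python. It is a simplified model, not Readability's actual code. It assumes an extractor that skips every subtree whose root has aria-hidden="true". The sample markup is made up for illustration and is not copied from the Wikisource page, and the names VisibleTextExtractor and visible_text are hypothetical.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they must not affect depth tracking.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Collects text, skipping any subtree whose root has aria-hidden="true"."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.hidden_depth = 0  # > 0 while inside an aria-hidden subtree

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth > 0 or dict(attrs).get("aria-hidden") == "true":
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0:
            self.parts.append(data)


def visible_text(html):
    """Return the text that survives the aria-hidden filter."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical markup: the base character sits in an aria-hidden <rb>.
SAMPLE = (
    '<p>今は昔、<ruby><rb aria-hidden="true">竹</rb>'
    "<rt>たけ</rt></ruby>取の翁</p>"
)

print(visible_text(SAMPLE))
```

With this input, the base character 竹 is lost and only the reading たけ remains, which would match the symptom reported in reader view.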

gijsk commented 2 years ago

Different Wikimedia wikis use different templates for ruby elements. English Wikipedia has one specific to Japanese and a more generic one, neither of which hides the contents of rb.

Japanese Wikipedia appears to have a NORUBY rule, and its ruby template has some notes at the bottom that (Google-translated into English) say the following:

{{Reading}} - A template that hides the reading kana of a proper noun that is difficult to read for voice reading software (screen reader).

From these snippets of context I gather that in some cases (some parts of) the contents of the ruby tag are deemed too difficult for a screen reader to deal with. However, both of those templates (1, 2) appear to use font-size and/or other CSS to hide things (in some cases just moving part of the template's content into a title attribute instead of rendering it in the main HTML), not aria-hidden.

I don't know why Japanese Wikisource's template uses aria-hidden here, but I feel like I can't just go in and change the template.

We could potentially update readability to not drop the rb tag if that's what has aria-hidden on it, and keep the aria-hidden attribute? But ideally I'd really like some feedback from people who (a) understand Japanese, (b) understand the screen reader impact and ideally even (c) know some of the history of why this template ended up the way it did. So far, input on Mozilla's internal Slack from screen reader experts (who aren't experts on (a)/(c)) is that the use of aria-hidden in this context seems misguided.
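One way to picture that option is the Python sketch below. It is only an illustration under stated assumptions, not a patch for Readability. The rule in should_drop, the KEEP_WHEN_ARIA_HIDDEN set, and the HiddenFilter class are all hypothetical names. Elements with aria-hidden="true" are dropped unless the element is rb, in which case it is kept together with its aria-hidden attribute.

```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "wbr"}
KEEP_WHEN_ARIA_HIDDEN = {"rb"}  # assumption: the only exempted element


def should_drop(tag, attrs):
    """Hypothetical rule: aria-hidden="true" drops an element unless it is <rb>."""
    hidden = attrs.get("aria-hidden") == "true"
    return hidden and tag.lower() not in KEEP_WHEN_ARIA_HIDDEN


class HiddenFilter(HTMLParser):
    """Re-serializes HTML, omitting subtrees that should_drop() rejects."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.drop_depth = 0  # > 0 while inside a dropped subtree

    @staticmethod
    def _open(tag, attrs):
        # Attributes are written back unchanged, so aria-hidden stays on <rb>.
        rendered = "".join(f' {k}="{escape(v or "")}"' for k, v in attrs)
        return f"<{tag}{rendered}>"

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if self.drop_depth == 0 and not should_drop(tag, dict(attrs)):
                self.out.append(self._open(tag, attrs))
            return
        if self.drop_depth > 0 or should_drop(tag, dict(attrs)):
            self.drop_depth += 1
            return
        self.out.append(self._open(tag, attrs))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.drop_depth > 0:
            self.drop_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.drop_depth == 0:
            self.out.append(escape(data, quote=False))


def filter_hidden(html):
    """Return the HTML that survives the filter."""
    parser = HiddenFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Hypothetical markup: one aria-hidden <rb> and one aria-hidden <span>.
SAMPLE = (
    '<ruby><rb aria-hidden="true">竹</rb><rt>たけ</rt></ruby>'
    '<span aria-hidden="true">x</span>'
)

print(filter_hidden(SAMPLE))
```

In this sketch the span is still removed while the rb element and its attribute pass through untouched, so assistive technology would see the same aria-hidden hint as on the original page. Whether that is the right behaviour is exactly the open question above.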