w3c / ruby-t2s-req

Text to Speech of Electronic Documents Containing Ruby: User Requirements
https://w3c.github.io/ruby-t2s-req/

Heuristic for distinguishing phonetic and non-phonetic ruby? #8

Open murata2makoto opened 2 years ago

murata2makoto commented 2 years ago

@aleventhal wrote here:

Finally, I would like to know what the possibilities are for a heuristic that detects the note/complementary situation. Can we get an evaluation on how accurate that could be?

Such a heuristic is possible, but I do not believe that it will be very reliable. For example, manga and light-novel authors go crazy and think of bizarre ways of reading kanji characters for human names. 不死川 (shinazugawa) is a good example. Now, it is widely recognized thanks to the commercial success of Demon Slayer. But it is simply impossible to enumerate all such bizarre readings.

murata2makoto commented 2 years ago

The technical committee of the Japan DAISY Consortium discussed this topic today. We think that (1) automatic detection can be quite reliable, (2) it cannot be perfect, and (3) it requires a lot of development cost and run-time cost. In other words, if we can mimic the morphological analysis of modern machine translation engines and the timely updates that input methods receive for commonly used strange names, heuristics can be quite reliable, though not perfect. But can we expect this much of browser engines, which must also run well on mobile devices?
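
To make the cost/accuracy trade-off concrete, here is a minimal Python sketch of the cheapest kind of heuristic, one that looks only at scripts and lengths. It is not the morphological-analysis approach mentioned above, and the function names and the `max_kana_per_kanji` threshold are assumptions made purely for illustration.

```python
def is_kana(ch: str) -> bool:
    # Hiragana block (U+3040-U+309F) and Katakana block (U+30A0-U+30FF).
    return "\u3040" <= ch <= "\u30ff"


def is_kanji(ch: str) -> bool:
    # CJK Unified Ideographs, Extension A, and the iteration mark 々 (U+3005).
    return (
        "\u4e00" <= ch <= "\u9fff"
        or "\u3400" <= ch <= "\u4dbf"
        or ch == "\u3005"
    )


def looks_phonetic(base: str, ruby: str, max_kana_per_kanji: int = 5) -> bool:
    """Guess whether `ruby` could be a reading of `base`.

    Assumption (not from the thread): a reading is written only in kana,
    annotates at least one kanji, and is not implausibly long.
    """
    if not base or not ruby:
        return False
    if not all(is_kana(c) for c in ruby):
        return False
    kanji_count = sum(1 for c in base if is_kanji(c))
    if kanji_count == 0:
        return False
    return len(ruby) <= max_kana_per_kanji * kanji_count
```

A check like this accepts 不死川 annotated with しなずがわ, but it cannot tell a genuine reading from a kana-only annotation that is really a note. Closing that gap is where dictionaries or morphological analysis, and their costs, would come in.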

aleventhal commented 2 years ago

The technical committee of the Japan DAISY Consortium discussed this topic today. We think that (1) automatic detection can be quite reliable, (2) it cannot be perfect

Understood. The heuristic can be the fallback, but a careful author can still use semantic markup to indicate whether something is a note or not, in effect saying "do not use the default / do not guess". Falling back on the heuristic will still help in the 95+% of cases where the author has not added semantics.
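
The precedence described here could be sketched roughly as follows in Python. The hint values `"phonetic"` and `"note"`, the `author_hint` parameter, and the function name are all hypothetical; no such markup is defined in this thread.

```python
from typing import Callable, Optional


def classify_ruby(
    base: str,
    ruby: str,
    author_hint: Optional[str],
    heuristic: Callable[[str, str], bool],
) -> str:
    """Return "phonetic" or "note" for one ruby annotation.

    An explicit author hint (hypothetical values "phonetic" / "note")
    always wins; the heuristic is consulted only when no hint is given.
    """
    if author_hint in ("phonetic", "note"):
        return author_hint
    # No usable hint from the author: fall back to guessing.
    return "phonetic" if heuristic(base, ruby) else "note"
```

Passing the heuristic in as a parameter keeps the fallback replaceable, so a cheap check and a more expensive analysis could share the same override logic.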

and (3) it requires a lot of development cost and run-time cost. In other words, if we can mimic the morphological analysis of modern machine translation engines and the timely updates that input methods receive for commonly used strange names, heuristics can be quite reliable, though not perfect. But can we expect this much of browser engines, which must also run well on mobile devices?

I think it can, and it is definitely worth exploring. As an example of what's possible: some browsers and screen readers already use ML to provide automatic labelling for images that are missing alt text. This has proven to be quite useful.

murata2makoto commented 11 months ago

Two years have passed since this discussion. Having seen ChatGPT, I am more optimistic about automatically distinguishing phonetic ruby from non-phonetic ruby.