Closed murata2makoto closed 2 months ago
I think that this example demonstrates a real problem of using ruby annotations for TTS. It deserves to be mentioned in this note.
Here are other examples of ruby annotations containing は. If these ruby annotations (rather than the ruby bases) are sent to TTS engines, は is mispronounced.
Here are examples of へ occurring in ruby annotations. Sending them (rather than the ruby bases) to TTS engines is very likely to cause trouble.
In my opinion, ruby is ill-suited for TTS and should be considered only a reading aid. It is normally added only to the first instance of a word in the text, and in many cases it is not laid out strictly 1:1 with the annotated base text. Modern LLMs will do a better job anyway.
In the modern Japanese language, there is basically only one way to read each kana character. But は and へ are exceptions. は is usually read aloud as /ha/ but is read aloud as /wa/ when this character is used as a particle. Meanwhile, へ is usually read aloud as /he/ but is read aloud as /e/ when it is used as a particle. Thus, to read these characters correctly, correct morphological analysis is a must.
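To make the rule concrete, here is a minimal sketch. It assumes the text has already been split into tokens and tagged by hand; the `Token` shape, the `"particle"` tag, and the function name `spoken_kana` are all hypothetical and do not come from any real morphological analyzer.

```python
from typing import List, Tuple

# Hypothetical, hand-tagged token: (surface kana, part of speech).
Token = Tuple[str, str]

# The two kana whose reading changes when they are used as particles.
PARTICLE_READINGS = {"は": "わ", "へ": "え"}


def spoken_kana(tokens: List[Token]) -> str:
    """Return the kana as they would be read aloud.

    は and へ are replaced by わ and え only when tagged as particles;
    every other token is passed through unchanged.
    """
    out = []
    for surface, pos in tokens:
        if pos == "particle" and surface in PARTICLE_READINGS:
            out.append(PARTICLE_READINGS[surface])
        else:
            out.append(surface)
    return "".join(out)


# わたし + は (particle) + はな: only the particle は changes.
example = [("わたし", "noun"), ("は", "particle"), ("はな", "noun")]
print(spoken_kana(example))
```

Without the part-of-speech tag there is no way to tell the two は apart, which is why the tagging step matters.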
A side effect of using ruby annotations rather than ruby bases for TTS is that morphological analysis typically fails. In particular, particle は and へ in ruby annotations are sometimes mistakenly pronounced as /ha/ and /he/. This is demonstrated by the 淀藩背信 example used in the last TPAC. Such mistakes confuse users a lot.
Even when we use ruby text rather than base characters for TTS, it might be possible to avoid such mistakes by inserting a space character or some other silent Unicode character between the immediately preceding は or へ and the ruby text.
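A rough sketch of that workaround follows. The `Segment` shape, the function name `tts_text_from_ruby`, and the choice of a plain space as the default separator are assumptions made for illustration only; whether a given TTS engine actually treats the separator as a word boundary would need to be tested.

```python
from typing import List, Optional, Tuple

# Hypothetical segment: (base text, ruby annotation or None if unannotated).
Segment = Tuple[str, Optional[str]]

AMBIGUOUS_KANA = ("は", "へ")


def tts_text_from_ruby(segments: List[Segment], separator: str = " ") -> str:
    """Build TTS input that uses ruby text in place of ruby bases.

    A separator is inserted before a ruby text whenever the text emitted
    so far ends in は or へ, so the two are not fused into one kana run.
    """
    parts: List[str] = []
    for base, ruby in segments:
        if ruby is not None and parts and parts[-1].endswith(AMBIGUOUS_KANA):
            parts.append(separator)
        parts.append(ruby if ruby is not None else base)
    return "".join(parts)


segments = [("私", "わたし"), ("は", None), ("花", "はな"), ("が好き", None)]
print(tts_text_from_ruby(segments))
```

This only handles a は or へ that immediately precedes a ruby text; a は or へ inside the ruby annotation itself is not addressed by this sketch.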
Should we describe this problem in this note?