chinese-words-separator / chinese-words-separator.github.io


Zhuyin annotation formatting #16

Closed supercontingency closed 1 year ago

supercontingency commented 1 year ago

Traditionally, Zhuyin is annotated to the right of the Hanzi character, and the tone mark is written to the right of the Zhuyin, like this:

Screenshot 2023-03-16 4 49 56 PM

I used to think there was a technical limitation that prevented this from being done in browsers, but recently I was surprised to find that another Chrome extension (Zhuyin) actually does this, and not as an image.

image

This certainly makes it more readable for people who use Zhuyin. Could this be implemented for CWS as well?

chinese-words-separator commented 1 year ago

Yes, there are plenty of technical limitations that prevent it from being implemented like that.

Also, the Zhuyin Chrome extension does it by simply substituting a font in which the zhuyin is part of how each glyph is drawn. This produces an annotation that maps each hanzi to one pronunciation only, regardless of context, e.g.,

image

On CWS, the annotation produced for a hanzi depends on the hanzi adjacent to it, that is, on the word it belongs to:

行吧 = ㄒㄧㄥˊㄅㄚ˙

银行 = ㄧㄣˊㄏㄤˊ

On the Zhuyin Chrome extension, the 行 in the words above (see its screenshot above too) maps to one pronunciation only, regardless of context. That is, 行 is always mapped to ㄒㄧㄥˊ, even though 行 has two entirely different pronunciations:

行吧 = ㄒㄧㄥˊㄅㄚ˙

银行 = ㄧㄣˊㄒㄧㄥˊ

This is all because the Zhuyin Chrome extension just uses a hanzi font with the zhuyin hardcoded into it, so it always produces one pronunciation only.
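The difference between the two approaches can be sketched roughly as follows. This is only an illustration: the tables `CHAR_READINGS` and `WORD_READINGS`, the function names, and the greedy longest-match segmentation are all assumptions made for the example, not how CWS or the Zhuyin extension is actually implemented.

```python
# Hypothetical per-character table: one fixed reading per hanzi,
# similar in effect to a font with the zhuyin drawn into each glyph.
CHAR_READINGS = {"行": "ㄒㄧㄥˊ", "吧": "ㄅㄚ˙", "银": "ㄧㄣˊ"}

# Hypothetical word-level dictionary: readings are stored per word,
# so the same hanzi can get a different reading in a different word.
WORD_READINGS = {
    "行吧": ["ㄒㄧㄥˊ", "ㄅㄚ˙"],
    "银行": ["ㄧㄣˊ", "ㄏㄤˊ"],
}


def annotate_per_char(text):
    """Look up each hanzi on its own, ignoring its neighbours."""
    return [CHAR_READINGS.get(ch, "") for ch in text]


def annotate_per_word(text):
    """Greedy longest-match over the word dictionary (an assumed strategy),
    falling back to the per-character table for unmatched hanzi."""
    result = []
    max_len = max(len(word) for word in WORD_READINGS)
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            word = text[i:i + size]
            if word in WORD_READINGS:
                result.extend(WORD_READINGS[word])
                i += size
                break
        else:
            # No dictionary word starts here: use the single-character reading.
            result.append(CHAR_READINGS.get(text[i], ""))
            i += 1
    return result


if __name__ == "__main__":
    print(annotate_per_char("银行"))   # 行 gets its one fixed reading
    print(annotate_per_word("银行"))   # 行 gets the reading stored for 银行
```

With the per-character table, 行 receives the same reading in every word; with the word-level lookup, its reading comes from whichever word it was matched in.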

chinese-words-separator commented 1 year ago

Done

The challenge was how to put the tone at the right side of the zhuyin, but that has already been possible for two weeks now. See the tone at the right side of the zhuyin of the character 本 in the task "Zhuyin (bopomo) intonations not displayed correctly when text is vertically aligned".
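Placing the tone beside the zhuyin requires treating the tone mark separately from the phonetic symbols. A minimal sketch of that separation step is shown below; the helper name `split_tone` is hypothetical, and nothing here is taken from the CWS source. It relies only on the fact that the four zhuyin tone marks are ˊ (U+02CA), ˇ (U+02C7), ˋ (U+02CB) and ˙ (U+02D9), and that the first tone is usually left unmarked.

```python
# The four zhuyin tone marks: ˊ ˇ ˋ ˙ (the first tone is usually unmarked).
TONE_MARKS = {"\u02ca", "\u02c7", "\u02cb", "\u02d9"}
NEUTRAL_TONE = "\u02d9"


def split_tone(syllable):
    """Split one zhuyin syllable into (symbols, tone_mark).

    tone_mark is "" when the syllable carries no mark (first tone).
    The neutral-tone dot is accepted either after the syllable, as written
    in this thread, or before it, as some sources write it.
    """
    if syllable and syllable[-1] in TONE_MARKS:
        return syllable[:-1], syllable[-1]
    if syllable and syllable[0] == NEUTRAL_TONE:
        return syllable[1:], NEUTRAL_TONE
    return syllable, ""


if __name__ == "__main__":
    for s in ["ㄒㄧㄥˊ", "ㄅㄚ˙", "ㄏㄤˊ"]:
        print(split_tone(s))
```

Once the symbols and the tone mark are separate strings, the page can lay them out independently, for example with the symbols stacked vertically and the tone mark positioned to their right.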

I was just not inclined to add more data, since I felt that another annotation might bloat the page, especially one at the syllable level. Anyway, a feature is not bloat if it has enough users and the machine can handle the processing. I have a 6-year-old smartphone and tested that it can handle adding syllable-level zhuyin annotation just fine.

Example outputs:

image image image

The feature is available in version 8.24.84.1310.