openvanilla / McBopomofo

小麥注音輸入法 (McBopomofo, a Bopomofo input method)
http://mcbopomofo.openvanilla.org/
MIT License

Fix phonetics for 俄 #535

Closed xatier closed 1 month ago

xatier commented 2 months ago

The concise dictionary prefers ㄜˊ, we provide both for fault tolerance.

https://dict.concised.moe.edu.tw/dictView.jsp?ID=39379&q=1

俄頃 and 俄而 are the exceptions, as they have different meanings.

lukhnos commented 1 month ago

@xatier I'm more reluctant to accept this PR and #530. I think it comes down to a few things. There's the diminishing return of this type of effort in the name of fault tolerance: is there evidence that this really helps a Taiwanese Mandarin user?

I understand that "this is what the official dictionary says", but anecdotally, reading 俄 as ㄜˋ is very typical of Taiwanese Mandarin, whereas reading 俄 as ㄜˊ is clearly not. #530 is similar.

xatier commented 1 month ago

I don't have a strong preference for these proposed pronunciations, either. What I do notice is that we already have a mix of both sides (the concise dictionary form vs. the common form) for many characters; providing consistency and fault tolerance is my goal for these PRs. https://github.com/openvanilla/McBopomofo/pull/530 is a great example of this: the current dictionary file is highly inconsistent.

As with all my previous PRs, feel free to close them if you find they don't provide value for 小麥 users.

xatier commented 1 month ago

IMHO, 亞 and 俄 are very similar to https://github.com/openvanilla/McBopomofo/pull/522. I would like to have both pronunciations if we find they are used with equal frequency. Another approach is to ensure all terms have ㄧㄚˇ and ㄜˋ, but NOT add the concise dictionary forms (ㄧㄚˋ and ㄜˊ) to them while keeping the existing ones without deletions.

We can also review them case-by-case, just let me know which terms to exclude.

Please let me know which approach you would prefer, and we can start from there. :)

lukhnos commented 1 month ago

> Another approach is to ensure all terms have ㄧㄚˇ and ㄜˋ, but NOT add the concise dictionary forms (ㄧㄚˋ and ㄜˊ) to them while keeping the existing ones without deletions.

I like the idea, but let's be a bit more nuanced and conservative. For 亞, I'm comfortable with adding the missing ㄧㄚˇ to the entries that only have the ㄧㄚˋ reading. For 俄, I suggest you only add ㄜˋ to entries that are related to 俄國, 俄語系, etc., that is, when it's used in a proper name. The thing is that Classical Chinese terms such as 俄頃 are never read with the fourth tone, even in Taiwanese Mandarin.
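The rule described above can be sketched roughly as follows. This is only an illustration, not the project's tooling: the function name `add_alternate_reading`, the `(phrase, syllables)` entry shape, the sample entries, and the hand-maintained exclusion set are all assumptions made for the example.

```python
def add_alternate_reading(entries, char, existing, alternate, exclude=frozenset()):
    """Propose new entries where `existing` is swapped for `alternate` on `char`.

    `entries` is a list of (phrase, syllables) pairs (a hypothetical in-memory
    shape). Phrases in `exclude` are left alone, and nothing is proposed when
    the alternate form is already present. No existing entry is deleted.
    """
    present = {(phrase, tuple(syllables)) for phrase, syllables in entries}
    additions = []
    for phrase, syllables in entries:
        # Skip excluded phrases and entries whose characters and syllables don't align.
        if phrase in exclude or len(phrase) != len(syllables):
            continue
        swapped = tuple(
            alternate if c == char and s == existing else s
            for c, s in zip(phrase, syllables)
        )
        if swapped != tuple(syllables) and (phrase, swapped) not in present:
            additions.append((phrase, swapped))
            present.add((phrase, swapped))
    return additions


# Hypothetical sample data, for illustration only.
sample_entries = [
    ("俄國", ["ㄜˊ", "ㄍㄨㄛˊ"]),
    ("俄文", ["ㄜˊ", "ㄨㄣˊ"]),
    ("俄文", ["ㄜˋ", "ㄨㄣˊ"]),   # already has the fourth-tone form
    ("俄頃", ["ㄜˊ", "ㄑㄧㄥˇ"]),  # Classical Chinese term, excluded below
]

# Assumed exclusion list: the Classical Chinese terms named in this thread.
classical_terms = {"俄頃", "俄而"}

proposed = add_alternate_reading(
    sample_entries, "俄", "ㄜˊ", "ㄜˋ", exclude=classical_terms
)
```

With this sample, only 俄國 gets a new ㄜˋ entry: 俄文 already has one and 俄頃 is excluded. The same function would cover 亞 by passing ㄧㄚˋ as the existing reading and ㄧㄚˇ as the alternate.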

tianjianjiang commented 1 month ago

> > Another approach is to ensure all terms have ㄧㄚˇ and ㄜˋ, but NOT add the concise dictionary forms (ㄧㄚˋ and ㄜˊ) to them while keeping the existing ones without deletions.
>
> I like the idea, but let's be a bit more nuanced and conservative. For 亞, I'm comfortable with adding the missing ㄧㄚˇ to the entries that only have the ㄧㄚˋ reading. For 俄, I suggest you only add ㄜˋ to entries that are related to 俄國, 俄語系, etc., that is, when it's used in a proper name. The thing is that Classical Chinese terms such as 俄頃 are never read with the fourth tone, even in Taiwanese Mandarin.

My thoughts are similar to @lukhnos', and MOE dict actually mentions ㄧㄚˇ and ㄜˋ, especially when ㄜˋ is indeed only for 俄國.

xatier commented 1 month ago

> My thoughts are similar to @lukhnos', and MOE dict actually mentions ㄧㄚˇ and ㄜˋ, especially when ㄜˋ is indeed only for 俄國.

I believe this page does prefer ㄜˊ for Russia, though.

https://dict.concised.moe.edu.tw/dictView.jsp?ID=39379&q=1

tianjianjiang commented 1 month ago

> > My thoughts are similar to @lukhnos', and MOE dict actually mentions ㄧㄚˇ and ㄜˋ, especially when ㄜˋ is indeed only for 俄國.
>
> I believe this page does prefer ㄜˊ for Russia, though.
>
> https://dict.concised.moe.edu.tw/dictView.jsp?ID=39379&q=1

I don't disagree. Just saying that it was ㄜˋ, see https://dict.revised.moe.edu.tw/dictView.jsp?ID=10266&q=1&word=%E4%BF%84. I suspect it's the influence from Beijing Mandarin. Another example is 法蘭西 vs. 法 (國). When I was a kid, it was common that transliterations and their single-character aliases could have different tones in Taiwan Mandarin, but not in Beijing Mandarin.

xatier commented 1 month ago

I'm not sure if the MOE is adopting Beijing tones, but I've had the same experience: all these characters are pronounced differently from what I learned in school.

xatier commented 1 month ago

I've updated this PR to include only the additions of ㄜˋ.

Here's the current state of the dictionary:

$ ./find.py 俄 ㄜˊ
俄亥俄 ㄜˊ ㄏㄞˋ ㄜˊ
俄人 ㄜˊ ㄖㄣˊ
俄共 ㄜˊ ㄍㄨㄥˋ
俄國 ㄜˊ ㄍㄨㄛˊ
俄國人 ㄜˊ ㄍㄨㄛˊ ㄖㄣˊ
俄式 ㄜˊ ㄕˋ
俄文 ㄜˊ ㄨㄣˊ
俄文系 ㄜˊ ㄨㄣˊ ㄒㄧˋ
俄文組 ㄜˊ ㄨㄣˊ ㄗㄨˇ
俄羅斯 ㄜˊ ㄌㄨㄛˊ ㄙ
俄羅斯人 ㄜˊ ㄌㄨㄛˊ ㄙ ㄖㄣˊ
俄而 ㄜˊ ㄦˊ
俄語 ㄜˊ ㄩˇ
俄語系 ㄜˊ ㄩˇ ㄒㄧˋ
俄語組 ㄜˊ ㄩˇ ㄗㄨˇ
俄軍 ㄜˊ ㄐㄩㄣ
俄頃 ㄜˊ ㄑㄧㄥˇ
帝俄 ㄉㄧˋ ㄜˊ
帝俄時代 ㄉㄧˋ ㄜˊ ㄕˊ ㄉㄞˋ
懷俄明州 ㄏㄨㄞˊ ㄜˊ ㄇㄧㄥˊ ㄓㄡ
日俄戰爭 ㄖˋ ㄜˊ ㄓㄢˋ ㄓㄥ
沙俄 ㄕㄚ ㄜˊ

$ ./find.py 俄 ㄜˋ
中俄 ㄓㄨㄥ ㄜˋ
俄亥俄 ㄜˋ ㄏㄞˋ ㄜˋ
俄亥俄州 ㄜˋ ㄏㄞˋ ㄜˋ ㄓㄡ
俄人 ㄜˋ ㄖㄣˊ
俄共 ㄜˋ ㄍㄨㄥˋ
俄國 ㄜˋ ㄍㄨㄛˊ
俄國人 ㄜˋ ㄍㄨㄛˊ ㄖㄣˊ
俄式 ㄜˋ ㄕˋ
俄文 ㄜˋ ㄨㄣˊ
俄文系 ㄜˋ ㄨㄣˊ ㄒㄧˋ
俄文組 ㄜˋ ㄨㄣˊ ㄗㄨˇ
俄新網 ㄜˋ ㄒㄧㄣ ㄨㄤˇ
俄羅斯 ㄜˋ ㄌㄨㄛˊ ㄙ
俄羅斯人 ㄜˋ ㄌㄨㄛˊ ㄙ ㄖㄣˊ
俄羅斯共和國 ㄜˋ ㄌㄨㄛˊ ㄙ ㄍㄨㄥˋ ㄏㄜˊ ㄍㄨㄛˊ
俄羅斯方塊 ㄜˋ ㄌㄨㄛˊ ㄙ ㄈㄤ ㄎㄨㄞˋ
俄羅斯籍 ㄜˋ ㄌㄨㄛˊ ㄙ ㄐㄧˊ
俄語 ㄜˋ ㄩˇ
俄語系 ㄜˋ ㄩˇ ㄒㄧˋ
俄語組 ㄜˋ ㄩˇ ㄗㄨˇ
俄軍 ㄜˋ ㄐㄩㄣ
反共抗俄 ㄈㄢˇ ㄍㄨㄥˋ ㄎㄤˋ ㄜˋ
帝俄 ㄉㄧˋ ㄜˋ
帝俄時代 ㄉㄧˋ ㄜˋ ㄕˊ ㄉㄞˋ
懷俄明州 ㄏㄨㄞˊ ㄜˋ ㄇㄧㄥˊ ㄓㄡ
日俄 ㄖˋ ㄜˋ
日俄戰爭 ㄖˋ ㄜˋ ㄓㄢˋ ㄓㄥ
歐俄 ㄡ ㄜˋ
沙俄 ㄕㄚ ㄜˋ
白俄 ㄅㄞˊ ㄜˋ
白俄羅斯 ㄅㄞˊ ㄜˋ ㄌㄨㄛˊ ㄙ
蘇俄 ㄙㄨ ㄜˋ

We have some ㄜˋ phrases that don't come with ㄜˊ. Let me know if you want me to further update the PR.
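To make that gap easier to check, here is a minimal sketch of how the two result sets could be compared. It is not the actual `find.py`, whose source is not shown in this thread. The line format (phrase followed by space-separated syllables) is assumed from the output above, and the `parse` and `find` helpers are hypothetical.

```python
def parse(lines):
    """Parse 'phrase syl1 syl2 ...' lines into (phrase, [syllables]) pairs."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entries.append((parts[0], parts[1:]))
    return entries


def find(entries, char, reading):
    """Return the phrases in which `char` is paired with `reading` at the same position."""
    hits = set()
    for phrase, syllables in entries:
        if len(phrase) != len(syllables):
            continue  # characters and syllables don't align; skip
        if any(c == char and s == reading for c, s in zip(phrase, syllables)):
            hits.add(phrase)
    return hits


# A small excerpt in the assumed format, taken from the listings above.
sample = """\
俄國 ㄜˊ ㄍㄨㄛˊ
俄國 ㄜˋ ㄍㄨㄛˊ
俄頃 ㄜˊ ㄑㄧㄥˇ
蘇俄 ㄙㄨ ㄜˋ
白俄羅斯 ㄅㄞˊ ㄜˋ ㄌㄨㄛˊ ㄙ
""".splitlines()

entries = parse(sample)
second_tone = find(entries, "俄", "ㄜˊ")
fourth_tone = find(entries, "俄", "ㄜˋ")

# Phrases that have one reading but not the other.
fourth_only = fourth_tone - second_tone
second_only = second_tone - fourth_tone
print(sorted(fourth_only))
print(sorted(second_only))
```

On this excerpt, `fourth_only` contains 蘇俄 and 白俄羅斯, while `second_only` contains 俄頃, which matches the listings above.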