libpinyin / ibus-libpinyin

GNU General Public License v3.0

Unexpected first candidate #378

Status: Closed (gunnarhj closed this issue 1 year ago)

gunnarhj commented 2 years ago

Version: 1.13.0 Distro: Ubuntu 22.10

If I type beijing, the first candidate is just 'beijing', and '北京' is only the second candidate.

[Image: ibus-libpinyin_odd-first-candidate]

In 1.12.1 '北京' shows up as the first candidate.

ping-wu commented 2 years ago

Initially we did not observe this problem (Debian Sid):

Screenshot from 2022-08-28 16-45-01 Screenshot from 2022-08-28 16-44-26

But after a reboot, the same problem appeared.

ping-wu commented 2 years ago

Also, this seems to be hard-coded. Even after I repeatedly selected "北京", "beijing" always comes out first.

On the other hand, when entering "bj", "北京" came out as the first choice after two tries.

The same thing happens with shanghai (上海), but apparently not with other major cities, including 深圳 and 广州. Their Chinese names always come out as the first choice after entering their pinyin.

Similar to "beijing"/"bj", I can train "shh" to move "上海" to the first choice, but never "shanghai".

epico commented 2 years ago

Please turn off the English Candidate option in the User data page of the ibus-libpinyin setup dialog.

ping-wu commented 2 years ago

> Please turn off the English Candidate option in the User data page of the ibus-libpinyin setup dialog.

Actually, this is a pretty handy option; there is really no need to turn it off. I would also like to thank Gunnar for "discovering" this hidden treasure. Pretty neat. Please see the following screenshot: Screenshot from 2022-08-28 17-24-12

In fact, as a native Chinese user, I don't think I would ever input the whole pinyin "beijing" to type out "北京". Always "bj". But I would appreciate it if the first letter of Beijing and Shanghai could be capitalized.

Again, many thanks to Gunnar for discovering this hidden treasure.

gunnarhj commented 2 years ago

Are you saying "not a bug, it's a feature"? :)

Yes, I can train it to show 北京 first when typing bj and 上海 when typing shh. And I can get rid of English suggestions by unchecking "English Candidate".

I see now that "English Candidate" is an option in a new "Input Modes" group of options, and that it was introduced in commit 276c9439. So, if I understand it correctly, the changes give users better possibilities to control the behavior. This does indeed sound like an improvement.

Thanks for clarifying! Leaving it to you to decide when this issue should be closed. Possibly it may serve as a reminder for some kind of further improvement.

ping-wu commented 2 years ago

> Are you saying "not a bug, it's a feature"? :)

(Definitely a) Yes! Since this option can be turned off, it is definitely not a bug. And since it provides great convenience for mixed English/Chinese writing, it is definitely a feature. An improvement!

I'm sure your knowledge of Chinese is much better than my knowledge of Swedish (which is zero), but for most native Chinese speakers, when we use the pinyin class of input methods, we almost exclusively use combinations of consonants to output words (词) or even an entire sentence (this makes Chinese pinyin a very fast input method). A Chinese so-called "word" typically comprises a plurality of Chinese characters (字). I guess only elementary school students would spell out the entire pinyin, such as "beijing", to output "北京". But they are doing that mainly to learn pinyin (i.e., the correct pinyin of Chinese characters).

But there is definitely room for improvement, especially since this is only the first step in officially using an English dictionary for the ibus-libpinyin input engine.
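To make the abbreviation habit concrete, here is a minimal sketch of how "bj" and "shh" relate to the full pinyin. It is only an illustration: the function names are hypothetical, the syllable split is supplied by hand, and this is not how libpinyin itself parses or matches input.

```python
# Illustrative only: derive the consonant abbreviation of a word from its
# pinyin syllables. Names here are hypothetical, not part of libpinyin.

# Pinyin initials written with two letters; every other initial is one letter.
TWO_LETTER_INITIALS = ("zh", "ch", "sh")


def initial_of(syllable):
    """Return the initial of one non-empty pinyin syllable."""
    for ini in TWO_LETTER_INITIALS:
        if syllable.startswith(ini):
            return ini
    # Fall back to the first letter (also covers syllables without a consonant initial).
    return syllable[0]


def abbreviation(syllables):
    """Join the initials of all syllables, e.g. ["bei", "jing"] -> "bj"."""
    return "".join(initial_of(s) for s in syllables)


print(abbreviation(["bei", "jing"]))   # bj
print(abbreviation(["shang", "hai"]))  # shh
```

This matches the inputs mentioned in this thread: "bj" for 北京 and "shh" for 上海.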

ping-wu commented 2 years ago

> Version: 1.13.0 Distro: Ubuntu 22.10

Forgot to say: thanks, Gunnar, for adding ibus-libpinyin 1.13.0 to the Ubuntu/Debian repositories. Debian Sid is a rolling release, but Ubuntu 22.04 LTS users may need to upgrade to 22.10.

gunnarhj commented 2 years ago

> Also, this seems to be hard-coded. Even after I repeatedly selected "北京", "beijing" always comes out first. ... I can train "shh" to move "上海" to the first choice, but never "shanghai".

Is that because beijing and shanghai are present in data/wordlist?

epico commented 2 years ago

Thanks for the report!

I think we can remove "beijing", "shanghai" and other words from the data/wordlist file to fix this issue.

Or we can move the English candidates after the sentence candidates. The English candidate can appear in the 2-4 candidate position.
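The second option can be sketched roughly as follows. This is a hedged illustration of the ordering idea only, not the actual ibus-libpinyin code (which is C++): the function name `merge_candidates`, its parameters, and the cap of three English candidates are all assumptions chosen to match the "2-4 candidate position" described above.

```python
# Illustrative sketch: put the sentence candidate first, then a few English
# candidates, then everything else. All names here are hypothetical.

def merge_candidates(sentence, english, others, max_english=3):
    """Return the display order of candidates without duplicates.

    sentence    -- the best sentence candidate (empty string if none)
    english     -- English dictionary matches, best first
    others      -- remaining phrase/character candidates, best first
    max_english -- how many English candidates to show (positions 2-4 by default)
    """
    merged = []
    seen = set()
    groups = ([sentence] if sentence else [], english[:max_english], others)
    for group in groups:
        for cand in group:
            if cand not in seen:
                seen.add(cand)
                merged.append(cand)
    return merged


print(merge_candidates("北京", ["beijing"], ["背景", "北", "被"]))
```

With this ordering, typing beijing would list 北京 first and the English word "beijing" second, while the English suggestion stays available for mixed-language text.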

ping-wu commented 2 years ago

I would wait until more user input is collected. But at the present time, is there any way to edit the English dictionary?

Screenshot from 2022-08-30 07-17-27

ping-wu commented 2 years ago

I don't know what I did, but now when typing beijing, "北京" becomes the first candidate. Ditto for shanghai:

Screenshot from 2022-08-30 13-14-54

Debian Sid; ibus-libpinyin v. 1.13.0

ping-wu commented 2 years ago

> is there any way to edit the English dictionary?

One way to edit the English dictionary is via the "v" operator. In the following example, I have added Beijing (first letter capitalized) to the user dictionary: Screenshot from 2022-08-30 13-21-25

ping-wu commented 2 years ago

Further, some observations:

  1. "beijing" is not an appropriate word, unless its first letter is capitalized. Thus, this is a bug, and it should be removed from the dictionary. Ditto "shanghai".

  2. As shown in the attached screenshot, when typing the correct spelling "Beijing" (or simply the letter "B"), "Beijing" appears as the first candidate.

  3. Even for esoteric words (such as "Hjalmarson" :smiley: ), it's easy to add them to the ibus-libpinyin dictionary (only one training is required). (Houston is my home town; I use it so often that it's difficult to knock it down from the first candidate.)

  4. I can even add "LibreOffice" to the dictionary; it's been a pain in the 8th to type "LibreOffice" with a capital O in the middle.

Screenshot from 2022-09-01 06-52-59

epico commented 1 year ago

I moved the English candidates after the sentence candidates in ibus-libpinyin 1.15.0.