rime / librime

Rime Input Method Engine, the core library
https://rime.im
BSD 3-Clause "New" or "Revised" License

'b' can be translated by script_translator to any output from table_translator #610

Closed ksqsf closed 1 year ago

ksqsf commented 1 year ago

Describe the bug

"b" can be translated by script_translator to any output from table_translator if:

  1. a schema uses a script_translator and a table_translator simultaneously, with the script_translator being the main translator;
  2. the table_translator uses a dictionary that shares the same namespace as the dictionary of the script_translator.
There is a minimal schema that can be used to reproduce this issue.

```yaml
# foo.schema.yaml
schema:
  schema_id: foo
  dependencies:
    - foo.fixed

switches:
  - name: ascii_mode
    reset: 0
    states: [ 中文, 西文 ]
  - name: full_shape
    states: [ 半角, 全角 ]
  - name: simplification
    reset: 1
    states: [ 漢字, 汉字 ]
  - name: ascii_punct
    states: [ 。,, ., ]

engine:
  processors:
    - ascii_composer
    - recognizer
    - key_binder
    - speller
    - punctuator
    - selector
    - navigator
    - express_editor
  segmentors:
    - ascii_segmentor
    - matcher
    - abc_segmentor
    - punct_segmentor
    - fallback_segmentor
  translators:
    - punct_translator
    - reverse_lookup_translator
    - table_translator@fixed
    - script_translator
  filters:
    - simplifier
    - uniquifier

speller:
  alphabet: abcdefghijklmnopqrstuvwxyz
  delimiter: " '"

translator:
  dictionary: foo.main
  prism: foo
  initial_quality: 0

fixed:
  dictionary: foo.fixed
  initial_quality: 5
  enable_user_dict: false
  enable_completion: false
  enable_sentence: false
  enable_encoder: false
  encode_commit_history: false

punctuator:
  import_preset: symbols

key_binder:
  import_preset: default

recognizer:
  import_preset: default
```

```yaml
# foo.main.dict.yaml
---
name: foo.main
version: "1"
sort: by_weight
use_preset_vocabulary: true
...

不	bu
八	ba
表	biao
本	ben
帮	bang
```

```yaml
# foo.fixed.schema.yaml
schema:
  schema_id: foo.fixed

translator:
  dictionary: foo.fixed
  enable_user_dict: false
```

```yaml
# foo.fixed.dict.yaml
---
name: foo.fixed
version: "1"
sort: original
columns:
  - code
  - text
...

wsm	为什么
sm	什么
```

Initially, "b" only finds

image

However, after typing "wsm" and committing "为什么", now "b" finds

image

This is totally unexpected because b → 为什么 is in neither foo.main nor foo.fixed. Now type "sm" and commit "什么" twice, and then "b" finds:

image

This indicates that "b" finds the history of the outputs of the table translator. Alternatively, whenever the table translator commits, the frequency information is incorrectly updated for "b".

To Reproduce

Steps to reproduce the bug:

  1. Install the schema above
  2. Type "b" and commit nothing to confirm the outputs of the script translator.
  3. Type "wsm" and commit. Press "b" to see the results.
  4. Type "sm" and commit twice. Press "b" to see the results.

Expected behavior

"b" should not find any outputs of the table translator.

I can work around this issue by renaming foo.fixed → foo_fixed. I don't know why but it works.
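
The workaround can be sketched as the following change to foo.schema.yaml. This is only an illustration: the report does not say exactly which names were changed, so it assumes the rename is applied consistently to the dictionary name, the dependent schema ID, and the corresponding file names. The idea that the shared "foo." prefix is what places both dictionaries in one namespace is likewise an inference from the symptoms, not something stated in the report.

```yaml
# foo.schema.yaml (only the changed keys are shown; everything else stays as in the report)
schema:
  schema_id: foo
  dependencies:
    - foo_fixed            # was: foo.fixed (assumes the helper schema is renamed too)

fixed:
  dictionary: foo_fixed    # was: foo.fixed
  # Assumption: without the "foo." prefix, this dictionary no longer
  # shares a namespace with foo.main, the script_translator's dictionary.
  # The remaining keys under "fixed" are unchanged.
```

Under the same assumption, foo.fixed.schema.yaml and foo.fixed.dict.yaml would become foo_fixed.schema.yaml and foo_fixed.dict.yaml, with `schema_id`, `dictionary`, and `name` set to `foo_fixed`.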

Log

n.a.

Screenshots

See above.

Additional context

Only "b" has this issue. Any other letter works fine.

mokapsing commented 1 year ago

Could you please provide the schema file?

ksqsf commented 1 year ago

@mokapsing It is provided in the issue report. See "There is a minimal schema that can be used to reproduce this issue."

mokapsing commented 1 year ago

Sorry, I did not notice.

lotem commented 1 year ago

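There is no such bug in the program. A problem with a custom configuration should not be treated as a bug in this software. You have to debug your configuration yourself until the error is eliminated. What keeps this software useful is spending its resources on problems that most users need solved, not on investigating special cases.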

ksqsf commented 1 year ago

I have already eliminated the error: renaming foo.fixed to foo_fixed fixes it. (This is mentioned in the "Expected behavior" section.)

Does "irreproducible" mean that the issue cannot be reproduced on other platforms?

lotem commented 1 year ago

Not only can it not be reproduced; deployment itself fails.

ksqsf commented 1 year ago

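Sorry, when filing the report I mistakenly deleted the version option from foo.fixed.dict.yaml. With that restored, deployment succeeds, and I can still reproduce the problem here, even when ~/Library/Rime contains only these files.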

```yaml
# foo.fixed.dict.yaml
---
name: foo.fixed
version: "1"     # ← add
sort: original
columns:
  - code
  - text
...

wsm	为什么
sm	什么
```

ksqsf commented 1 year ago

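I can also reproduce the problem on Windows with Weasel (小狼毫) 0.14.3. After manually updating librime to 1.8.5 (having cleaned up all build artifacts and restarted), the problem is still there: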

image

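So I can confirm that the problem (using the schema provided in this issue) is reproducible on both platforms, and the Windows machine is a clean, freshly installed virtual machine. I hope this information helps @lotem.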

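However, until now I did not know what happens to foo.xx and foo.yy when the dictionaries are built. I went through the documentation and found nothing on how dictionary names and schema names containing . are handled. When writing my schema I consulted some other schemas that use schema names containing ., so I assumed this was supported. If it is not supported (so that any error is expected behavior), I suggest stating that in the documentation to save future users time. Thanks!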

lotem commented 1 year ago

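It is supported. What you saw is exactly the effect. Other people want user-created phrases to circulate between dictionaries, so they name them this way. If you do not want that effect, do not name them this way.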

ksqsf commented 1 year ago

Thanks for the answer. It would have been nice to be told this from the start! I hope the documentation can be updated :)

lotem commented 1 year ago

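The configuration as first posted failed to compile outright. How was I to know what other problems lay behind it?

The documentation states it clearly:

A schema ID consists of lowercase letters, digits, and underscores. The dictionary name is used internally and follows the same naming rule as the schema ID; it may be the same as the ID of the accompanying input schema, or different.

If you write your configuration according to the documentation, you will not run into this problem.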