fcitx / libime

make test failure on s390x (may be related to libime or KenLM) #9

Closed karuboniru closed 4 years ago

karuboniru commented 4 years ago

The fcitx5-chinese-addons build fails on Fedora Koji because of a test error; the build log can be found here.

The same failure also seems to happen in the Debian build.

+ /usr/bin/ctest --output-on-failure --force-new-ctest-process -j3 --verbose --extra-verbose
UpdateCTestConfiguration  from :/builddir/build/BUILD/fcitx5-chinese-addons-ef9beb76e563cae1da7eb9cdea4ce0d916e2e700/s390x-redhat-linux-gnu/DartConfiguration.tcl
UpdateCTestConfiguration  from :/builddir/build/BUILD/fcitx5-chinese-addons-ef9beb76e563cae1da7eb9cdea4ce0d916e2e700/s390x-redhat-linux-gnu/DartConfiguration.tcl
Test project /builddir/build/BUILD/fcitx5-chinese-addons-ef9beb76e563cae1da7eb9cdea4ce0d916e2e700/s390x-redhat-linux-gnu
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
    Start 1: testpunctuation
1: Test command: /builddir/build/BUILD/fcitx5-chinese-addons-ef9beb76e563cae1da7eb9cdea4ce0d916e2e700/s390x-redhat-linux-gnu/test/testpunctuation
1: Test timeout computed to be: 10000000
test 2
    Start 2: testpinyinhelper
2: Test command: /builddir/build/BUILD/fcitx5-chinese-addons-ef9beb76e563cae1da7eb9cdea4ce0d916e2e700/s390x-redhat-linux-gnu/test/testpinyinhelper
2: Test timeout computed to be: 10000000
test 3
    Start 3: testpinyin
3: Test command: /builddir/build/BUILD/fcitx5-chinese-addons-ef9beb76e563cae1da7eb9cdea4ce0d916e2e700/s390x-redhat-linux-gnu/test/testpinyin
3: Test timeout computed to be: 10000000
1: I2020-08-25 13:46:33.055770 addonmanager.cpp:177] Loaded addon punctuation
1: I2020-08-25 13:46:33.055804 addonmanager.cpp:271] Unloading addon punctuation
1/3 Test fcitx/fcitx5-chinese-addons#1: testpunctuation ..................   Passed    0.00 sec
3: D2020-08-25 13:46:33.056702 i18n.cpp:32] Add gettext domain fcitx5 at /usr/share/locale
3: D2020-08-25 13:46:33.064639 instance.cpp:1479] Trigger Key: Control+space Zenkaku_Hankaku Hangul
3: I2020-08-25 13:46:33.064729 instance.cpp:1176] Override Enabled Addons: {}
3: I2020-08-25 13:46:33.064802 instance.cpp:1177] Override Disabled Addons: {}
3: D2020-08-25 13:46:33.065219 addonmanager.cpp:143] Call loadAddon() with testim checkDependencies() returns 0 Dep: [] OptDep: []
3: I2020-08-25 13:46:33.065480 addonmanager.cpp:177] Loaded addon testim
3: D2020-08-25 13:46:33.065558 addonmanager.cpp:143] Call loadAddon() with testfrontend checkDependencies() returns 0 Dep: [] OptDep: []
3: I2020-08-25 13:46:33.065799 addonmanager.cpp:177] Loaded addon testfrontend
3: D2020-08-25 13:46:33.065877 addonmanager.cpp:143] Call loadAddon() with testui checkDependencies() returns 0 Dep: [] OptDep: []
3: I2020-08-25 13:46:33.066102 addonmanager.cpp:177] Loaded addon testui
3: D2020-08-25 13:46:33.066216 addonmanager.cpp:143] Call loadAddon() with pinyinhelper checkDependencies() returns 0 Dep: [] OptDep: [quickphrase, clipboard]
3: I2020-08-25 13:46:33.308261 addonmanager.cpp:177] Loaded addon pinyinhelper
3: I2020-08-25 13:46:33.308572 inputmethodmanager.cpp:195] Found 3 input method(s) in addon testim
3: I2020-08-25 13:46:33.308659 inputmethodmanager.cpp:109] No valid input method group in configuration. Building a default one
3: I2020-08-25 13:46:33.308752 instance.cpp:563] Items in Default: [InputMethodGroupItem(keyboard-us,layout=)]
3: I2020-08-25 13:46:33.308843 instance.cpp:568] Generated groups: [Default]
3: D2020-08-25 13:46:33.309014 addonmanager.cpp:143] Call loadAddon() with pinyin checkDependencies() returns 2 Dep: [punctuation] OptDep: [fullwidth, quickphrase, cloudpinyin, notifications, spell, pinyinhelper, chttrans, imeapi]
3: D2020-08-25 13:46:33.309201 addonmanager.cpp:143] Call loadAddon() with punctuation checkDependencies() returns 0 Dep: [] OptDep: [notifications]
3: D2020-08-25 13:46:33.309586 i18n.cpp:32] Add gettext domain fcitx5-chinese-addons at /usr/share/locale
3: I2020-08-25 13:46:33.309724 addonmanager.cpp:177] Loaded addon punctuation
3: D2020-08-25 13:46:33.309789 addonmanager.cpp:143] Call loadAddon() with pinyin checkDependencies() returns 0 Dep: [punctuation] OptDep: [fullwidth, quickphrase, cloudpinyin, notifications, spell, pinyinhelper, chttrans, imeapi]
3: E2020-08-25 13:46:33.310716 addonloader.cpp:57] Failed to create addon: pinyin ../src/libime/core/kenlm/lm/binary_format.cc:112 in bool lm::ngram::IsBinaryFormat(int) threw FormatLoadException.
3: File looks like it should be loaded with mmap, but the test values don't match.  Try rebuilding the binary format LM using the same code revision, compiler, and architecture
3: F2020-08-25 13:46:33.312008 testpinyin.cpp:23] pinyin failed
2: I2020-08-25 13:46:33.330771 addonmanager.cpp:177] Loaded addon pinyinhelper
2: I2020-08-25 13:46:33.330945 testpinyinhelper.cpp:26] nǐ 
2: I2020-08-25 13:46:33.330977 testpinyinhelper.cpp:32] 冃 丨𠃍一一
2: I2020-08-25 13:46:33.330988 testpinyinhelper.cpp:32] 口 丨𠃍一
2: I2020-08-25 13:46:33.330998 testpinyinhelper.cpp:32] 𠮙 丨𠃍一𠃍
2: I2020-08-25 13:46:33.331024 addonmanager.cpp:271] Unloading addon pinyinhelper
2/3 Test fcitx/fcitx5-chinese-addons#2: testpinyinhelper .................   Passed    0.29 sec
3/3 Test fcitx/fcitx5-chinese-addons#3: testpinyin .......................Child aborted***Exception:   1.02 sec
D2020-08-25 13:46:33.056702 i18n.cpp:32] Add gettext domain fcitx5 at /usr/share/locale
D2020-08-25 13:46:33.064639 instance.cpp:1479] Trigger Key: Control+space Zenkaku_Hankaku Hangul
I2020-08-25 13:46:33.064729 instance.cpp:1176] Override Enabled Addons: {}
I2020-08-25 13:46:33.064802 instance.cpp:1177] Override Disabled Addons: {}
D2020-08-25 13:46:33.065219 addonmanager.cpp:143] Call loadAddon() with testim checkDependencies() returns 0 Dep: [] OptDep: []
I2020-08-25 13:46:33.065480 addonmanager.cpp:177] Loaded addon testim
D2020-08-25 13:46:33.065558 addonmanager.cpp:143] Call loadAddon() with testfrontend checkDependencies() returns 0 Dep: [] OptDep: []
I2020-08-25 13:46:33.065799 addonmanager.cpp:177] Loaded addon testfrontend
D2020-08-25 13:46:33.065877 addonmanager.cpp:143] Call loadAddon() with testui checkDependencies() returns 0 Dep: [] OptDep: []
I2020-08-25 13:46:33.066102 addonmanager.cpp:177] Loaded addon testui
D2020-08-25 13:46:33.066216 addonmanager.cpp:143] Call loadAddon() with pinyinhelper checkDependencies() returns 0 Dep: [] OptDep: [quickphrase, clipboard]
I2020-08-25 13:46:33.308261 addonmanager.cpp:177] Loaded addon pinyinhelper
I2020-08-25 13:46:33.308572 inputmethodmanager.cpp:195] Found 3 input method(s) in addon testim
I2020-08-25 13:46:33.308659 inputmethodmanager.cpp:109] No valid input method group in configuration. Building a default one
I2020-08-25 13:46:33.308752 instance.cpp:563] Items in Default: [InputMethodGroupItem(keyboard-us,layout=)]
I2020-08-25 13:46:33.308843 instance.cpp:568] Generated groups: [Default]
D2020-08-25 13:46:33.309014 addonmanager.cpp:143] Call loadAddon() with pinyin checkDependencies() returns 2 Dep: [punctuation] OptDep: [fullwidth, quickphrase, cloudpinyin, notifications, spell, pinyinhelper, chttrans, imeapi]
D2020-08-25 13:46:33.309201 addonmanager.cpp:143] Call loadAddon() with punctuation checkDependencies() returns 0 Dep: [] OptDep: [notifications]
D2020-08-25 13:46:33.309586 i18n.cpp:32] Add gettext domain fcitx5-chinese-addons at /usr/share/locale
I2020-08-25 13:46:33.309724 addonmanager.cpp:177] Loaded addon punctuation
D2020-08-25 13:46:33.309789 addonmanager.cpp:143] Call loadAddon() with pinyin checkDependencies() returns 0 Dep: [punctuation] OptDep: [fullwidth, quickphrase, cloudpinyin, notifications, spell, pinyinhelper, chttrans, imeapi]
E2020-08-25 13:46:33.310716 addonloader.cpp:57] Failed to create addon: pinyin ../src/libime/core/kenlm/lm/binary_format.cc:112 in bool lm::ngram::IsBinaryFormat(int) threw FormatLoadException.
File looks like it should be loaded with mmap, but the test values don't match.  Try rebuilding the binary format LM using the same code revision, compiler, and architecture
F2020-08-25 13:46:33.312008 testpinyin.cpp:23] pinyin failed
67% tests passed, 1 tests failed out of 3
Total Test time (real) =   1.02 sec
The following tests FAILED:
      3 - testpinyin (Child aborted)
Errors while running CTest
wengxt commented 4 years ago

So one of the issues here is that libime-data should not be noarch: it generates different binary data on big-endian and little-endian architectures. The KenLM data is loaded via mmap, so we can't simply share the same binary data between big-endian and little-endian platforms.

It is still to be confirmed whether other parts of the data (e.g. datrie) have the same endianness problem.
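
To illustrate why a byte-order mismatch trips a check like the "test values don't match" error in the log, here is a minimal Python sketch. The header layout (`LAYOUT`, `TEST_VALUES`) and the function names are hypothetical and are not KenLM's actual format; the only assumption is that known values are written at build time and compared at load time.

```python
import struct

# Hypothetical header: one float, one 32-bit index, one 64-bit integer.
# This is NOT KenLM's real layout, only the same kind of sanity check.
TEST_VALUES = (1.0, 1, 1)
LAYOUT = "fIQ"


def write_header(byte_order):
    """Serialize the test values as a builder with the given byte order would."""
    return struct.pack(byte_order + LAYOUT, *TEST_VALUES)


def header_matches(data, byte_order):
    """Decode the header with the reader's byte order and compare to the known values."""
    return struct.unpack(byte_order + LAYOUT, data) == TEST_VALUES


little = write_header("<")  # e.g. built on x86_64
big = write_header(">")     # e.g. built on s390x

print(header_matches(little, "<"))  # reader and writer agree
print(header_matches(little, ">"))  # big-endian reader, little-endian data
```

A file built on a little-endian host therefore fails the check on a big-endian reader, which is why a single noarch data package cannot serve both.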

karuboniru commented 4 years ago

Yes, those data files have different checksums. After merging libime-data into libime, the failure goes away!

Thanks for your help!
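
The checksum comparison mentioned above can be sketched roughly as follows. The helper names `sha256_of` and `same_content` are made up for illustration, and the thread does not say which checksum tool was actually used; SHA-256 is an assumption.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files are byte-for-byte identical (up to hash collision)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Running `same_content` on the same data file taken from two architecture-specific build roots shows whether the builds produced identical output.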

wengxt commented 4 years ago

Leave it open until all the endianness and architecture issues are fully investigated.

Right now we have the following observations:

  1. Different data is generated on different architectures, possibly related to loss of precision in floating-point arithmetic.
  2. Big-endian data may have further problems (invalid when loaded), as shown by the build log.
  3. Architecture-dependent data should be moved to /usr/lib instead of /usr/share.
karuboniru commented 4 years ago

Right now we have the following observations:

  1. Different data is generated on different architectures, possibly related to loss of precision in floating-point arithmetic.

I didn't notice that problem. Take /usr/share/libime/zh_CN.lm for example: builds on

generate exactly the same checksum, which differs from that of s390x. Those are all arches supported by Fedora Koji. It seems this is only endianness-related.

  2. Big-endian data may have further problems (invalid when loaded), as shown by the build log.

So, should data that is loaded by mmap be considered arch-specific (or endian-specific)? Using the big-endian data to build on s390x reported no problems (see build log).

It will be hard to debug arch-specific things, though, since it is not easy to get, for example, an s390x machine to check whether this really works.
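
Without access to a big-endian machine, the effect can still be approximated on any host by forcing an explicit byte order when decoding a memory-mapped file. The sketch below does not reproduce KenLM's loader; `read_u32_via_mmap` and `demo` are hypothetical helpers, and the single 32-bit value stands in for real model data.

```python
import mmap
import os
import struct
import tempfile


def read_u32_via_mmap(path, byte_order):
    """Map the file read-only and decode its first 4 bytes as an unsigned int."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return struct.unpack_from(byte_order + "I", view, 0)[0]


def demo():
    """Write the value 1 as a big-endian builder would, then read it both ways."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(struct.pack(">I", 1))
        as_big = read_u32_via_mmap(path, ">")     # matching byte order
        as_little = read_u32_via_mmap(path, "<")  # mismatched byte order
    finally:
        os.remove(path)
    return as_big, as_little


print(demo())
```

Because mapped bytes are used in place, there is no load step at which they could be converted, so the file has to be generated for the byte order of the machine that maps it.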

  3. Architecture-dependent data should be moved to /usr/lib instead of /usr/share.

Yes, this is trivial to fix, and it is good to comply with the FHS. 👍

wengxt commented 4 years ago

The fix for KenLM upstream is here: https://github.com/kpu/kenlm/pull/293

Other than that, the data seems to be OK now.

Work items remaining: move the LM data to /usr/lib; the datrie part can remain the same. The generated data might still differ between platforms because of floating-point calculation or hash differences, but the data itself remains valid and can be used on platforms of either endianness.
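
As a general illustration of how floating-point evaluation can change the bytes of generated data without making it invalid, here is a small sketch. It only shows the generic effect of evaluation order on IEEE 754 doubles and is not a claim about where libime's generator actually diverges; the function names are hypothetical.

```python
import math
import struct


def sum_left_first():
    """Add the first two terms first."""
    return (0.1 + 0.2) + 0.3


def sum_right_first():
    """Add the last two terms first."""
    return 0.1 + (0.2 + 0.3)


def raw_bytes(value):
    """Serialize a double the way it would land in a binary data file."""
    return struct.pack("<d", value)


a, b = sum_left_first(), sum_right_first()
print(raw_bytes(a) == raw_bytes(b))  # byte-level comparison
print(math.isclose(a, b))            # comparison within tolerance
```

Two such results yield files with different checksums even though either value is equally usable, so a checksum mismatch alone does not prove the data is broken.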

wengxt commented 4 years ago

fixed by af7337c