apertium / apertium

Core tools (driver script, transfer, tagger, formatters) for the FOSS RBMT system Apertium
https://apertium.org/
GNU General Public License v2.0

Perceptron tagger broken on latest Apertium (3.8.1) #165

Open marcriera opened 2 years ago

marcriera commented 2 years ago

The perceptron tagger does not run on the latest version of apertium-tagger. This is from apertium-eng:

$ echo "^house/house<n><sg>$" | apertium-tagger -gx eng.prob 
/usr/include/c++/11.2.0/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = std::pair<int, int>; _Alloc = std::allocator<std::pair<int, int> >; std::vector<_Tp, _Alloc>::reference = std::pair<int, int>&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed.
Aborted (core dumped)

This is on Arch Linux with everything up to date and lttoolbox and apertium built from the latest stable.

mr-martian commented 2 years ago

Probably related to #63

Naturally, it works fine on Linux Mint (lttoolbox and apertium both on most recent master commit) because no tagger bug is allowed to be simple.

TinoDidriksen commented 2 years ago

Looks related, but a crash is at least debuggable. The other issue just produced bad or no output, which is much harder to track down.

TinoDidriksen commented 2 years ago

However, works for me, on fully updated Arch Linux x86_64 in Docker with everything from git:

$ echo "^house/house<n><sg>$" | apertium-tagger -gx eng.prob
^house<n><sg>$

So, will need more information.

marcriera commented 2 years ago

I think I've found the issue.

I've rebuilt lttoolbox and apertium from source and the tagger now works as expected (matching your tests, @TinoDidriksen). However, if apertium is built via makepkg, which was my case because I used a PKGBUILD from the AUR a couple of weeks ago, it fails. Nothing goes wrong during compilation; the failure appears only at runtime, even though the build process is apparently the same.

It turns out makepkg sets -D_GLIBCXX_ASSERTIONS by default on Arch, and that causes the issue. Building apertium directly from source doesn't enable these extra assertions, so they are not checked and the tagger just works. When they are enabled, the issue happens.

TinoDidriksen commented 2 years ago

Seems that Arch and Fedora both build with _GLIBCXX_ASSERTIONS enabled by default, so we should definitely routinely test with that, or even enable it in our builds.

mr-martian commented 1 year ago

I have added -D_GLIBCXX_ASSERTIONS to my local build (and verified that I can produce crashes from assertions) but I'm still not able to reproduce this one.

marcriera commented 1 year ago

I have added -D_GLIBCXX_ASSERTIONS to my local build and I cannot reproduce it either.

However, I can reproduce the error if I manually modify the PKGBUILD from the AUR apertium package so it points to the most recent commit and then build it with makepkg. I have no idea why it happens: I have even tried editing /etc/makepkg.conf to remove the flag from the default config, yet it makes no difference. I have no local configs that could be overriding the flag.