apertium / apertium-recursive

Recursive structural transfer module for Apertium
https://wiki.apertium.org/wiki/Apertium-recursive
GNU General Public License v3.0

segfault in sme-deu #98

Open flammie opened 9 months ago

flammie commented 9 months ago

I just started testing sme-deu on larger corpus material and ran into a segfault, which is silently hidden when using the apertium command. That is very bad for debugging, because it looks like the run finished successfully.

$  echo Dat ášši lea okta deháleamos diggeáššiin mii guoská boazodoallorivttiide guovlluin gos ollu meahcceeatnamiid eaiggáduššet priváhtaolbmot. Dakkár guovlluin šaddá ge veadjemeahttun doaimmahit boazodoalu jus Norggas šaddá dábálaš riekteáddejupmin ahte bohccuid ii leat lohpi guođohit eatnamiin maid eaiggádit eará olbmot. Danne lea ge máilmmi dehálaš ahte Iinnasuolu-duopmu isko alitrivttiin, go muđui šaddet eambbo dakkár áššit boazodoalu vuostá jus dát duopmu šaddá dábálaš riektedáhpin Norggas. ¶ | hfst-proc -w -e '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.automorf.hfst'  | cg-proc -w '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.mor.rlx.bin' | cg-proc -w '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.syn.rlx.bin' |  cg-proc -n -1 -w '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.dep.rlx.bin'  | sed -e 's/<#[^>]*>//g'  | apertium-pretransfer | lsx-proc '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.autoseq.bin'  | lt-proc -b '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.autobil.bin' | cg-proc '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.lex.bin' | lrx-proc -m '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.autolex.bin' | apertium-anaphora '/home/flammie/github/apertium/apertium-sme-deu/apertium-sme-deu.sme-deu.arx' | rtx-proc '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.rtx.bin' 
Segmentation fault (core dumped)

$ echo Dat ášši lea okta deháleamos diggeáššiin mii guoská boazodoallorivttiide guovlluin gos ollu meahcceeatnamiid eaiggáduššet priváhtaolbmot. Dakkár guovlluin šaddá ge veadjemeahttun doaimmahit boazodoalu jus Norggas šaddá dábálaš riekteáddejupmin ahte bohccuid ii leat lohpi guođohit eatnamiin maid eaiggádit eará olbmot. Danne lea ge máilmmi dehálaš ahte Iinnasuolu-duopmu isko alitrivttiin, go muđui šaddet eambbo dakkár áššit boazodoalu vuostá jus dát duopmu šaddá dábálaš riektedáhpin Norggas. ¶ | apertium -d . sme-deu
$

There is no output at all, and the return code of the apertium command is 0 (EXIT_SUCCESS)!
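For context, a shell pipeline normally reports only the exit status of its last command, which is one way a crash in an earlier stage can go unnoticed. The sketch below is not taken from the apertium wrapper script; it only illustrates the general bash behaviour. `crashing_stage` is a hypothetical stand-in for a stage that dies with SIGSEGV, and `cat` stands in for whatever runs after it.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a pipeline stage that segfaults (not rtx-proc itself).
crashing_stage() { sh -c 'kill -s SEGV $$'; }

# Default behaviour: the pipeline status is the status of the last command (cat).
echo input | crashing_stage | cat
statuses=("${PIPESTATUS[@]}")   # per-stage statuses; must be read immediately

# With pipefail, the pipeline status is that of the rightmost failing command.
set -o pipefail
pipefail_status=0
echo input | crashing_stage | cat || pipefail_status=$?
set +o pipefail

echo "last stage: ${statuses[2]}, crashing stage: ${statuses[1]}, with pipefail: $pipefail_status"
```

On Linux the crashing stage typically reports 139 (128 + signal 11), while the last stage reports 0. Checking `PIPESTATUS` or enabling `pipefail` in a wrapper would be one way to surface the failure instead of returning success.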

Here's the output of the previous step:

echo Dat ášši lea okta deháleamos diggeáššiin mii guoská boazodoallorivttiide guovlluin gos ollu meahcceeatnamiid eaiggáduššet priváhtaolbmot. Dakkár guovlluin šaddá ge veadjemeahttun doaimmahit boazodoalu jus Norggas šaddá dábálaš riekteáddejupmin ahte bohccuid ii leat lohpi guođohit eatnamiin maid eaiggádit eará olbmot. Danne lea ge máilmmi dehálaš ahte Iinnasuolu-duopmu isko alitrivttiin, go muđui šaddet eambbo dakkár áššit boazodoalu vuostá jus dát duopmu šaddá dábálaš riektedáhpin Norggas. ¶ | hfst-proc -w -e '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.automorf.hfst'  | cg-proc -w '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.mor.rlx.bin' | cg-proc -w '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.syn.rlx.bin' |  cg-proc -n -1 -w '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.dep.rlx.bin'  | sed -e 's/<#[^>]*>//g'  | apertium-pretransfer | lsx-proc '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.autoseq.bin'  | lt-proc -b '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.autobil.bin' | cg-proc '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.lex.bin' | lrx-proc -m '/home/flammie/github/apertium/apertium-sme-deu/sme-deu.autolex.bin' | apertium-anaphora '/home/flammie/github/apertium/apertium-sme-deu/apertium-sme-deu.sme-deu.arx'
^Dat<prn><dem><sg><nom><@→N>/Der<prn><dem><dem><sg><nom><@→N>/Er<prn><pers><p3><dem><sg><nom><@→N>/Es<prn><pers><p3><nt><dem><sg><nom><@→N>/$ ^ášši<n><sem_semcon><sg><nom><@SUBJ→>/Sache<n><f><sem_semcon><sg><nom><@SUBJ→>/$ ^leat<vblex><iv><indic><pres><p3><sg><@mv>/sein<vbser><iv><indic><pres><p3><sg><@mv>/haben<vbhaver><iv><indic><pres><p3><sg><@mv>/$ ^okta<num><sg><nom><@←SPRED>/eins<num><sg><nom><@←SPRED>/$ ^dehálaš<adj><der_superl><adj><attr><@→N>/wichtig<adj><sup><attr><@→N>/$ ^diggi<n><cmp_sgnom><cmp>/Parlament<n><nt><cmp_sgnom><cmp>/$ ^ášši<n><sem_semcon><sg><com><@←ADVL>/Sache<n><f><sem_semcon><sg><com><@←ADVL>/$ ^mii<prn><rel><sg><nom><@SUBJ→>/der<prn><rel><sg><nom><@SUBJ→>/welcher<prn><rel><sp><nom><sg><nom><@SUBJ→>/$ ^guoskat<vblex><iv><indic><pres><p3><sg><@mv>/berühren<vblex><iv><indic><pres><p3><sg><@mv>/$ ^boazodoallu<n><sem_domain><cmp_sgnom><cmp>/Rentierhaltung<n><f><sem_domain><cmp_sgnom><cmp>/$ ^rikti<n><sem_plc><pl><ill><@←ADVL>/Fuchs<n><m><sem_plc><pl><ill><@←ADVL>/$ ^guovlu<n><sem_plc><pl><loc><@←ADVL-ine>/Gebiet<n><nt><sem_plc><pl><loc><@←ADVL-ine>/Bereich<n><m><sem_plc><pl><loc><@←ADVL-ine>/$ ^gos<adv><@ADVLcs-ine→>/wo<adv><@ADVLcs-ine→>/$ ^ollu<adj><attr><@→N>/viel<adj><attr><@→N>/$ ^meahcci<n><sem_plc><cmp_sgnom><cmp>/Wald<n><m><sem_plc><cmp_sgnom><cmp>/$ ^eana<n><sem_plc><pl><acc><@OBJ→>/Erde<n><m><sem_plc><pl><acc><@OBJ→>/$ ^eaiggáduššat<vblex><tv><indic><pres><p3><pl><@mv>/besitzen<vblex><tv><indic><pres><p3><pl><@mv>/$ ^priváhta<n><sem_dummytag><cmp_sgnom><cmp>/Privat<n><m><sem_dummytag><cmp_sgnom><cmp>/$ ^olmmoš<n><sem_hum><pl><nom><@←SUBJ>/Mensch<n><m><sem_hum><pl><nom><@←SUBJ>/$^.<sent>/.<sent>/$ ^Dakkár<prn><dem><attr><@→N>/Solche<prn><dem><attr><@→N>/$ ^guovlu<n><sem_plc><pl><loc><@ADVL-ine→>/Gebiet<n><nt><sem_plc><pl><loc><@ADVL-ine→>/Bereich<n><m><sem_plc><pl><loc><@ADVL-ine→>/$ ^šaddat<vblex><iv><indic><pres><p3><sg><@mv>/werden<vblex><iv><indic><pres><p3><sg><@mv>/wachsen<vblex><iv><indic><pres><p3><sg><@mv>/$ 
^ge<pcle><@PCLE>/<@PCLE>/$ ^veadjemeahttun<adj><sg><nom><@←SPRED>/unmöglich<adj><sg><nom><@←SPRED>/$ ^doaimmahit<vblex><tv><inf><@←SUBJ>/herausfordern<vblex><tv><inf><@←SUBJ>/$ ^boazodoallu<n><sem_domain><sg><acc><@-F←OBJ>/Rentierhaltung<n><f><sem_domain><sg><acc><@-F←OBJ>/$ ^jus<cnjsub><@CVP>/wenn<cnjsub><@CVP>/ob<cnjsub><@CVP>/$ ^Norga<np><top><sg><loc><@ADVL-ine→>/Norwegen<np><top><sg><loc><@ADVL-ine→>/$ ^šaddat<vblex><iv><indic><pres><p3><sg><@mv>/werden<vblex><iv><indic><pres><p3><sg><@mv>/wachsen<vblex><iv><indic><pres><p3><sg><@mv>/$ ^dábálaš<adj><sg><nom><@←SUBJ>/typisch<adj><sg><nom><@←SUBJ>/$ ^riekti<n><sem_org_rule><cmp_sgnom><cmp>/Recht<n><nt><sem_org_rule><cmp_sgnom><cmp>/$ ^áddejupmi<n><sem_prod-cogn><ess><@←SPRED>/Verständnis<n><m><sem_prod-cogn><ess><@←SPRED>/Verständnis<n><nt><sem_prod-cogn><ess><@←SPRED>/$ ^ahte<cnjsub><@CVP>/dass<cnjsub><@CVP>/$ ^boazu<n><sem_ani><pl><acc><@OBJ→>/Rentier<n><nt><sem_ani><pl><acc><@OBJ→>/$ ^ii<vblex><iv><neg><indic><p3><sg><@aux>/nein<adv><iv><neg><indic><p3><sg><@aux>/nicht<adv><iv><neg><indic><p3><sg><@aux>/$ ^leat<vblex><iv><indic><pres><conneg><@mv>/sein<vbser><iv><indic><pres><conneg><@mv>/haben<vbhaver><iv><indic><pres><conneg><@mv>/$ ^lohpi<n><sem_time><sg><nom><@←SPRED>/Erlaubnis<n><f><sem_time><sg><nom><@←SPRED>/$ ^guođohit<vblex><tv><inf><@N←>/behüten<vblex><tv><inf><@N←>/$ ^eana<n><sem_plc><pl><loc><@-F←ADVL-ine>/Erde<n><m><sem_plc><pl><loc><@-F←ADVL-ine>/$ ^mii<prn><rel><pl><acc><@OBJ→>/der<prn><rel><pl><acc><@OBJ→>/welcher<prn><rel><sp><nom><pl><acc><@OBJ→>/$ ^eaiggádit<vblex><tv><indic><pres><p3><pl><@mv>/haben<vblex><tv><indic><pres><p3><pl><@mv>/$ ^eará<prn><ind><attr><@→N>/andere<prn><ind><attr><@→N>/$ ^olmmoš<n><sem_hum><pl><nom><@←SUBJ>/Mensch<n><m><sem_hum><pl><nom><@←SUBJ>/$^.<sent>/.<sent>/$ ^Danne<adv><@ADVL-ine→>/Deshalb<adv><@ADVL-ine→>/$ ^leat<vblex><iv><indic><pres><p3><sg><@mv>/sein<vbser><iv><indic><pres><p3><sg><@mv>/haben<vbhaver><iv><indic><pres><p3><sg><@mv>/$ 
^ge<pcle><@PCLE>/<@PCLE>/$ ^máilbmi<n><sem_plc><sg><gen><@→A>/Welt<n><f><sem_plc><sg><gen><@→A>/$ ^dehálaš<adj><sg><nom><@←SPRED>/wichtig<adj><sg><nom><@←SPRED>/$ ^ahte<cnjsub><@CVP>/dass<cnjsub><@CVP>/$ ^*Iinnasuolu/*Iinnasuolu/$^-<guio>/-<guio>/$^duopmu<n><sem_prod><sg><nom><@SUBJ→>/Urteil<n><nt><sem_prod><sg><nom><@SUBJ→>/$ ^iskat<vblex><tv><der_passs><vblex><iv><indic><pres><p3><sg><@mv>/forschen<vblex><tv><der_passs><vblex><iv><indic><pres><p3><sg><@mv>/$ ^alit<adj><sem_dummytag><cmp_attr><cmp>/blau<adj><sem_dummytag><cmp_attr><cmp>/$ ^rikti<n><sem_plc><pl><loc><@←ADVL-ine>/Fuchs<n><m><sem_plc><pl><loc><@←ADVL-ine>/$^,<cm>/,<cm>/$ ^go<cnjsub><@CVP>/als<cnjsub><@CVP>/$ ^muđui<adv><@ADVL-ine→>/ansonsten<adv><@ADVL-ine→>/$ ^šaddat<vblex><iv><indic><pres><p3><pl><@mv>/werden<vblex><iv><indic><pres><p3><pl><@mv>/wachsen<vblex><iv><indic><pres><p3><pl><@mv>/$ ^eambbo<adv><@←ADVL-ine>/mehr<adv><@←ADVL-ine>/$ ^dakkár<prn><dem><attr><@→N>/solche<prn><dem><attr><@→N>/$ ^ášši<n><sem_semcon><pl><nom><@←SPRED>/Sache<n><f><sem_semcon><pl><nom><@←SPRED>/$ ^boazodoallu<n><sem_domain><sg><gen><@→P>/Rentierhaltung<n><f><sem_domain><sg><gen><@→P>/$ ^vuostá<post><@←ADVL-ine>/gegen<pr><@←ADVL-ine>/$ ^jus<cnjsub><@CVP>/wenn<cnjsub><@CVP>/ob<cnjsub><@CVP>/$ ^dát<prn><dem><sg><nom><@→N>/diese<prn><dem><dem><sg><nom><@→N>/$ ^duopmu<n><sem_prod><sg><nom><@SUBJ→>/Urteil<n><nt><sem_prod><sg><nom><@SUBJ→>/$ ^šaddat<vblex><iv><indic><pres><p3><sg><@mv>/werden<vblex><iv><indic><pres><p3><sg><@mv>/wachsen<vblex><iv><indic><pres><p3><sg><@mv>/$ ^dábálaš<adj><sg><nom><@←SUBJ>/typisch<adj><sg><nom><@←SUBJ>/$ ^riekti<n><sem_org_rule><cmp_sgnom><cmp>/Recht<n><nt><sem_org_rule><cmp_sgnom><cmp>/$ ^dáhpi<n><sem_rule><ess><@←SPRED>/Brauch<n><m><sem_rule><ess><@←SPRED>/$ ^Norga<np><top><sg><loc><@←ADVL-ine>/Norwegen<np><top><sg><loc><@←ADVL-ine>/$^.<sent>/.<sent>/$ ^¶<sent>/¶<sent>/$

And a valgrind run:

==1498281== Memcheck, a memory error detector
==1498281== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1498281== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==1498281== Command: rtx-proc /home/flammie/github/apertium/apertium-sme-deu/sme-deu.rtx.bin
==1498281== 
==1498281== Warning: client switching stacks?  SP change: 0x1ffeffdc20 --> 0x1ffed8c898
==1498281==          to suppress, use: --max-stackframe=2560904 or greater
==1498281== Invalid write of size 8
==1498281==    at 0x11A729: RTXProcessor::filterParseGraph() (in /usr/bin/rtx-proc)
==1498281==    by 0x11D7A4: RTXProcessor::processGLR(UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x11F453: RTXProcessor::process(_IO_FILE*, UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x10ED22: main (in /usr/bin/rtx-proc)
==1498281==  Address 0x1ffed8c898 is on thread 1's stack
==1498281==  in frame #0, created by RTXProcessor::filterParseGraph() (???:)
==1498281== 
==1498281== Invalid write of size 8
==1498281==    at 0x484D593: memset (vg_replace_strmem.c:1386)
==1498281==    by 0x11A72D: RTXProcessor::filterParseGraph() (in /usr/bin/rtx-proc)
==1498281==    by 0x11D7A4: RTXProcessor::processGLR(UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x11F453: RTXProcessor::process(_IO_FILE*, UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x10ED22: main (in /usr/bin/rtx-proc)
==1498281==  Address 0x1ffed8c8a0 is on thread 1's stack
==1498281==  in frame #1, created by RTXProcessor::filterParseGraph() (???:)
==1498281== 
==1498281== Invalid read of size 8
==1498281==    at 0x484D60D: memset (vg_replace_strmem.c:1386)
==1498281==    by 0x11A72D: RTXProcessor::filterParseGraph() (in /usr/bin/rtx-proc)
==1498281==    by 0x11D7A4: RTXProcessor::processGLR(UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x11F453: RTXProcessor::process(_IO_FILE*, UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x10ED22: main (in /usr/bin/rtx-proc)
==1498281==  Address 0x1ffed8c898 is on thread 1's stack
==1498281==  in frame #0, created by memset (vg_replace_strmem.c:1386)
==1498281== 
==1498281== Invalid read of size 4
==1498281==    at 0x119F74: RTXProcessor::filterParseGraph() (in /usr/bin/rtx-proc)
==1498281==    by 0x11D7A4: RTXProcessor::processGLR(UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x11F453: RTXProcessor::process(_IO_FILE*, UFILE*) (in /usr/bin/rtx-proc)
==1498281==    by 0x10ED22: main (in /usr/bin/rtx-proc)
==1498281==  Address 0x1ffed8c8a0 is on thread 1's stack
==1498281==  in frame #0, created by RTXProcessor::filterParseGraph() (???:)
==1498281== 
==1498281== Warning: client switching stacks?  SP change: 0x1ffed8c8a0 --> 0x1ffeffdce8
==1498281==          to suppress, use: --max-stackframe=2561096 or greater
==1498281== Warning: client switching stacks?  SP change: 0x1ffeffdc20 --> 0x1ffed8c898
==1498281==          to suppress, use: --max-stackframe=2560904 or greater
==1498281==          further instances of this message will not be shown.
==1498281== Invalid write of size 8
==1498281==    at 0x11A729: RTXProcessor::filterParseGraph() (in /usr/bin/rtx-proc)
==1498281==  Address 0x1ffd9e1e18 is on thread 1's stack
==1498281== 
==1498281== 
==1498281== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==1498281==  Access not within mapped region at address 0x1FFD9E1E18
==1498281==    at 0x11A729: RTXProcessor::filterParseGraph() (in /usr/bin/rtx-proc)
==1498281==  If you believe this happened as a result of a stack
==1498281==  overflow in your program's main thread (unlikely but
==1498281==  possible), you can try to increase the size of the
==1498281==  main thread stack using the --main-stacksize= flag.
==1498281==  The main thread stack size used in this run was 8388608.
==1498281== 
==1498281== HEAP SUMMARY:
==1498281==     in use at exit: 12,215,494,831 bytes in 32,130,654 blocks
==1498281==   total heap usage: 477,066,643 allocs, 444,935,989 frees, 135,790,349,643 bytes allocated
==1498281== 
==1498281== LEAK SUMMARY:
==1498281==    definitely lost: 0 bytes in 0 blocks
==1498281==    indirectly lost: 0 bytes in 0 blocks
==1498281==      possibly lost: 7,968 bytes in 15 blocks
==1498281==    still reachable: 12,215,486,863 bytes in 32,130,639 blocks
==1498281==                       of which reachable via heuristic:
==1498281==                         newarray           : 21,680 bytes in 1 blocks
==1498281==         suppressed: 0 bytes in 0 blocks
==1498281== Rerun with --leak-check=full to see details of leaked memory
==1498281== 
==1498281== For lists of detected and suppressed errors, rerun with: -s
==1498281== ERROR SUMMARY: 7707677 errors from 5 contexts (suppressed: 0 from 0)
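The addresses in the log can be turned into sizes to check whether a stack overflow is plausible. The sketch below only does the arithmetic. It assumes the stack pointer was still near 0x1ffeffdc20 when the final invalid write happened, which the log does not confirm because further "switching stacks" warnings were suppressed. `sp_drop` is a helper name made up for this example.

```bash
#!/usr/bin/env bash
# Distance in bytes between two stack addresses (the stack grows downward).
sp_drop() { echo $(( $1 - $2 )); }

# SP change reported in the first valgrind warning.
frame=$(sp_drop 0x1ffeffdc20 0x1ffed8c898)
# Assumed: same starting SP, measured down to the faulting address.
fault=$(sp_drop 0x1ffeffdc20 0x1ffd9e1e18)
# Main thread stack size reported by valgrind.
stack_limit=8388608

echo "frame=$frame fault_offset=$fault limit=$stack_limit"
if (( fault > stack_limit )); then
  echo "faulting write lies beyond the stack limit"
fi
```

The first value matches the 2560904 bytes valgrind suggests for `--max-stackframe`. The second is about 23 MB, well past the 8388608-byte stack, so under that assumption the crash looks like a very large stack allocation in `filterParseGraph()` rather than a stray pointer. If so, raising the limit (for example `ulimit -s unlimited` before running rtx-proc, or valgrind's `--main-stacksize=` as the log suggests) might help confirm it.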

I have not seen this kind of error before, so I have no idea what's up...