divvun / libdivvun

lib for running gramcheck and other pipelines + cli; modules for CG→spelling, CG→feedback, tagging blanks
https://giellalt.github.io/proof/gramcheck/GrammarCheckerDocumentation.html
GNU General Public License v3.0

Difference between the yaml test and trace-smjgram-dev.mode #73

Open ilm024 opened 7 months ago

ilm024 commented 7 months ago

The YAML test fails even though everything appears to be correct. In other words, the yaml test and trace-smjgram-dev.mode give different results.

The YAML test fails:

Test 2/3: Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij.
----------
REL-lex-oahpasmuvvat-oahpastuvvat-FAIL.yaml
[2/3][PASS tp] oahpásmuvvi:oahpástuvvi (, ()) => oahpásmuvvi:[oahpástuvvi] (lex-oahpasmuvvat-oahpastuvvat)
REL-lex-oahpasmuvvat-oahpastuvvat-FAIL.yaml
[2/3][PASS tp] dán:dájna (, ()) => dán:[dájna] (msyn-plcom-sgcom-oahpastuvvat2)
REL-lex-oahpasmuvvat-oahpastuvvat-FAIL.yaml
[2/3][PASS tp] dán:dájna (, ()) => dán:[dájna] (msyn-pron-ill-com-oahpastuvvat)
REL-lex-oahpasmuvvat-oahpastuvvat-FAIL.yaml
[2/3][PASS tp] ulmutjij:ulmutjijn (, ()) => ulmutjij:[ulmutjijn] (msyn-pron-ill-com-oahpastuvvat2)
REL-lex-oahpasmuvvat-oahpastuvvat-FAIL.yaml
[2/3][FAIL fn1] ulmutjij:ulmutjijn (, ()) => ulmutjij:[] (msyn-plcom-sgcom-oahpastuvvat)
Test 2 - Passes: 4, Fails: 1, Total: 5

...even though echo "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij." | tools/grammarcheckers/modes/trace-smjgram-dev.mode | less -R shows that it works as it should:

"<Maŋenagi>"
        "maŋenagi" Adv <smj> <smj> <W:0.0> SUBSTITUTE:4310 SUBSTITUTE:4309
: 
"<gå>"
        "gå" CS <smj> <smj> <W:0.0> MAP:2054 SELECT:2141 SUBSTITUTE:4318 @CVP SUBSTITUTE:4317
;       "gå" CS <W:0.0> @CNP MAP:2054 SELECT:2141
: 
"<oahppe>"
        "oahppat" Ex/V TV Der/NomAg N <smj> <smj> Pl Nom <W:0.0> SELECT:2824 SUBSTITUTE:4309 SUBSTITUTE:4308
        "oahppe" N <smj> <smj> Sem/Hum Gram/NomAg Pl Nom <W:0.0> SELECT:2824 SUBSTITUTE:4309 SUBSTITUTE:4308
;       "oahppat" Ex/V TV Der/NomAg N Sg Gen <W:0.0> SELECT:2824
;       "oahppat" Ex/V TV Der/NomAg N Sg Nom <W:0.0> SELECT:2824
;       "oahppat" V <TH-FS> TV Imprt Du2 <W:0.0> SUBSTITUTE:1037 SELECT:2824
;       "oahppat" V <TH-FS> TV PrsPrc <W:0.0> SUBSTITUTE:1037 SELECT:2824
;       "oahppe" N Sem/Hum Gram/NomAg Sg Gen <W:0.0> SELECT:2824
;       "oahppe" N Sem/Hum Gram/NomAg Sg Nom <W:0.0> SELECT:2824
: 
"<oahpásmuvvi>"         oahpásmuvvi     →  oahpástuvvi
        "oahpásmuvvat" V <smj> <smj> <TH-Ill-*Ani> IV Ind Prs Pl3 <W:0.0> SUBSTITUTE:1025 MAP:3498 SELECT:4176 SUBSTITUTE:4311 @+FMAINV SUBSTITUTE:4310 &lex-oahpasmuvvat-oahpastuvvat ADD:2126:lex-oahpasmuvvat-oahpastuvvat COPY:2129:lex-oahpasmuvvat-oahpastuvvat
lex-oahpasmuvvat-oahpastuvvat
        "oahpástuvvat" V <smj> <smj> <TH-Ill-*Ani> IV Ind Prs Pl3 <W:0.0> SUBSTITUTE:1025 MAP:3498 SELECT:4176 SUBSTITUTE:4311 @+FMAINV SUBSTITUTE:4310 &SUGGEST ADD:2126:lex-oahpasmuvvat-oahpastuvvat COPY:2129:lex-oahpasmuvvat-oahpastuvvat
oahpástuvvat+V+IV+Ind+Prs+Pl3   oahpástuvvi
;       "oahpásmuvvat" V <TH-Ill-*Ani> IV Ind Prt Sg2 <W:0.0> SUBSTITUTE:1025 @+FMAINV MAP:3498 SELECT:4176
: 
"<dán>"         dán     →  dájna        dán     →  dájna
        "dát" Pron <smj> <smj> Dem Sg Ill Attr <W:0.0> SELECT:3810 SUBSTITUTE:4316 SUBSTITUTE:4315 &msyn-plcom-sgcom-oahpastuvvat2 ADD:2225:msyn-pron-ill-com-oahpastuvvat COPY:2227:msyn-pron-ill-com-oahpastuvvat ADD:2252:msyn-plcom-sgcom-oahpastuvvat2 COPY:2254:msyn-plcom-sgcom-oahpastuvvat2
msyn-plcom-sgcom-oahpastuvvat2
        "dát" Pron <smj> <smj> Dem Sg Ill Attr <W:0.0> SELECT:3810 SUBSTITUTE:4316 SUBSTITUTE:4315 &msyn-pron-ill-com-oahpastuvvat ADD:2225:msyn-pron-ill-com-oahpastuvvat COPY:2227:msyn-pron-ill-com-oahpastuvvat ADD:2252:msyn-plcom-sgcom-oahpastuvvat2
msyn-pron-ill-com-oahpastuvvat
        "dát" Pron <smj> <smj> Dem Sg <W:0.0> SELECT:3810 SUBSTITUTE:4316 SUBSTITUTE:4315 Com &SUGGEST ADD:2225:msyn-pron-ill-com-oahpastuvvat COPY:2227:msyn-pron-ill-com-oahpastuvvat ADD:2252:msyn-plcom-sgcom-oahpastuvvat2 COPY:2254:msyn-plcom-sgcom-oahpastuvvat2
dát+Pron+Dem+Sg+Com     dájna
        "dát" Pron <smj> <smj> Dem Sg <W:0.0> SELECT:3810 SUBSTITUTE:4316 SUBSTITUTE:4315 Com &SUGGEST ADD:2225:msyn-pron-ill-com-oahpastuvvat COPY:2227:msyn-pron-ill-com-oahpastuvvat
dát+Pron+Dem+Sg+Com     dájna
;       "dát" Pron Dem Sg Gen <W:0.0> SELECT:3810
;       "dát" Pron Dem Sg Gen Attr <W:0.0> SELECT:3810
;       "dát" Pron Dem Sg Ine Attr <W:0.0> SELECT:3810
: 
"<ulmutjij>"            ulmutjij        →  ulmutjijn    ulmutjij        →  ulmutjijn
        "ulmusj" N <smj> <smj> Sem/Hum Pl Com <W:0.0> SUBSTITUTE:4309 SUBSTITUTE:4308 &msyn-plcom-sgcom-oahpastuvvat ADD:2245:msyn-plcom-sgcom-oahpastuvvat COPY:2247:msyn-plcom-sgcom-oahpastuvvat
msyn-plcom-sgcom-oahpastuvvat
        "ulmusj" N <smj> <smj> Sem/Hum <W:0.0> SUBSTITUTE:4309 SUBSTITUTE:4308 Sg Com &SUGGEST ADD:2245:msyn-plcom-sgcom-oahpastuvvat COPY:2247:msyn-plcom-sgcom-oahpastuvvat
ulmusj+N+Sg+Com ulmutjijn,ulmutjijn,ulmutjijn
        "ulmusj" N <smj> <smj> Sem/Hum Sg Ill <W:0.0> SUBSTITUTE:4309 SUBSTITUTE:4308 &msyn-pron-ill-com-oahpastuvvat2 ADD:2236:msyn-pron-ill-com-oahpastuvvat2 COPY:2239:msyn-pron-ill-com-oahpastuvvat2
msyn-pron-ill-com-oahpastuvvat2
        "ulmusj" N <smj> <smj> Sem/Hum Sg <W:0.0> SUBSTITUTE:4309 SUBSTITUTE:4308 Com &msyn-pron-ill-com-oahpastuvvat2 &SUGGEST ADD:2236:msyn-pron-ill-com-oahpastuvvat2 COPY:2239:msyn-pron-ill-com-oahpastuvvat2
ulmusj+N+Sg+Com ulmutjijn,ulmutjijn,ulmutjijn
;       "ulmusj" N Sem/Hum Pl Gen <W:0.0> REMOVE:3659
"<.>"
        "." CLB <W:0.0> 
:\n
(END)
albbas commented 7 months ago

When I run the command echo "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij." | divvun-checker -a smj.zcheck -n smjgram | jq . I get:

{
  "errs": [
    [
      "oahpásmuvvi",
      19,
      30,
      "lex-oahpasmuvvat-oahpastuvvat",
      "\"oahpásmuvvat\" galggá liehket \"oahpástuvvat\" dán kontevstan",
      [
        "oahpástuvvi"
      ],
      "Boasto báhko \"oahpásmuvvat\""
    ],
    [
      "dán",
      31,
      34,
      "msyn-plcom-sgcom-oahpastuvvat2",
      "msyn-plcom-sgcom-oahpastuvvat2",
      [
        "dájna"
      ],
      "msyn-plcom-sgcom-oahpastuvvat2"
    ],
    [
      "dán",
      31,
      34,
      "msyn-pron-ill-com-oahpastuvvat",
      "msyn-pron-ill-com-oahpastuvvat",
      [
        "dájna"
      ],
      "msyn-pron-ill-com-oahpastuvvat"
    ],
    [
      "ulmutjij",
      35,
      43,
      "msyn-plcom-sgcom-oahpastuvvat",
      "msyn-plcom-sgcom-oahpastuvvat",
      [],
      "msyn-plcom-sgcom-oahpastuvvat"
    ],
    [
      "ulmutjij",
      35,
      43,
      "msyn-pron-ill-com-oahpastuvvat2",
      "msyn-pron-ill-com-oahpastuvvat2",
      [
        "ulmutjijn"
      ],
      "msyn-pron-ill-com-oahpastuvvat2"
    ]
  ],
  "text": "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij."
}

Here the rule msyn-plcom-sgcom-oahpastuvvat lacks the output (the suggestion) that trace-smjgram-dev.mode shows.
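To spot this kind of gap without reading the JSON by eye, a small script can list every error that comes back with an empty suggestion list. This is only a sketch: it assumes each entry in "errs" is laid out as [form, start, end, error tag, message, suggestions, title], which is what the output above looks like, and the helper name errors_without_suggestions is made up for the example.

```python
import json

# Shortened copy of the divvun-checker output quoted above.
# Assumed layout of each entry in "errs":
# [form, start, end, error tag, message, suggestions, title]
output = '''
{"errs": [
  ["ulmutjij", 35, 43, "msyn-plcom-sgcom-oahpastuvvat",
   "msyn-plcom-sgcom-oahpastuvvat", [], "msyn-plcom-sgcom-oahpastuvvat"],
  ["ulmutjij", 35, 43, "msyn-pron-ill-com-oahpastuvvat2",
   "msyn-pron-ill-com-oahpastuvvat2", ["ulmutjijn"],
   "msyn-pron-ill-com-oahpastuvvat2"]
 ],
 "text": "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij."}
'''


def errors_without_suggestions(checker_json):
    """Return (form, error tag) for every error whose suggestion list is empty."""
    data = json.loads(checker_json)
    return [(err[0], err[3]) for err in data["errs"] if not err[5]]


missing = errors_without_suggestions(output)
print(missing)
```

An empty suggestion list is not always a bug (some rules may only flag an error), so the result is a list of candidates to check rather than a list of failures.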

unhammer commented 7 months ago

You mean that the suggestion ulmutjijn is missing? In that case I can reproduce the difference too:

$ sed '$s/$/ -j/' modes/smjgram.mode >modes/smjgram-j.mode
$ sed '$s/$/ -j/' modes/trace-smjgram-dev.mode >modes/trace-smjgram-dev-j.mode
$ diff -u <(echo "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij." | sh modes/trace-smjgram-dev-j.mode 2>/dev/null|jq) <(echo "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij." | sh modes/smjgram-j.mode 2>/dev/null|jq)
--- /dev/fd/63  2024-04-29 16:45:54.487354895 +0200
+++ /dev/fd/62  2024-04-29 16:45:54.491353988 +0200
@@ -39,9 +39,7 @@
       43,
       "msyn-plcom-sgcom-oahpastuvvat",
       "msyn-plcom-sgcom-oahpastuvvat",
-      [
-        "ulmutjijn"
-      ],
+      [],
       "msyn-plcom-sgcom-oahpastuvvat"
     ],
     [
unhammer commented 7 months ago

In --trace mode, vislcg3 outputs a reading that we do not get without trace. This is because vislcg3 can merge readings that are identical apart from their MAPPING tags when it is run without --trace, whereas in trace mode they stay separate.
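One way to picture the merging is the Python sketch below. It is a simplified model, not vislcg3's implementation, and it rests on two assumptions: that the &-prefixed tags are the mapping tags in this grammar, and that the trace annotations (ADD:2245 and so on, taken from the trace output at the top of this issue) are what keep the readings distinct in trace mode. The readings of "ulmutjij" are reduced to the tags that matter.

```python
def is_mapping_tag(tag):
    # Assumption: in this grammar the &-prefixed tags are the mapping tags.
    return tag.startswith("&")


def merge_readings(readings):
    """Merge readings that differ only in their mapping tags.

    Each reading is a list of tag strings. Returns a list of
    (other_tags, mapping_tags) pairs in order of first appearance.
    """
    merged = {}
    for tags in readings:
        key = frozenset(t for t in tags if not is_mapping_tag(t))
        mapping = merged.setdefault(key, [])
        for t in tags:
            if is_mapping_tag(t) and t not in mapping:
                mapping.append(t)
    return list(merged.items())


PLCOM = "&msyn-plcom-sgcom-oahpastuvvat"
PRON2 = "&msyn-pron-ill-com-oahpastuvvat2"

# The four readings of "ulmutjij", simplified.
plain = [
    ["ulmusj", "N", "Pl", "Com", PLCOM],
    ["ulmusj", "N", "Sg", "Com", "&SUGGEST"],
    ["ulmusj", "N", "Sg", "Ill", PRON2],
    ["ulmusj", "N", "Sg", "Com", PRON2, "&SUGGEST"],
]
# The same readings with their trace annotations attached.
traced = [
    plain[0] + ["ADD:2245", "COPY:2247"],
    plain[1] + ["ADD:2245", "COPY:2247"],
    plain[2] + ["ADD:2236", "COPY:2239"],
    plain[3] + ["ADD:2236", "COPY:2239"],
]

without_trace = merge_readings(plain)
with_trace = merge_readings(traced)
print(len(without_trace), len(with_trace))
```

In this model the two "Sg Com" readings collapse into one when there are no trace annotations, so the bare &SUGGEST reading disappears; with the annotations attached, all four readings survive.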

Without trace, we only get &SUGGEST on the same reading as &msyn-pron-ill-com-oahpastuvvat2:

$ echo "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij." | sh modes/smjgram9-cg.mode|vislcg3  -g grammarchecker.bin |grep ulm
"<ulmutjij>"
        "ulmusj" N <smj> <smj> Sem/Hum Pl Com <W:0.0> &msyn-plcom-sgcom-oahpastuvvat
        "ulmusj" N <smj> <smj> Sem/Hum Sg Ill <W:0.0> &msyn-pron-ill-com-oahpastuvvat2
        "ulmusj" N <smj> <smj> Sem/Hum Sg <W:0.0> Com &msyn-pron-ill-com-oahpastuvvat2 &SUGGEST

divvun-suggest interprets this as meaning that the suggestion belongs only to &msyn-pron-ill-com-oahpastuvvat2 and not to &msyn-plcom-sgcom-oahpastuvvat, so only &msyn-pron-ill-com-oahpastuvvat2 gets the suggestion.

If we run with trace, on the other hand, we also get &SUGGEST on a reading of its own (without other &-tags), in addition to the reading where it appears together with &msyn-pron-ill-com-oahpastuvvat2:

$ echo "Maŋenagi gå oahppe oahpásmuvvi dán ulmutjij." | sh modes/smjgram9-cg.mode|vislcg3 --trace  -g grammarchecker.bin |grep ulm
"<ulmutjij>"
        "ulmusj" N <smj> <smj> Sem/Hum Pl Com <W:0.0> &msyn-plcom-sgcom-oahpastuvvat ADD:2297:msyn-plcom-sgcom-oahpastuvvat COPY:2303:msyn-plcom-sgcom-oahpastuvvat
        "ulmusj" N <smj> <smj> Sem/Hum <W:0.0> Sg Com &SUGGEST ADD:2297:msyn-plcom-sgcom-oahpastuvvat COPY:2303:msyn-plcom-sgcom-oahpastuvvat
        "ulmusj" N <smj> <smj> Sem/Hum Sg Ill <W:0.0> &msyn-pron-ill-com-oahpastuvvat2 ADD:2271:msyn-pron-ill-com-oahpastuvvat2 COPY:2276:msyn-pron-ill-com-oahpastuvvat2
        "ulmusj" N <smj> <smj> Sem/Hum Sg <W:0.0> Com &msyn-pron-ill-com-oahpastuvvat2 &SUGGEST ADD:2271:msyn-pron-ill-com-oahpastuvvat2 COPY:2276:msyn-pron-ill-com-oahpastuvvat2

This reading with &SUGGEST on its own then becomes available for suggestions with &msyn-plcom-sgcom-oahpastuvvat.
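The attribution described above can be modelled roughly as follows. This is a sketch of the behaviour as explained in this thread, not libdivvun's actual code; the function name suggestions_per_error and the (tags, forms) representation are invented for the example.

```python
def suggestions_per_error(readings):
    """Map each error tag on a word to the suggestions it may use.

    readings: list of (tags, forms) pairs, where forms holds the generated
    word forms of a &SUGGEST reading (empty for other readings).
    """
    error_tags = []
    for tags, _ in readings:
        for t in tags:
            if t.startswith("&") and t != "&SUGGEST" and t not in error_tags:
                error_tags.append(t)
    result = {t: [] for t in error_tags}
    for tags, forms in readings:
        if "&SUGGEST" not in tags:
            continue
        own = [t for t in tags if t in result]
        # A &SUGGEST reading that carries error tags belongs to those tags only;
        # a bare &SUGGEST reading is available to every error tag on the word.
        targets = own if own else error_tags
        for t in targets:
            for form in forms:
                if form not in result[t]:
                    result[t].append(form)
    return result


PLCOM = "&msyn-plcom-sgcom-oahpastuvvat"
PRON2 = "&msyn-pron-ill-com-oahpastuvvat2"

no_trace = [
    ([PLCOM], []),
    ([PRON2], []),
    ([PRON2, "&SUGGEST"], ["ulmutjijn"]),
]
trace = [
    ([PLCOM], []),
    (["&SUGGEST"], ["ulmutjijn"]),
    ([PRON2], []),
    ([PRON2, "&SUGGEST"], ["ulmutjijn"]),
]

print(suggestions_per_error(no_trace))
print(suggestions_per_error(trace))
```

With the non-trace readings, only &msyn-pron-ill-com-oahpastuvvat2 ends up with ulmutjijn; with the trace readings, both error tags do. That is the same difference as in the diff earlier in the thread.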


It looks like it is the rule

COPY:msyn-pron-ill-com-oahpastuvvat2 (Com &SUGGEST) EXCEPT (Ill &msyn-proncom-ill-oahpasmuvvat2) OR (Ill Attr &msyn-proncom-ill-oahpasmuvvat2) TARGET (&msyn-pron-ill-com-oahpastuvvat2) (NEGATE (*0 &lex-oahpastuvvat-oahpasmuvvat)) ;

that produces the reading with both &SUGGEST and &msyn-pron-ill-com-oahpastuvvat2.

I assume the intention was for that rule to have

EXCEPT &msyn-pron-ill-com-oahpastuvvat2

and not

EXCEPT &msyn-proncom-ill-oahpasmuvvat2

I am checking in that change, which seems to solve the problem for now.


These differences between trace and non-trace have come up before, so a more lasting solution is needed. vislcg3 has an option --split-mappings that splits every reading that has more than one &-tag, but then we lose the ability to say that "this &SUGGEST reading should be on the same reading as this error tag" (which is useful when we have two alternative suggestions that each belong to a different error tag).

Maybe we should simply remove the & character from &SUGGEST; I need to test what effect that has.

unhammer commented 7 months ago

I updated libdivvun to interpret the tag SUGGEST as &SUGGEST. If you use SUGGEST without &, vislcg3 will never merge such readings, which avoids this difference between trace and non-trace.

I tried running the tests in lang-smj with all &SUGGEST tags changed to SUGGEST. That gives more PASSes, but also more FAILs, so I am a bit unsure how to interpret it :-)

unhammer commented 6 months ago

The newest libdivvun is in nightly, so if you have updated, it should now be possible to change every &SUGGEST to plain SUGGEST; then trace and non-trace should always give the same result.
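For the bulk change itself, something like the following could work. It is a minimal sketch: the regular expression assumes that a tag which merely starts with &SUGGEST (followed by a letter, digit, underscore or hyphen) should be left alone, the sample rule is an abbreviated version of the COPY rule quoted earlier, and the script does not distinguish rules from comments, so the result should be reviewed before it is checked in.

```python
import re

# Matches the tag &SUGGEST, but not longer tags that merely start with it.
SUGGEST_TAG = re.compile(r"&SUGGEST(?![\w-])")


def drop_ampersand(grammar_text):
    """Rewrite every &SUGGEST tag in CG grammar text as plain SUGGEST."""
    return SUGGEST_TAG.sub("SUGGEST", grammar_text)


# Abbreviated version of the COPY rule discussed in this issue.
rule = (
    "COPY:msyn-pron-ill-com-oahpastuvvat2 (Com &SUGGEST) "
    "EXCEPT (Ill &msyn-pron-ill-com-oahpastuvvat2) "
    "TARGET (&msyn-pron-ill-com-oahpastuvvat2) ;"
)
print(drop_ampersand(rule))
```

Only the &SUGGEST tag is touched; the error tags keep their & prefix.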