giellalt / lang-sme

Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Northern Sami language
https://giellalt.uit.no
GNU General Public License v3.0

MWE warnings where there are no MWEs #54

Closed lynnda-hill closed 1 year ago

lynnda-hill commented 2 years ago

When analyzing a sentence in tools/grammarcheckers I get strange warnings that weren't there before the summer:

divvun-suggest: WARNING: Broken MWE wordform in analyses: biire
"biire" N <NomGenSg> Err/Orth Sem/Org Sg Acc <W:0.0> "<biire>" <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 <doubleSpaceBefore> @OBJ> SUBSTITUT:divvun-suggest: WARNING: Broken MWE wordform in analyses: válga
divvun-suggest: WARNING: Broken MWE wordform in analyses: biire
divvun-suggest: WARNING: Broken MWE wordform in analyses: válga
divvun-suggest: WARNING: Broken MWE wordform in analyses: biire
divvun-suggest: WARNING: Broken MWE wordform in analyses: válga
divvun-suggest: WARNING: Broken MWE wordform in analyses: biire
divvun-suggest: WARNING: Broken MWE wordform in analyses: válga
divvun-suggest: WARNING: Broken MWE wordform in analyses: biire
divvun-suggest: WARNING: Broken MWE wordform in analyses: válga
divvun-suggest: WARNING: Broken MWE wordform in analyses: njuolggadusaid
divvun-suggest: WARNING: Broken MWE wordform in analyses: doaibma

lynnda-hill commented 1 year ago

Try:

echo "Dát miellahtut oaivvildeaba ahte boazodoallošiehtadus galgá láhččit saji eanet ovttasbargui \
boazodoalu ja eará meahccegeavaheddjiid gaskka." \
| tools/grammarcheckers/modes/trace-smegramrelease-dev.mode \
| less

flammie commented 1 year ago

I cannot reproduce this, but the warning should only be printed when there is a "<surface>" tag in the analysis section of the CG tags. It would appear on meahccegeavaheddjiid, but my version of the pipeline does not produce the "<surface>" tags, even though the word seems ambiguous:

echo "Dát miellahtut oaivvildeaba ahte boazodoallošiehtadus galgá láhččit saji eanet ovttasbargui boazodoalu ja eará meahccegeavaheddjiid gaskka." | tools/grammarcheckers/modes/trace-smegramrelease-dev.mode
"<Dát>"
    "dát" Pron Dem Pl Nom <W:0.0> <firstCohort> @>N SELECT:14448:r1626 MAP:22108:r20 #1->1
;   "dát" Pron Dem Sg Nom <W:0.0> <firstCohort> SELECT:14448:r1626
: 
...
...
"<meahccegeavaheddjiid>"
    "meahccegeavaheaddji" N NomAg Sem/Hum Pl Acc <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 #14->15
    "meahccegeavaheaddji" N NomAg Sem/Hum Pl Gen <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 @>P MAP:22719:r207 #14->15 SETPARENT:4113 SETPARENT:5589 SETPARENT:4113 SETPARENT:5589 SETPARENT:4113 SETPARENT:5589
    "meahccegeavaheaddji" N Sem/Hum Err/Orth NomAg Pl Acc <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 #14->15
    "meahccegeavaheaddji" N Sem/Hum Err/Orth NomAg Pl Gen <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 @>P MAP:22719:r207 #14->15
    "geavaheaddji" N NomAg Sem/Hum Pl Acc <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 #14->15
        "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0> #14->15
    "geavaheaddji" N NomAg Sem/Hum Pl Gen <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 @>P MAP:22719:r207 #14->15
        "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0> #14->15
    "geavaheaddji" N Sem/Hum Err/Orth NomAg Pl Acc <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 #14->15
        "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0> #14->15
    "geavaheaddji" N Sem/Hum Err/Orth NomAg Pl Gen <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 @>P MAP:22719:r207 #14->15
        "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0> #14->15
    "geavahit" Ex/V TV Gram/3syll Der/NomAg N Pl Acc <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 #14->15
        "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0> #14->15
    "geavahit" Ex/V TV Gram/3syll Der/NomAg N Pl Gen <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 @>P MAP:22719:r207 #14->15
        "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0> #14->15
;   "geavvat" Ex/V Ex/IV Der/h Ex/V TV Der/NomAg N Pl Acc <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 REMOVE:3984:r3136
;       "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0>
;   "geavvat" Ex/V Ex/IV Der/h Ex/V TV Der/NomAg N Pl Gen <W:0.0> <cohort-with-dynamic-compound> <cohort-with-dynamic-compound> ADD:2156 ADD:2156 REMOVE:3984:r3136
;       "meahcci" N Sem/Plc Cmp/SgNom Cmp <W:0.0>
...
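
To check whether a given pipeline version emits such tags at all, the trace output can be scanned for reading lines that carry a quoted "<...>" wordform tag, as in the "biire" reading in the original report. The following is only a rough sketch of that check, not how divvun-suggest itself decides to warn. The function name `find_surface_tags` and the sample lines are made up for illustration, and it assumes cohort lines start with `"<` while reading lines start with a quoted lemma, optionally preceded by `;` in trace mode.

```python
import re

# A quoted wordform tag such as "<biire>". Unquoted tags like <W:0.0> do not match.
WORDFORM_TAG = re.compile(r'"<[^"]*>"')


def find_surface_tags(lines):
    """Yield (line_number, cohort, tag) for each reading line that contains
    a quoted "<...>" tag. Hypothetical helper, for illustration only."""
    cohort = None
    for number, line in enumerate(lines, start=1):
        if line.startswith('"<'):
            # Cohort line: remember the surface form we are currently under.
            cohort = line.strip()
            continue
        # Readings removed in trace mode are prefixed with ';'.
        reading = line.lstrip(";").strip()
        if not reading.startswith('"'):
            # Not a reading (whitespace line, elision, stderr warning, ...).
            continue
        for tag in WORDFORM_TAG.findall(reading):
            yield number, cohort, tag


# Hand-written sample, abbreviated from the shape of the output in this thread.
sample = [
    '"<meahccegeavaheddjiid>"',
    '    "meahccegeavaheaddji" N NomAg Sem/Hum Pl Acc <W:0.0>',
    '"<biire>"',
    '    "biire" N Sg Acc <W:0.0> "<biire>" @OBJ>',
]

for number, cohort, tag in find_surface_tags(sample):
    print(number, cohort, tag)
```

Running the same function over the saved stdout of trace-smegramrelease-dev.mode should show whether any reading in a local build carries a "<...>" tag. If none does, the warning would not be expected there under the condition described above.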