divvun / libdivvun

lib for running gramcheck and other pipelines + cli; modules for CG→spelling, CG→feedback, tagging blanks
https://giellalt.github.io/proof/gramcheck/GrammarCheckerDocumentation.html
GNU General Public License v3.0

Bad underlining, suggestions for double-space err after misspelling #31

Closed: snomos closed this issue 8 months ago

snomos commented 5 years ago

[This bug could also be caused by the CG rules, but I am posting it here to begin with]

Given the command (there should be two spaces after the first word of the input sentence):

echo "Riggeroahkká  lea gitta suomuoras, ja vuohkui mii heaŋga vuolimuš sáhttá bidjat gáfegievnni dahje málesgievnni." | tools/grammarcheckers/modes/trace-smegramrelease.mode

one gets the following output:

"<Riggeroahkká>"
    "reahkká" N Sem/Ani Sg Nom <W:37.3018> <WA:17.3018> <spelled> "<diggereahkká>" PROTECT:3268 SELECT:3578:r868 &double-space-before ID:1 ADD:4001:double-space-before-link ADD:4033:spelled
        "diggi" N Sem/Org Cmp/SgNom Cmp ID:1
double-space-before
    "reahkká" N Sem/Ani Sg Nom <W:37.3018> <WA:17.3018> <spelled> "<diggereahkká>" PROTECT:3268 SELECT:3578:r868 &LINK &typo &SUGGESTWF ID:1 ADD:4001:double-space-before-link ADDRELATION($2):4002:double-space-before-rel ADDRELATION(LEFT):4003:double-space-before-rel ADD:4033:spelled
        "diggi" N Sem/Org Cmp/SgNom Cmp ID:1
typo
    "roahkki" N Sem/Dummytag Sg Nom <W:37.3018> <WA:17.3018> <spelled> "<diggeroahkki>" PROTECT:3268 SELECT:3578:r868 &LINK &double-space-before &typo &SUGGESTWF ID:1 ADD:4001:double-space-before-link ADD:4033:spelled
        "diggi" N Sem/Org Cmp/SgNom Cmp ID:1
double-space-before
typo
...
;   "Riggeroahkká" ? SELECT:3578:r868
:  
"<lea>"
    "leat" V <TH-Nom-Any> <mielde> <OR-Loc-HumGroup> <OR-eret-Plc> <dušše><TH-Inf> <árvvus> <LO-Loc-johtu><DE-Ill-Plc> <AT-Loc-Mat> <AT-Abe-Any> <AT-Nom-Any> <AT-Nom-Adj><EX-Ill-Ani> <PO-Loc-Hum> <PO-Gen-Hum> <MA-mielde-Any> <MA-Adv-Manner> <XT-Gen-Measr> <LO-maŋŋil-Time> <LO-Acc-Time> <LO-Loc-Time> <CO-Com-Ani> <ID-Nom-Any> <TH-Nom-Any><RO-Ess-Any><EX-Ill-Any> <EX-Ill-Ani><TH-Nom-Adj> <EX-Ill-Ani> <TH-Nom-Obj><RE-Ill-Ani> <LO-Loc-Any> <AktioEss> <BE-Ill-Ani><PU-Ess-Any> <RO-Ess-Any><PU-Ill-Act> <RO-Ess-Any> IV Ind Prs Sg3 <W:0.0> <doubleSpaceBefore> SUBSTITUTE:2764 SUBSTITUTE:2772 SUBSTITUTE:2979 SUBSTITUTE:3083 SUBSTITUTE:3121 SUBSTITUTE:3494 SUBSTITUTE:3593 SUBSTITUTE:3598 SUBSTITUTE:3605 SUBSTITUTE:3616 SUBSTITUTE:3704 SUBSTITUTE:3834 SUBSTITUTE:3845 SUBSTITUTE:3849 SUBSTITUTE:3999 SUBSTITUTE:4158 SUBSTITUTE:4169 SUBSTITUTE:4281 SUBSTITUTE:4286 SUBSTITUTE:4299 SUBSTITUTE:4306 SUBSTITUTE:4312 SUBSTITUTE:4539 SUBSTITUTE:4637 SUBSTITUTE:4698 SUBSTITUTE:4721 SUBSTITUTE:4727 SUBSTITUTE:4759 SUBSTITUTE:4875 @+FMAINV MAP:15822 &double-space-before ID:2 R:$2:1 R:LEFT:1 ADD:4000:double-space-before
double-space-before
    "leat" V <TH-Nom-Any> <mielde> <OR-Loc-HumGroup> <OR-eret-Plc> <dušše><TH-Inf> <árvvus> <LO-Loc-johtu><DE-Ill-Plc> <AT-Loc-Mat> <AT-Abe-Any> <AT-Nom-Any> <AT-Nom-Adj><EX-Ill-Ani> <PO-Loc-Hum> <PO-Gen-Hum> <MA-mielde-Any> <MA-Adv-Manner> <XT-Gen-Measr> <LO-maŋŋil-Time> <LO-Acc-Time> <LO-Loc-Time> <CO-Com-Ani> <ID-Nom-Any> <TH-Nom-Any><RO-Ess-Any><EX-Ill-Any> <EX-Ill-Ani><TH-Nom-Adj> <EX-Ill-Ani> <TH-Nom-Obj><RE-Ill-Ani> <LO-Loc-Any> <AktioEss> <BE-Ill-Ani><PU-Ess-Any> <RO-Ess-Any><PU-Ill-Act> <RO-Ess-Any> IV Ind Prs Sg3 <W:0.0> <doubleSpaceBefore> SUBSTITUTE:2764 SUBSTITUTE:2772 SUBSTITUTE:2979 SUBSTITUTE:3083 SUBSTITUTE:3121 SUBSTITUTE:3494 SUBSTITUTE:3593 SUBSTITUTE:3598 SUBSTITUTE:3605 SUBSTITUTE:3616 SUBSTITUTE:3704 SUBSTITUTE:3834 SUBSTITUTE:3845 SUBSTITUTE:3849 SUBSTITUTE:3999 SUBSTITUTE:4158 SUBSTITUTE:4169 SUBSTITUTE:4281 SUBSTITUTE:4286 SUBSTITUTE:4299 SUBSTITUTE:4306 SUBSTITUTE:4312 SUBSTITUTE:4539 SUBSTITUTE:4637 SUBSTITUTE:4698 SUBSTITUTE:4721 SUBSTITUTE:4727 SUBSTITUTE:4759 SUBSTITUTE:4875 @+FMAINV MAP:15822 "<diggereahkká lea>" &double-space-before &SUGGESTWF ID:2 R:$2:1 R:LEFT:1 ADD:4000:double-space-before COPY:4004:double-space-before
double-space-before
: 
"<gitta>"
    "gitta" Adv <W:0.0> @<ADVL MAP:22784

In LibreOffice it looks like the following:

[Screenshot 2019-08-14 at 09 29 07]

When applying the suggested correction, the result is the following:

[Screenshot 2019-08-14 at 09 39 11]

The relevant CG rule is this (in the file tools/grammarcheckers/grammarchecker-release.cg3):

COPY:double-space-before ("<$2 $1>"v &SUGGESTWF) TARGET ("<(.*)>"r &double-space-before) IF (T:prevWordCrossSent LINK 0 ("<(.*)>"r)) (NOT 0 (&LINK)) ;

The main problem is that the regex cannot differentiate between the word form of the input and the word forms given as suggestions by the speller; here it seems to pick the second speller suggestion, apparently at random, for inclusion in the grammar error suggestion. There should probably be a way of saying that one wants the first matching word form (which would be the input word form).
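To illustrate the ambiguity, here is a minimal Python sketch. It is not how vislcg3 works internally: the tag list is a simplified, hand-copied subset of the first reading in the trace above, `wordform_candidates` is a hypothetical helper, and it assumes that both the input word form and the speller's suggested word form are visible to the regex as `"<…>"` tags.

```python
import re

# Same shape as the "<(.*)>"r tag in the rule, anchored to one whole tag.
WORDFORM = re.compile(r'^"<(.*)>"$')

def wordform_candidates(tags):
    """Return every capture a word-form regex could bind to (hypothetical helper)."""
    return [m.group(1) for tag in tags if (m := WORDFORM.match(tag))]

# Simplified tags: the input word form first, then tags from one speller reading.
tags = ['"<Riggeroahkká>"', '"reahkká"', 'N', 'Sg', 'Nom', '<spelled>',
        '"<diggereahkká>"', '&double-space-before']

candidates = wordform_candidates(tags)
print(candidates)     # both the input form and the suggested form match
print(candidates[0])  # the "first matching word form" is the input word form
```

With more than one candidate, nothing in the pattern itself says which one should end up in `$1`; taking the first match is the behaviour asked for above.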

Also, for whatever reason, the blue underlining does not cover the whole error. This leads to a circular correction pattern: the two spaces are not replaced, only the following word, so the error persists, each time with the second speller suggestion as the "added" word.
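The loop can be reproduced with plain string replacement. The sketch below is only an illustration: the `apply_fix` helper and the character offsets are hypothetical, the sentence is shortened, and the faulty replacement text is the `"<diggereahkká lea>"` form from the trace above. It compares an underline covering only the following word with one that also covers the first word and the two spaces.

```python
def apply_fix(text, start, end, replacement):
    """Replace text[start:end] with the suggestion (hypothetical helper)."""
    return text[:start] + replacement + text[end:]

text = "Riggeroahkká  lea gitta"  # two spaces after the first word
lea_start = text.index("lea")
lea_end = lea_start + len("lea")

# Underline covers only "lea", and the suggestion carries a speller word form:
# the double space survives and an extra word is added.
narrow = apply_fix(text, lea_start, lea_end, "diggereahkká lea")
print(narrow)

# Underline covers "Riggeroahkká  lea", and the suggestion uses the input
# word form: the double space is replaced and the error is gone.
wide = apply_fix(text, 0, lea_end, "Riggeroahkká lea")
print(wide)
```

In the first case the text still contains the double space after the fix, so the checker would flag it again on the next pass.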

unhammer commented 4 years ago

Is this one fixed now, on the command line at least?

snomos commented 4 years ago

It is fixed on the command line. I will keep it open until it is confirmed working properly in LO as well.

unhammer commented 4 years ago

right-clicking it:

[image]

after selecting the fix: [image]

seems better?

snomos commented 4 years ago

> seems better?

yes 👍

But why is there an empty suggestion below the proper suggestion? And what happens if you select that one?

unhammer commented 4 years ago

It removes the words :O Seems like a bug in libreoffice-divvun.

unhammer commented 4 years ago

Actually, that seems to be because of the VSTR thing I just fixed in grammarchecker-release.cg3; it should be OK again in the newest svn of giella-sme.

(Managed to reproduce on linux too with yesterday's giella-sme vs manually built one.)

unhammer commented 8 months ago

seems like we fixed this one