Date: 2010-12-02 12:03:08 +0100
From: Thomas Omma <
Åarjelsaemien, version 1.0, 2010-11-30
Generated compounds with PlGen as first part are not accepted when hyphenated: nïejti-moere, maanaj-gaerteni
Unhyphenated they are fine.
Date: 2010-12-02 12:09:50 +0100
From: Thomas Omma <
These two generated compounds with PlGen as first part are now accepted without a hyphen: nïejtimoere, maanajgaerteni
The new problem is that this compound is accepted WITH a hyphen: eeki-nommh
It should not be accepted at all: the parts have no compound tagging that allows it.
It seems that the hyphen makes the compound tagging ineffective.
Åarjelsaemien, version 1.0, 2010-12-01
Date: 2010-12-03 09:56:54 +0100
From: Sjur Nørstebø Moshagen <
Changing priority etc.
Date: 2010-12-03 09:58:02 +0100
From: Sjur Nørstebø Moshagen <
Forgot to change status.
Date: 2011-08-08 12:05:35 +0200
From: Thomas Omma <
eeki-nommh is still accepted.
Åarjelsaemien, version 1.1, 2011-05-26
Date: 2011-09-01 13:51:42 +0200
From: Thomas Omma <
Åarjelsaemien, version 1.1, 20110830-45217
Status is the same as in comment 4.
Date: 2011-09-19 12:12:49 +0200
From: Tomi Pieski <
Why is eeki-nommh incorrect? fst recognizes it:
eeki-nommh eeki-nommh eeke+N+PlGenCmp+Hyph#nomme+N+Pl+Nom
Date: 2011-09-19 12:20:18 +0200
From: Thomas Omma <
Yes, sorry, it is an OK word. I will change it in the regr-file.
Date: 2011-09-19 12:26:18 +0200
From: Thomas Omma <
Double sorry.
eeki-nommh and eekinommh are incorrect, even though the fst recognizes them. This is because of the compounding tags (they have none):
eeke:eek NIEJTE ;
nomme:nomm NIEJTE ;
The speller accepts eeki-nommh but not eekinommh.
It must be due to the hyphen.
Date: 2012-06-18 14:56:19 +0200
From: Maja Lisa Kappfjell <
Interesting! Test!
Date: 2012-06-18 14:58:43 +0200
From: Maja Lisa Kappfjell <
Interesting! Test!
Date: 2016-11-28 10:34:28 +0100
From: Sjur Nørstebø Moshagen <
Revisiting old bugs:
This bug is not related to PLX per se, but to the interpretation of the compounding tags. The bug is still with us if it is correct that the hyphenated form should NOT be accepted:
$ echo eeki-nommh | hfst-lookup -q src/analyser-gt-norm.hfstol
eeki-nommh eeke+N+Cmp-#nomme+Num+Pl+Nom 10,000000
$ echo eekinommh | hfst-lookup -q src/analyser-gt-norm.hfstol
eekinommh eekinommh+? inf
$ echo eeki-nommh | hfst-ospell -S tools/spellcheckers/fstbased/desktop/hfst/sma.zhfst
"eeki-nommh" is in the lexicon...
$ echo eekinommh | hfst-ospell -S tools/spellcheckers/fstbased/desktop/hfst/sma.zhfst
"eekinommh" is in the lexicon...
$ echo eeki-nommh | hfst-lookup -q tools/spellcheckers/fstbased/desktop/analyser-desktopspeller-gt-norm.hfst
eeki-nommh eeke+N+Cmp-#nomme+Num+Pl+Nom 24,495081
eeki-nommh eeke+N+Cmp/Hyph+Cmp#nomme+N+Pl+Nom 10024,495117
$ echo eekinommh | hfst-lookup -q tools/spellcheckers/fstbased/desktop/analyser-desktopspeller-gt-norm.hfst
eekinommh eeke+N+Cmp#nomme+N+Pl+Nom 24,495081
It is also quite disturbing that the normative analyser and the speller behave differently.
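A mismatch like this can be checked mechanically by comparing, word by word, whether each transducer returns a real analysis. The following is a minimal Python sketch, not part of the existing tooling. It assumes that each hfst-lookup output line has three whitespace-separated fields (input, analysis, weight), that an analysis ending in `+?` means "no analysis", and that weights may use a decimal comma as in the output pasted here. The function names are hypothetical, and the sample data is copied from this comment.

```python
import math

def parse_lookup_line(line):
    """Split one hfst-lookup output line into (surface, analysis, weight)."""
    surface, analysis, weight = line.split()
    # The weights pasted in this thread use a decimal comma (locale-dependent).
    value = math.inf if weight == "inf" else float(weight.replace(",", "."))
    return surface, analysis, value

def is_accepted(lines):
    """Accepted means at least one analysis is not the '+?' fallback."""
    return any(not parse_lookup_line(line)[1].endswith("+?") for line in lines)

def disagreements(a, b):
    """Words that one analyser accepts and the other rejects."""
    return sorted(w for w in a if is_accepted(a[w]) != is_accepted(b[w]))

# Output of src/analyser-gt-norm.hfstol, as pasted above.
norm = {
    "eeki-nommh": ["eeki-nommh eeke+N+Cmp-#nomme+Num+Pl+Nom 10,000000"],
    "eekinommh": ["eekinommh eekinommh+? inf"],
}
# Output of analyser-desktopspeller-gt-norm.hfst, as pasted above.
speller = {
    "eeki-nommh": [
        "eeki-nommh eeke+N+Cmp-#nomme+Num+Pl+Nom 24,495081",
        "eeki-nommh eeke+N+Cmp/Hyph+Cmp#nomme+N+Pl+Nom 10024,495117",
    ],
    "eekinommh": ["eekinommh eeke+N+Cmp#nomme+N+Pl+Nom 24,495081"],
}

print(disagreements(norm, speller))  # ['eekinommh']
```

In a real regression check the two dictionaries would be filled from the actual tool output rather than hard-coded.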
The lexc entries for the words are:
eeke+Sem/Dummytag:eek NIEJTE ;
nomme+Sem/Dummytag:nomm NIEJTE ;
meaning that eeke should only allow compounding in SgNom, and that nomme is not overriding it with a LeftCmp tag.
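To make the "no compounding tags" observation easy to re-check for other entries, here is a small Python sketch that splits a simple lexc entry into lemma, tags, stem and continuation lexicon. It is only an illustration: it handles just the one-line `upper:lower CONT ;` shape shown above, and it treats any tag whose name contains `Cmp` as a compounding tag, which is an assumption made for this example and not a statement about the full tag inventory. The function names are hypothetical.

```python
import re

# Matches only the simple "upper:lower CONTINUATION ;" entry shape shown above.
ENTRY = re.compile(r"^(?P<upper>[^:\s]+):(?P<lower>\S+)\s+(?P<cont>\S+)\s*;$")

def parse_entry(entry):
    """Split a simple lexc entry into lemma, tags, stem and continuation lexicon."""
    m = ENTRY.match(entry.strip())
    if m is None:
        raise ValueError(f"not a simple lexc entry: {entry!r}")
    lemma, *tags = m.group("upper").split("+")
    return {
        "lemma": lemma,
        "tags": tags,
        "stem": m.group("lower"),
        "continuation": m.group("cont"),
    }

def compounding_tags(entry):
    """Tags that look like compounding tags (assumption: name contains 'Cmp')."""
    return [t for t in parse_entry(entry)["tags"] if "Cmp" in t]

for e in ["eeke+Sem/Dummytag:eek NIEJTE ;", "nomme+Sem/Dummytag:nomm NIEJTE ;"]:
    print(parse_entry(e)["lemma"], compounding_tags(e))
```

Both entries yield an empty list, which matches the reading that neither word carries a tag overriding the default compounding behaviour.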
I will take over this bug for now.
Date: 2017-03-02 09:40:24 +0100
From: Sjur Nørstebø Moshagen <
Slight progress: the speller and the normative analyser now behave the same:
$ echo eeki-nommh | hfst-lookup -q src/analyser-gt-norm.hfstol
eeki-nommh eeke+N+Cmp-#nomme+Num+Pl+Nom 10,000000
$ echo eekinommh | hfst-lookup -q src/analyser-gt-norm.hfstol
eekinommh eekinommh+? inf
$ echo eekinommh | hfst-lookup -q src/analyser-gt-desc.hfstol
eekinommh eeke+N+Cmp/PlGen+Cmp#nomme+N+Pl+Nom 10,000000
$ echo eeki-nommh | hfst-ospell -S tools/spellcheckers/fstbased/desktop/hfst/sma.zhfst
"eeki-nommh" is in the lexicon...
$ echo eekinommh | hfst-ospell -S tools/spellcheckers/fstbased/desktop/hfst/sma.zhfst
"eekinommh" is NOT in the lexicon:
$ echo eeki-nommh | hfst-lookup -q tools/spellcheckers/fstbased/desktop/analyser-desktopspeller-gt-norm.hfst
eeki-nommh eeke+N+Cmp-#nomme+Num+Pl+Nom 24,495081
$ echo eekinommh | hfst-lookup -q tools/spellcheckers/fstbased/desktop/analyser-desktopspeller-gt-norm.hfst
eekinommh eekinommh+? inf
Date: 2017-03-03 12:05:32 +0100
From: Sjur Nørstebø Moshagen <
This specific compound is accepted with a hyphen because the last part is a numeral:
(In reply to Sjur Nørstebø Moshagen from comment #12)
Slight progress: the speller and the normative analyser now behave the same:
$ echo eeki-nommh | hfst-lookup -q src/analyser-gt-norm.hfstol
eeki-nommh eeke+N+Cmp-#nomme+Num+Pl+Nom 10,000000
Compounding with numerals requires a hyphen (even when the numeral is spelled out with letters, which is something that could be revisited at some point). The numeral 'nomme' can be inflected, and the end result is the analysis above (and thus the observed speller behaviour).
That is, compounding and compounding restriction using tags is working as it should, and this is not a real bug.
If we want to restrict compounds with -nomme then that is another discussion and outside this bug report.
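The numeral reading described above can be spotted directly in the analysis string. The following Python sketch is an illustration only. It assumes that `#` separates compound parts and `+` separates the lemma from its tags, as in the analyses quoted in this report, and that neither character occurs inside a lemma. The function names are hypothetical.

```python
def compound_parts(analysis):
    """Split an analysis string at '#' into (lemma, tags) pairs, one per compound part."""
    parts = []
    for chunk in analysis.split("#"):
        lemma, *tags = chunk.split("+")
        parts.append((lemma, tags))
    return parts

def last_part_is_numeral(analysis):
    """True if the final compound part carries the Num tag."""
    _lemma, tags = compound_parts(analysis)[-1]
    return "Num" in tags

# Analyses copied from the comments above.
hyphenated = "eeke+N+Cmp-#nomme+Num+Pl+Nom"          # normative analyser, eeki-nommh
descriptive = "eeke+N+Cmp/PlGen+Cmp#nomme+N+Pl+Nom"  # descriptive analyser, eekinommh

print(compound_parts(hyphenated))
# [('eeke', ['N', 'Cmp-']), ('nomme', ['Num', 'Pl', 'Nom'])]
print(last_part_is_numeral(hyphenated), last_part_is_numeral(descriptive))
# True False
```

The hyphenated form is analysed with 'nomme' as a numeral, while the unhyphenated form only gets the noun reading in the descriptive analyser.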
Closed as fixed (the originally reported words were fixed back in 2010).
This issue was created automatically with bugzilla2github
Bugzilla Bug 915
Date: 2010-12-02T12:03:08+01:00 From: Thomas Omma <>
To: Sjur Nørstebø Moshagen <>
CC: maja.l.kappfjell, sjur.n.moshagen, thomas.omma, trond.trosterud
Last updated: 2017-03-03T12:05:32+01:00