giellalt / lang-sme

Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Northern Sami language
https://giellalt.uit.no
GNU General Public License v3.0

"Bures boahttin" with a capital B is incorrectly disambiguated #73

Closed: albbas closed this issue 11 months ago

albbas commented 11 months ago

With a capital B in this expression, "Bures" is analysed as "Bure" N Prop Sem/Mal Sg Loc:

echo Bures boahttin Davvi-Norgga gávppašeapmái \
| hfst-tokenise --print-all --giella-cg --unique $GTLANGS/lang-sme/tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst \
| vislcg3 --grammar $GTLANGS/lang-sme/src/cg3/disambiguator.cg3
"<Bures>"
    "Bure" N Prop Sem/Mal Sg Loc <W:0.0> <sme>
: 
"<boahttin>"
    "boahtti" N NomAg Sem/Hum Ess <W:0.0> <sme>
: 
"<Davvi-Norgga gávppašeapmái>"
    "Davvi-Norgga gávppašeapmi" N Sem/Act Sg Ill <W:0.0> <sme>

echo Don leat Bures boahttin min guovlui \
| hfst-tokenise --print-all --giella-cg --unique $GTLANGS/lang-sme/tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst \
| vislcg3 --grammar $GTLANGS/lang-sme/src/cg3/disambiguator.cg3
"<Don>"
    "don" Pron Pers Sg2 Nom <W:0.0> <sme> sentinit
: 
"<leat>"
    "leat" V IV Ind Prs Sg2 <W:0.0> <sme> @+FMAINV
: 
"<Bures>"
    "Bure" N Prop Sem/Mal Sg Loc <W:0.0> <sme>
: 
"<boahttin>"
    "boahtti" N NomAg Sem/Hum Ess <W:0.0> <sme>
: 
"<min>"
    "mun" Pron Pers Pl1 Gen <W:0.0> <sme>
: 
"<guovlui>"
    "guovlu" N Sem/Plc Sg Ill Err/Orth <W:0.0> <sme>

Only "Bures boahttin" on its own is disambiguated correctly:

echo Bures boahttin \
| hfst-tokenise --print-all --giella-cg --unique $GTLANGS/lang-sme/tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst \
| vislcg3 --grammar $GTLANGS/lang-sme/src/cg3/disambiguator.cg3
"<Bures>"
    "bures" Interj <W:0.0> <sme>
: 
"<boahttin>"
    "boahtti" N NomAg Sem/Hum Ess <W:0.0> <sme>

albbas commented 11 months ago

"bures boahttin" is a typo. @leneantonsen improved the disambiguation result in commits 8b203b3f75635855201b42b034d137a731626ea1 and a965fc832dd359997018a618ceec62d1fe4671c9.