giellalt / lang-sme

Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Northern Sami language
https://giellalt.uit.no
GNU General Public License v3.0

The prefix ii- does not work as it should #268

Open albbas opened 8 years ago

albbas commented 8 years ago

This issue was created automatically with bugzilla2github

Bugzilla Bug 2231

Date: 2016-10-17T12:59:53+02:00
From: Lene Antonsen
To: Thomas Omma
CC: ciprian.gerstenberger, lene.antonsen, linda.wiechetek, sandra.rahka, sjur.n.moshagen, trond.trosterud

Last updated: 2016-12-19T08:43:25+01:00

albbas commented 8 years ago

Comment 11566

Date: 2016-10-17 12:59:53 +0200 From: Lene Antonsen

The prefix ii- does not work as it should

Here we get several kinds of Err tags, but the main problem is forms like +Err/Orthstáhta. The cause is in compunding.lexc:

ii-+Err/Orth+Use/Circ:ii- Noun ;
ii-+Err/Orth+Use/Circ:ii- Adjective ; ! ii-biologalaš

But there are several paths.

usme
ii-stáhtalaš
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+Err/Orth+N+Der/lasj+A+Attr
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+Err/Orth+N+Der/lasj+A+Sg+Nom
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+v1+N+Der/lasj+A+Attr
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+v1+N+Der/lasj+A+Sg+Nom
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+Err/Orth+N+Der/lasj+A+Attr
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+Err/Orth+N+Der/lasj+A+Sg+Nom
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+v1+N+Der/lasj+A+Attr
ii-stáhtalaš    ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+v1+N+Der/lasj+A+Sg+Nom
ii-stáhtalaš    ii-+Err/Orthstáhta+Err/Orth+N+Der/lasj+A+Attr
ii-stáhtalaš    ii-+Err/Orthstáhta+Err/Orth+N+Der/lasj+A+Sg+Nom
ii-stáhtalaš    ii-+Err/Orthstáhta+v1+N+Der/lasj+A+Attr
ii-stáhtalaš    ii-+Err/Orthstáhta+v1+N+Der/lasj+A+Sg+Nom
ii-stáhtalaš    ii-+Err/Orthstáhtalaš+A+Attr
ii-stáhtalaš    ii-+Err/Orthstáhtalaš+A+Sg+Nom
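Since forms like +Err/Orthstáhta also end up in corpus analyses, it may help to be able to find them mechanically. The following is a minimal Python sketch, not part of the GiellaLT tooling: the function name `find_fused_tags` and the short list `KNOWN_ERR_TAGS` are assumptions made for illustration, and the check assumes that a well-formed tag is always followed by `+`, `#` or the end of the analysis string.

```python
import re

# Assumption: a small, hand-picked list of error tags seen in this issue.
# A real check would need the full tag inventory of the analyser.
KNOWN_ERR_TAGS = ("Err/Orth", "Err/HyphSub", "Err/Lex")

# A known tag followed directly by material other than "+" or "#"
# means that the tag has fused with the following lemma.
_FUSED = re.compile(
    r"\+(" + "|".join(re.escape(t) for t in KNOWN_ERR_TAGS) + r")([^+#]+)"
)


def find_fused_tags(analysis):
    """Return (tag, fused_material) pairs found in one analysis string."""
    return [(m.group(1), m.group(2)) for m in _FUSED.finditer(analysis)]


if __name__ == "__main__":
    for analysis in (
        "ii-+Err/Orthstáhta+v1+N+Der/lasj+A+Attr",
        "ii-+N+Err/HyphSub+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+Err/Orth+N+Der/lasj+A+Attr",
    ):
        print(analysis, find_fused_tags(analysis))
```

The sketch only flags the fusion; it says nothing about which of the competing paths should be kept.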

albbas commented 8 years ago

Comment 11567

Date: 2016-10-17 13:00:36 +0200 From: Lene Antonsen

Adding Ciprian as CC because this has consequences for the corpus.

albbas commented 8 years ago

Comment 11569

Date: 2016-10-17 14:16:00 +0200 From: Sjur Nørstebø Moshagen

This is what I get with the new tokenisation:

$ echo "ii-stáhtalaš" |hfst-tokenise --giella-cg tools/preprocess/tokeniser-disamb-gt-desc.pmhfst
"<ii-stáhtalaš>"
    "ii-" Err/Orth "stáhta" Err/Orth NN Sem/Org Der/lasj A Attr
    "ii-" Err/Orth "stáhta" Err/Orth NN Sem/Org Der/lasj A Sg Nom
    "ii-" Err/Orth "stáhta" NN Sem/Org Der/lasj A Attr
    "ii-" Err/Orth "stáhta" NN Sem/Org Der/lasj A Sg Nom
    "ii-" Err/Orth "stáhtalaš" A Sem/Dummytag Attr
    "ii-" Err/Orth "stáhtalaš" A Sem/Dummytag Sg Nom
    "stáhta" Err/Orth NN Sem/Org Der/lasj A Attr
        "ii-" N Err/HyphSub Sem/Dummytag Cmp/SgNom Cmp/Hyph Cmp
    "stáhta" Err/Orth NN Sem/Org Der/lasj A Sg Nom
        "ii-" N Err/HyphSub Sem/Dummytag Cmp/SgNom Cmp/Hyph Cmp
    "stáhta" NN Sem/Org Der/lasj A Attr
        "ii-" N Err/HyphSub Sem/Dummytag Cmp/SgNom Cmp/Hyph Cmp
    "stáhta" NN Sem/Org Der/lasj A Sg Nom
        "ii-" N Err/HyphSub Sem/Dummytag Cmp/SgNom Cmp/Hyph Cmp
:\n

And that does not look good. But I don't quite see how we can avoid getting Err/Orth in the middle of the lemma string, because it is precisely the use of ii- that is problematic.

albbas commented 8 years ago

Comment 11570

Date: 2016-10-17 18:43:05 +0200 From: Trond Trosterud

Couldn't we use the tag +Err/Orth+ for prefixes? That is, with a + at the end, cf.

echo "ii-stáhtalaš ii-+Err/Orth+stáhtalaš+A+Sg+Nom"|lookup2cg
"<ii-stáhtalaš>"
    "ii-" Err/Orth stáhtalaš A Sg Nom
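To see why the trailing + makes a difference, here is a toy Python approximation of the conversion from an analysis string to a CG reading. It is not the real lookup2cg script; `analysis_to_cg` is a hypothetical helper that simply assumes the lemma is everything before the first `+` and that every later `+` starts a new tag.

```python
def analysis_to_cg(wordform, analysis):
    """Toy conversion of one analyser line to a CG cohort with one reading.

    Assumption: lemma = text before the first "+", tags = the remaining
    "+"-separated fields. The real lookup2cg does considerably more.
    """
    lemma, *tags = analysis.split("+")
    reading = " ".join(['"%s"' % lemma] + tags)
    return '"<%s>"\n\t%s' % (wordform, reading)


if __name__ == "__main__":
    # Current path from the lexicon: the tag and the following stem fuse.
    print(analysis_to_cg("ii-stáhtalaš", "ii-+Err/Orthstáhtalaš+A+Sg+Nom"))
    # Proposed tag with a trailing "+": Err/Orth comes out as a clean tag.
    print(analysis_to_cg("ii-stáhtalaš", "ii-+Err/Orth+stáhtalaš+A+Sg+Nom"))
```

Under these assumptions the second call yields the same reading as the lookup2cg output quoted above, while the first yields the fused token Err/Orthstáhtalaš. Note that in both cases the stem stáhtalaš still ends up outside the quoted lemma.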

albbas commented 7 years ago

Comment 11851

Date: 2016-12-14 23:07:43 +0100 From: Lene Antonsen

I commented out the path from the compounds file, and now we have this path from nouns:

ii-stáhtalaš    ii-+N+Err/Lex+Cmp/SgNom+Cmp/Hyph+Cmp#stáhta+v1+N+Der/lasj+A+Attr

I suggest changing +N to +V.

sme$ echo ii-stáhtalaš | usmedis | lookup2cg
"<ii-stáhtalaš>"
    "ii-#stáhta" NN Sem/Org Der/lasj A Attr
    "ii-#stáhta" NN Sem/Org Der/lasj A Sg Nom

albbas commented 7 years ago

Comment 11874

Date: 2016-12-17 21:57:06 +0100 From: Trond Trosterud

There is, for that matter, a good argument for that (ii = +V).

albbas commented 7 years ago

Comment 11889

Date: 2016-12-19 08:43:25 +0100 From: Thomas Omma

Is it good to do that? That is, to change +N to +V?