Closed: albbas closed this issue 11 years ago
Date: 2012-11-02 10:29:18 +0100
From: Jack Rueter
Created attachment 146: screen shot of (izh) and (myv) not recognizing letters with uizh and umyv
A change since last night has disabled umyv and uizh.
In izh uppercase letters are no longer recognized as variants of lowercase letters.
In myv letters are not recognized.
Attached file: letters-missing_2012-11-02.tiff (image/tiff, 104672 bytes)
Description: screen shot of (izh) and (myv) not recognizing letters with uizh and umyv
Date: 2012-11-02 10:39:06 +0100
From: Trond Trosterud
Cf. also bug #1456, inituppercase, which seems to be related to one of the two problems here.
I have given that bug a lot of attention, and simply was not able to see the difference between izh (initupper working) and fin (initupper not working). Now, at least, the two behave alike (neither works).
Date: 2012-11-02 16:26:16 +0100
From: Sjur Nørstebø Moshagen
I will try to solve this next week. I really don't understand what is going on. Adding Tommi to the Cc list.
Date: 2012-11-03 10:33:24 +0100
From: Jack Rueter
In izh the initial letter cannot be up-cased.
++
$ uizh
0%>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>100%
abstraktsia abstraktsia abstraktsia+N+Sg+Nom
Abstraktsia Abstraktsia Abstraktsia +?
$ hfst-lookup src/analyser-gt-norm.hfst abstraktsia Abstraktsia
++
If, however, I generate random output with the following command:
:izh jackrueter$ hfst-fst2fst -f openfst-tropical src/analyser-gt-norm.hfst | hfst-compose -F -2 filterNominals.hfst | hfst-fst2strings -r50000 -c1 > nominal_strings04
where filterNominals.hfst is generated from: ? [ %+Nom | %+Part | %+Gen | %+Ine | %+Ill | %+Ela | %+All | %+Ade | %+Abl ] ?
the result includes both upper-case initial and lower-case initial words, whereas the uppercased words are not accepted in uizh.
abstraktsia:abstraktsia+N+Sg+Nom
abstraktsiakaa:abstraktsia+N+Sg+Nom+Clt/kAA
abstraktsiakii:abstraktsia+N+Sg+Nom+Foc/kii
abstraktsiat:abstraktsia+N+Pl+Nom
abstraktsiatkii:abstraktsia+N+Pl+Nom+Foc/kii
abstraktsiankaa:abstraktsia+N+Sg+Gen+Clt/kAA
abstraktsiaskii:abstraktsia+N+Sg+Ine+Foc/kii
...
++
The upper-case is generated according to the same file that is blocked in huizh.
I am also able to generate random forms in Cyrillic in myv. There seem to be problems with @FLAG@ use as well.
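To compare the randomly generated strings with what the analyser accepts, the `surface:analysis` pairs can be sorted by the case of their first letter. The following is only a minimal sketch: it assumes one pair per line, split at the first colon, as in the excerpt quoted in this comment. The function names are hypothetical, and the upper-cased sample pair is invented for illustration, not taken from the actual output.

```python
def split_pair(line):
    """Split one 'surface:analysis' line at the first colon.

    Assumption: the surface form itself contains no colon.
    """
    surface, _, analysis = line.strip().partition(":")
    return surface, analysis


def uppercase_initial_pairs(lines):
    """Return the (surface, analysis) pairs whose surface form starts with an upper-case letter."""
    pairs = [split_pair(line) for line in lines if ":" in line]
    return [(surface, analysis) for surface, analysis in pairs if surface[:1].isupper()]


sample = [
    "abstraktsia:abstraktsia+N+Sg+Nom",
    "Abstraktsia:abstraktsia+N+Sg+Nom",  # hypothetical pair, for illustration only
    "abstraktsiat:abstraktsia+N+Pl+Nom",
]

for surface, analysis in uppercase_initial_pairs(sample):
    print(surface, analysis)
```

The surface forms selected this way could then be fed back to the lookup to see which of them come out as `+?`.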
Date: 2012-11-03 17:21:14 +0100
From: Trond Trosterud
Development here: For izh, the problem is the same initial 0 under LEXICON Root as for fin.
I added (0) to inituppercase.regex, and now get:
Yksi Yksi yks+Num+Card+Sg+Nom
yksi yksi yks+Num+Card+Sg+Nom
But the error reported in the attachment is still with us:
yksiköös yksiköös yksikkö+N+Sg+Ine
Yksiköös Yksiköös Yksiköös +?
But as can be seen from the above, this error is not linked to the initupper issue alone.
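When checking many forms at once, the unrecognized inputs can be picked out of the lookup output mechanically. The sketch below is an illustration only: it assumes whitespace-separated fields with the input form first and the analysis last, as in the lines quoted in this comment, and treats a final `+?` field as "no analysis". The function name is hypothetical.

```python
def unrecognized_inputs(lines):
    """Return the input forms whose last field is '+?' (no analysis found)."""
    missing = []
    for line in lines:
        fields = line.split()
        # Assumption: first field is the input form, last field is the analysis.
        if len(fields) >= 2 and fields[-1] == "+?":
            missing.append(fields[0])
    return missing


# Lines copied from the lookup output quoted in this comment.
transcript = [
    "Yksi Yksi yks+Num+Card+Sg+Nom",
    "yksi yksi yks+Num+Card+Sg+Nom",
    "yksiköös yksiköös yksikkö+N+Sg+Ine",
    "Yksiköös Yksiköös Yksiköös +?",
]

print(unrecognized_inputs(transcript))
```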
Date: 2012-11-06 09:36:58 +0100
From: Sjur Nørstebø Moshagen
Added dependency on bug #1502, as it is quite hard to test possible solutions to this bug without a working build infra.
Date: 2012-11-06 15:45:49 +0100
From: Jack Rueter
Word-initial upper-casing is working in both izh and myv.
This issue was created automatically with bugzilla2github
Bugzilla Bug 1497
Date: 2012-11-02T10:29:18+01:00
From: Jack Rueter
To: Sjur Nørstebø Moshagen
CC: lene.antonsen, thomas.omma, tommi.pirinen, trond.trosterud
Depends on: #1502
Last updated: 2012-11-06T15:45:49+01:00