giellalt / bugzilla-dummy


Arabic and Roman numerals missing (Bugzilla Bug 435) #544

Open albbas opened 17 years ago

albbas commented 17 years ago

This issue was created automatically with bugzilla2github

Bugzilla Bug 435

Date: 2007-06-15T12:23:11+02:00 From: Thomas Omma <> To: Thomas Omma <> CC: lene.antonsen, sjur.n.moshagen, trond.trosterud

Last updated: 2018-05-29T10:51:23+02:00

albbas commented 17 years ago

Comment 1534

Date: 2007-06-15 12:23:11 +0200 From: Thomas Omma <>

Are they of any use? Shall we include them from the source files?

albbas commented 17 years ago

Comment 1634

Date: 2007-07-03 08:38:19 +0200 From: Sjur Nørstebø Moshagen <>

I changed the title to better describe the issue.

Pure Arabic number strings can be excluded, since Word won't spell-check them anyway.

Roman numerals in all forms, and case-inflected Arabic numerals, should be included, because they are (number-)letter strings, which will be spell-checked.

Børre has now included (inflected) Arabic numerals as part of building the speller, but as shown in the notes for the latest speller, it overgenerates (it allows non-normative case inflection).

This also shows the basic flaw in Børre's approach: the present number inflection does not use the existing transducers, but duplicates the number grammar in a separate piece of code. This in turn leads to maintenance and consistency problems, as has already happened.

The present solution will have to do for now, but it should be replaced with a solution that uses the numeral transducers we already have. That ensures we have only one number grammar to maintain, and that whatever the linguists do is automatically reflected in the speller output. Not to mention that case inflections for numbers vary across languages (SMJ is not the same as SME, which is not the same as SMA).

Roman numerals are not yet included, but should be, at least in upper case. Lower case is more problematic, since lower-case numerals will conflict with potential spelling errors of regular words: iii as a spelling error of the verb form 'ii'.
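To illustrate the upper-case-only restriction, here is a minimal Python sketch. It is not how the speller or the transducers do it; the name `is_upper_roman` and the 1 to 3999 range are assumptions made for the example. It uses a common regular expression for well-formed Roman numerals, and because the pattern is case-sensitive, a string such as `iii` is rejected and can still be flagged as a misspelling of 'ii'.

```python
import re

# Common pattern for well-formed Roman numerals in the range 1..3999.
# Upper case only: the pattern is deliberately case-sensitive.
_ROMAN_RE = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)


def is_upper_roman(token: str) -> bool:
    """Return True if token is a well-formed upper-case Roman numeral.

    Hypothetical helper for illustration; it does not handle case endings.
    """
    # The pattern also matches the empty string, so reject that explicitly.
    return bool(token) and _ROMAN_RE.fullmatch(token) is not None
```

Case-inflected forms would need the language-specific endings on top of this, which the sketch does not attempt.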

albbas commented 17 years ago

Comment 1954

Date: 2007-09-26 13:00:53 +0200 From: Sjur Nørstebø Moshagen <>

The Arabic numerals were fixed long ago, but the Roman numerals are still open. Here's an example of how to convert between Arabic and Roman numerals, which could be used as a way to add Roman numerals to our spellers in addition to the present Arabic ones. Remember that we need case endings as well!

http://www.xrce.xerox.com/competencies/content-analysis/fsCompiler/fsexamples.html#Roman/Arabic
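For readers without access to the linked finite-state example, the conversion itself can be sketched in plain Python. This is only an illustration of the mapping, not the transducer-based approach the link describes; the function names `arabic_to_roman` and `roman_to_arabic` and the 1 to 3999 range are assumptions, and case endings are not handled.

```python
# Value/symbol pairs in descending order, including the subtractive forms.
_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def arabic_to_roman(n: int) -> str:
    """Convert an integer in 1..3999 to an upper-case Roman numeral."""
    if not 1 <= n <= 3999:
        raise ValueError(f"out of range: {n}")
    parts = []
    for value, symbol in _PAIRS:
        # Emit the symbol as many times as its value fits into the remainder.
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def roman_to_arabic(s: str) -> int:
    """Convert a well-formed upper-case Roman numeral to an integer."""
    total = 0
    i = 0
    for value, symbol in _PAIRS:
        # Greedily consume each symbol at the current position.
        while s.startswith(symbol, i):
            total += value
            i += len(symbol)
    # Reject empty input, leftover characters, and non-canonical forms
    # such as "IIII" by checking that the result converts back to s.
    if not s or i != len(s) or arabic_to_roman(total) != s:
        raise ValueError(f"not a well-formed Roman numeral: {s!r}")
    return total
```

Generating the numerals for a range of integers with `arabic_to_roman` would give a candidate list of uninflected upper-case forms; the inflected forms would still have to come from the existing number grammar.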

albbas commented 16 years ago

Comment 2539

Date: 2008-02-01 14:20:58 +0100 From: Sjur Nørstebø Moshagen <>

We don't have any regression test pairs for this one. Could Thomas add a few inflected and uninflected upper-case Roman numerals to the regression test file?

albbas commented 16 years ago

Comment 2566

Date: 2008-02-04 09:26:42 +0100 From: Thomas Omma <>

yes

albbas commented 7 years ago

Comment 11995

Date: 2017-02-10 15:22:54 +0100 From: Sjur Nørstebø Moshagen <>

This is mostly fixed in the spell checker via abbr, but it should be done fully correctly, so that we analyse Roman numerals whenever we see them (e.g. for corpus analysis and grammar checking). But we need a thorough discussion of the possible problematic side effects of adding them.

Changing the assignee to Duommá.