giellalt / bugzilla-dummy


hfst does not read the whole sma sourcecode (Bugzilla Bug 1113) #1339

Closed albbas closed 12 years ago

albbas commented 13 years ago

This issue was created automatically with bugzilla2github

Bugzilla Bug 1113

Date: 2011-08-24T08:58:38+02:00 From: Trond Trosterud <> To: Sjur Nørstebø Moshagen <> CC: ftyers

Last updated: 2011-09-29T08:46:58+02:00

albbas commented 13 years ago

Comment 4899

Date: 2011-08-24 08:58:38 +0200 From: Trond Trosterud <>

Here:

===> South Sámi <===

Building sma-lexc.hfst

hfst-lexc -v -f openfst-tropical -o sma/bin/sma-lexc.hfst sma/src/sma-lex.txt sma/src/verb-sma-lex.txt sma/src/pp-sma-lex.txt sma/src/pronoun-sma-lex.txt sma/src/interjection-sma-lex.txt sma/src/conjunction-sma-lex.txt sma/src/subjunction-sma-lex.txt sma/src/particle-sma-lex.txt sma/src/noun-sma-lex.txt sma/src/numeral-sma-lex.txt sma/src/adj-sma-lex.txt sma/src/adv-sma-lex.txt sma/src/punct-sma-lex.txt sma/src/abbr-sma-lex.txt sma/src/acro-sma-lex.txt sma/src/propernoun-sma-lex.txt sma/src/propernoun-sma-morph.txt
running /usr/local/bin/foma -f sma/bin/sma-lexc.hfst.tmp.lexcscript
Root...15, R...10, RHyph...7, N_ODD...7, N_ODD_LOAN...9, N_ODD_NODISIMP...30, N_ODD_SG...16, N_ODD_PL...10, N_ODD_ESS...1, ÅABPETJH...15, N_ODD_C...21, AAJE (...) TE-plc...2, LAANTE-sur...2, NAME_KONTO...1, ELV-plc...3, SAAMI_CNAME_ODD...4, SAAMI_CNAME_ODD_SG...8, CNAME_ODD-mal...1, CNAME_ODD-fem...1, CNAME_ODD-plc...1, CNAME_ODD-sur...1, CNAME_ODD-obj...1, CNAME_ODD-ani...1, CNAME_ODD-org...1, CNAME_ODD...4, CNAME_ODD_SG...8, CNAME_ODD_PL...8, CNAME_ODD-LOAN...11, CNAME_EVEN-sur...1, CNAME_EVEN-mal...1, CNAME_EVEN-fem...1, CNAME_EVEN-org...1, CNAME_EVEN-plc...1, CNAME_EVEN-obj...1, CNAME_EVEN-ani...1, CNAME_EVEN...1, CNAME_EVENLEX...4, CNAME_EVEN_SG...8, CNAME_EVEN-LOAN...11, SAAMI_CNAME_EVEN...4, SAAMI_CNAME_EVEN_SG...10, DÅERIES...17, TJIRREDS...11
Building lexicon...
Warning: lexicon 'TJIRREDS' defined but not used
Warning: lexicon 'DÅERIES' defined but not used
Warning: lexicon 'SNAME_MAANA_PLUR' defined but not used
Warning: lexicon 'JALKEDSEN' defined but not used
Warning: lexicon 'indcoll' defined but not used
Warning: lexicon 'muvhtiecase' defined but not used
*Warning: lexicon 'indeven-a' defined but not used

But TJIRREDS is used:

src$grep TJIRREDS *
propernoun-sma-lex.txt:Tjirreds:Tjirrie TJIRREDS "13 TERRTX" ;
propernoun-sma-morph.txt:LEXICON TJIRREDS
src$

And xerox compiles the same code just fine:

lexc> compose-result No epenthesis. Initial and final word boundaries added. ....................................................................................................................................................................................................Done. 59.2 Mb. 794974 states, 1404824 arcs, Circular. Minimizing...Done.

The error message is exactly the same as the one given for sme.lexc in apertium (although there, according to Francis, the cause might be that the lexc file is too big for the python script).
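The "defined but not used" warnings can be double-checked independently of the compiler by cross-referencing LEXICON definitions against continuation classes across all source files, much like the grep above but for every lexicon at once. What follows is only a minimal sketch under simplifying assumptions: `!` always starts a comment (so `%!` escapes and `!` inside glosses are not handled), an entry ends with `;` on the same line, and the continuation class is the last token before an optional quoted gloss. The function names `scan_lexc` and `unused_lexicons` are hypothetical and not part of hfst, foma or the gt build.

```python
import re


def scan_lexc(text):
    """Return (defined, referenced) lexicon names found in one lexc source text."""
    defined, referenced = set(), set()
    for raw in text.splitlines():
        # Assumption: '!' starts a comment; '%!' escapes are not handled here.
        line = raw.split("!", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if parts[0] == "LEXICON":
            if len(parts) >= 2:
                defined.add(parts[1])
            continue
        if line.endswith(";"):
            # Drop the optional quoted gloss, then take the last token
            # before the semicolon as the continuation class.
            body = re.sub(r'"[^"]*"', "", line[:-1]).split()
            if body:
                referenced.add(body[-1])
    return defined, referenced


def unused_lexicons(texts):
    """Names defined in any of the texts but never used as a continuation class."""
    defined, referenced = set(), set()
    for text in texts:
        d, r = scan_lexc(text)
        defined |= d
        referenced |= r
    # Root is the entry point, so it is never a continuation class.
    return sorted(defined - referenced - {"Root"})
```

If this reports TJIRREDS as used when given all the files from the hfst-lexc command line, but as unused when propernoun-sma-lex.txt is left out, that would be consistent with the compiler not having read the whole source.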

albbas commented 13 years ago

Comment 4900

Date: 2011-08-24 09:06:59 +0200 From: Trond Trosterud <>

I updated foma (cf. the Sjur - Måns email discussion), and now get a new error message, but the reading halted at the same lexicon:

NAME_EVENLEX...4, CNAME_EVEN_SG...8, CNAME_EVEN-LOAN...11, SAAMI_CNAME_EVEN...4, SAAMI_CNAME_EVEN_SG...10, DÅERIES...17, TJIRREDS...11
Building lexicon...
Determinizing...
Minimizing...
Done! 2.5 MB. 80231 states, 160041 arcs, Cyclic.
Writing to file sma/bin/sma-lexc.hfst.
hfst-name: is not a valid transducer file

Building sma-gen.hfst

hfst-compose-intersect -v sma/bin/sma-lexc.hfst sma/bin/sma-twol.hfst | \
hfst-determinize -v | \
hfst-remove-epsilons -v | \
hfst-minimize -v -o sma/bin/sma-gen.hfst
Reading from sma/bin/sma-lexc.hfst and sma/bin/sma-twol.hfst, writing to
hfst-compose-intersect: sma/bin/sma-lexc.hfst is not a valid transducer file
Reading from , writing to
hfst-determinize: is not a valid transducer file
Reading from , writing to sma/bin/sma-gen.hfst
Reading from , writing to
hfst-remove-epsilons: is not a valid transducer file
hfst-minimize: is not a valid transducer file
make: *** [sma/bin/sma-gen.hfst] Error 1
~/main/gt$
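In the pipeline output above, only the first message (from hfst-compose-intersect) seems to point at the real problem; the later stages presumably fail because nothing valid reaches them through the pipe. A minimal sketch of running the same stages one at a time and stopping at the first failure, assuming each tool exits non-zero on error. `run_pipeline` and `HFST_STAGES` are hypothetical names, not part of the gt Makefile; the commands themselves are copied from the log.

```python
import subprocess

# Stages copied from the make output; the first one reads its two input
# files itself, the rest read from stdin.
HFST_STAGES = [
    ["hfst-compose-intersect", "-v",
     "sma/bin/sma-lexc.hfst", "sma/bin/sma-twol.hfst"],
    ["hfst-determinize", "-v"],
    ["hfst-remove-epsilons", "-v"],
    ["hfst-minimize", "-v", "-o", "sma/bin/sma-gen.hfst"],
]


def run_pipeline(stages, data=b""):
    """Run stages in order, feeding each stage's stdout to the next one.

    Returns (failed_index, output, stderr_text); failed_index is None
    when every stage exited with status 0.
    """
    for index, argv in enumerate(stages):
        proc = subprocess.run(argv, input=data, capture_output=True)
        if proc.returncode != 0:
            # Stop here so that follow-on errors from later stages
            # do not hide the stage that actually failed.
            return index, b"", proc.stderr.decode(errors="replace")
        data = proc.stdout
    return None, data, ""
```

Calling `run_pipeline(HFST_STAGES)` from ~/main/gt would then report a single failing stage instead of four similar-looking messages.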

albbas commented 13 years ago

Comment 4903

Date: 2011-08-24 10:00:05 +0200 From: Trond Trosterud <>

I updated foma (cf. the Sjur - Måns email discussion), and now get a new error message, but the reading halted at the same lexicon:

NAME_EVENLEX...4, CNAME_EVEN_SG...8, CNAME_EVEN-LOAN...11, SAAMI_CNAME_EVEN...4, SAAMI_CNAME_EVEN_SG...10, DÅERIES...17, TJIRREDS...11
Building lexicon...
Determinizing...
Minimizing...
Done! 2.5 MB. 80231 states, 160041 arcs, Cyclic.
Writing to file sma/bin/sma-lexc.hfst.
hfst-name: is not a valid transducer file

Building sma-gen.hfst

hfst-compose-intersect -v sma/bin/sma-lexc.hfst sma/bin/sma-twol.hfst | \
hfst-determinize -v | \
hfst-remove-epsilons -v | \
hfst-minimize -v -o sma/bin/sma-gen.hfst
Reading from sma/bin/sma-lexc.hfst and sma/bin/sma-twol.hfst, writing to
hfst-compose-intersect: sma/bin/sma-lexc.hfst is not a valid transducer file
Reading from , writing to
hfst-determinize: is not a valid transducer file
Reading from , writing to sma/bin/sma-gen.hfst
Reading from , writing to
hfst-remove-epsilons: is not a valid transducer file
hfst-minimize: is not a valid transducer file
make: *** [sma/bin/sma-gen.hfst] Error 1
~/main/gt$

albbas commented 12 years ago

Comment 5170

Date: 2011-09-28 16:40:06 +0200 From: Sjur Nørstebø Moshagen <>

A month has now passed, and a lot of work has gone into hfst compilation. I have no problems compiling sma, and get no errors or warnings from the LexC compilation. Can you still reproduce the problem with the latest svn of:

?

albbas commented 12 years ago

Comment 5175

Date: 2011-09-28 17:48:59 +0200 From: Trond Trosterud <>

No, not the same error msg. But it does not compile either:

~/main/gt$make GTLANG=sma hfst

(...)

Building lookup-optimized sma-norm.hfstol

hfst-determinize -v -i sma/bin/sma-norm.hfst | \
hfst-fst2fst -v -f optimized-lookup-weighted -o sma/bin/sma-norm.hfstol
Reading from , writing to sma/bin/sma-norm.hfstol
Writing Hfst's lookup optimized, weighted format transducers with HFST3 headers
Reading from sma/bin/sma-norm.hfst, writing to
Determinizing hfst-invert=(hfst-txt2fst sma/bin/sma.filtered.hfst.att)...
Converting hfst-determinize=(hfst-invert=(hfst-txt2fst sma/bin/sma.filtered.hfst.att))...
make: *** No rule to make target `common/src/Punctuation-filter.regex', needed by `sma/bin/sma.speller-filtered.hfst'. Stop.
~/main/gt$

This looks like a minor svn or Makefile omission, though.

Here is my version number:

~/main/gt$hfst-lookup -V
hfst-lookup 0.6 (hfst 3.1.1)

albbas commented 12 years ago

Comment 5186

Date: 2011-09-29 08:46:58 +0200 From: Sjur Nørstebø Moshagen <>

(In reply to comment #4)

No, not the same error msg.

So we close it. It was fixed somehow (although we don't know how, or really what the bug was).

But it does not compile either:

~/main/gt$make GTLANG=sma hfst

(...)

make: *** No rule to make target `common/src/Punctuation-filter.regex', needed by `sma/bin/sma.speller-filtered.hfst'. Stop.
~/main/gt$

This looks like a minor svn or Makefile omission, though.

Yes indeed. This was fixed as well.