cmusphinx / sphinxbase


JSGF/FSG model build/load log weight incompatibility #79

Open lv-gh opened 4 years ago

lv-gh commented 4 years ago

test.gram:

#JSGF V1.0 UTF-8;

grammar test;
public <test> = hello /10/ world;

JSGF weights > 1.0 are accepted by jsgf_build_fsg_internal() (and I suppose they are perfectly valid), and such a grammar can be used for decoding (-jsgf) and/or exported to FSG:

sphinx_jsgf2fsg -jsgf test.gram -fsg test.fsg

FSG_BEGIN
NUM_STATES 3
START_STATE 0
FINAL_STATE 2
TRANSITION 0 1 1.000000 hello
TRANSITION 1 2 9.999998 world
FSG_END

But the resulting FSG can't be loaded:

ERROR: "fsg_model.c", line 667: Line[6]: transition spec malformed; Expecting float as transition probability

because fsg_model_read() requires all log weights/probabilities to be <= 1.0:
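To see in advance which transitions would trip this limit, a small pre-check can scan the FSG text. This is only a minimal sketch and not sphinxbase code: the function name find_out_of_range_transitions is hypothetical, and it assumes the whitespace-separated TRANSITION layout shown in the FSG above, with the weight as the fourth field. It only tests the upper bound (> 1.0) described in this issue.

```python
def find_out_of_range_transitions(fsg_text):
    """Return (line_number, weight) pairs for TRANSITION lines with weight > 1.0.

    Hypothetical helper: assumes the layout
    'TRANSITION <from> <to> <weight> [word]' seen in the example FSG.
    """
    offending = []
    for lineno, line in enumerate(fsg_text.splitlines(), start=1):
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a full transition spec.
        if len(fields) < 4 or fields[0] != "TRANSITION":
            continue
        weight = float(fields[3])
        if weight > 1.0:
            offending.append((lineno, weight))
    return offending


# The FSG exported from test.gram, as shown in this issue.
EXAMPLE_FSG = """\
FSG_BEGIN
NUM_STATES 3
START_STATE 0
FINAL_STATE 2
TRANSITION 0 1 1.000000 hello
TRANSITION 1 2 9.999998 world
FSG_END
"""

if __name__ == "__main__":
    for lineno, weight in find_out_of_range_transitions(EXAMPLE_FSG):
        print(f"Line[{lineno}]: weight {weight} > 1.0")
```

For the example FSG this flags line 6, the same line number reported in the error message.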

nshmyrev commented 4 years ago

Thanks for reporting. I would consider it a minor issue.

lv-gh commented 4 years ago

Well, it is a minor issue, but without modifying the source code I can't use Sphinx in this scenario: JSGF -> FSM -> OpenFst minimization (log arcs/weights) -> fsm2fsg.

nshmyrev commented 4 years ago

Don't use JSGF; just build language models. For details, see also:

https://github.com/alphacep/vosk-api/issues/55

lv-gh commented 4 years ago

Thanks. Experiments show that FSG works better. Besides, it's a specific domain (dictation of valid/known addresses), where there is not much freedom to phrase an address differently, so statistical language models would bring little benefit.