dialogos-project / dialogos

The DialogOS dialog system.
https://www.dialogos.app
GNU General Public License v3.0

Pronunciation dictionary duplicated on save. #165

Closed kastein closed 5 years ago

kastein commented 5 years ago

We tried to add the words "Einkaufsliste" and "Einkaufszettel" to the pronunciation dictionary. However, each time we say one of those two words, we get this error message:

[Screenshot: null pointer error for "Einkaufszettel"]

In silent mode it works, so the problem does not seem to be in our grammar.

We wrote the entries like this:

[Screenshot: pronunciation dictionary]

When we opened the file the next time, each of the words had several identical entries in the pronunciation dictionary. It seems that every time we close the file and open it again, the entries multiply. After closing and opening it 4 times we had 8 entries for each word.

[Screenshot: duplicated entries]

Unfortunately, that does not seem to happen consistently, so we cannot reliably reproduce the error.

alexanderkoller commented 5 years ago

Could you post a minimal example dialog that allows us to check this?

@timobaumann could you look into it?

kastein commented 5 years ago

The minimal example includes one entry for "Einkaufszettel" in the pronunciation dictionary, and the speech recognizer node should only recognize "Einkaufszettel". It reproduces the error message mentioned above, and on my machine the duplication also occurred: after closing the file and opening it again, there were two entries for "Einkaufszettel".

I use DialogOS version 2.0.5.

aussprachewörterbuch.zip

timobaumann commented 5 years ago

This is related to https://github.com/dialogos-project/dialogos/issues/121. It works nicely for aI n k aU f s t s e: t @ l and fails with E+. I'll attempt a fix later.

@kastein, please note that the file you sent actually contains <entry g="Einkaufszettel" p="aI n k aU f s t s t s E+ t @ l@"/>. The final @ that is appended (without space) to l also breaks things. Notice that there is presently no checking of your g2p entries -- if you mess up, you mess up. (That's no excuse for E+ not working, obviously.)
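Since the entries are not checked, a small pre-flight check can catch typos such as the fused `l@` before a dialog is run. The following is a minimal sketch and not part of DialogOS: the function name `invalid_phonemes` is hypothetical, and `KNOWN_PHONEMES` is an assumed inventory containing only the symbols that appear in this thread, not the recognizer's real phoneme set.

```python
import xml.etree.ElementTree as ET

# Assumption: only the symbols seen in this thread. The real recognizer's
# phoneme inventory is larger and is not documented here.
KNOWN_PHONEMES = {"aI", "aU", "n", "k", "f", "s", "t", "l", "d", "r",
                  "e:", "E+", "@", "9"}


def invalid_phonemes(entry_xml):
    """Return the space-separated tokens of the p attribute that are not known phonemes."""
    entry = ET.fromstring(entry_xml)
    return [tok for tok in entry.get("p", "").split() if tok not in KNOWN_PHONEMES]


broken = '<entry g="Einkaufszettel" p="aI n k aU f s t s t s E+ t @ l@"/>'
print(invalid_phonemes(broken))  # ['l@']
```

A check like this only flags unknown symbols. It cannot notice the doubled `t s` in the entry, because each of those tokens is a valid phoneme on its own.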

@kastein: can you please check if the duplication-on-save problem persists for entries that work (e.g., drölf → d r 9 l f)? If so, that's probably a different bug.

alexanderkoller commented 5 years ago

Note also that "Einkaufszettel" is not pronounced "Einkaufs-z-zettel". :)

kastein commented 5 years ago

Thank you very much. It now works.

However, the duplication problem still seems to persist, also for drölf.

aussprachewörterbuch (3).zip

timobaumann commented 5 years ago

I've fixed the issue with +s in 2cf9e3a.

timobaumann commented 5 years ago

Duplication happens on load, not on save (but the duplicates will of course be saved later). Whenever a model is loaded, its G2P exceptions are added to the plugin's list of exceptions. There's no duplicate checking, so repeatedly loading and saving models yields many duplicates. This could be fixed by checking whether entries are already present in the list before adding them.
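The proposed check could look roughly like the sketch below. It is an illustration of the idea only, not the plugin's actual code: `add_exceptions` and the representation of an exception as a `(grapheme, phonemes)` tuple are assumptions made for this example.

```python
def add_exceptions(plugin_exceptions, model_exceptions):
    """Append a model's entries to the plugin list, skipping entries already present."""
    seen = set(plugin_exceptions)
    for entry in model_exceptions:
        if entry not in seen:
            plugin_exceptions.append(entry)
            seen.add(entry)
    return plugin_exceptions


plugin = []
model = [("drölf", "d r 9 l f")]
for _ in range(4):  # simulate loading the same model four times
    add_exceptions(plugin, model)
print(len(plugin))  # 1
```

Without the membership test, the same loop would leave four copies of the entry in `plugin`, which matches the growth described in the report.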

However, when loading multiple different models, they will "cross-contaminate" each other's G2P-exception lists. To the point that one could imagine a "virus"-model that sets the pronunciation of every word (potentially used by other models) to duck. Duck.
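One way to picture the alternative is to keep the exceptions per model rather than in one shared list. The sketch below is hypothetical: `ExceptionRegistry`, the model ids, and the "duck" transcription are made up for illustration, and nothing here reflects how DialogOS stores its data or how the issue was eventually resolved.

```python
class ExceptionRegistry:
    """Hypothetical store that keeps G2P exceptions separate for each model."""

    def __init__(self):
        self._by_model = {}  # model id -> {grapheme: phonemes}

    def load(self, model_id, entries):
        # Replace instead of append, so reloading a model cannot duplicate entries.
        self._by_model[model_id] = dict(entries)

    def lookup(self, model_id, word):
        # Only the requesting model's own exceptions are consulted.
        return self._by_model.get(model_id, {}).get(word)


registry = ExceptionRegistry()
registry.load("shopping", [("Einkaufszettel", "aI n k aU f s t s e: t @ l")])
# A "virus" model with a made-up transcription for the same word.
registry.load("virus", [("Einkaufszettel", "d a k")])

print(registry.lookup("shopping", "Einkaufszettel"))  # aI n k aU f s t s e: t @ l
```

Because each lookup is scoped to one model, the second model's entry does not change what the first model sees.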

Any thoughts?

timobaumann commented 5 years ago

This is now dealt with in #169.