thunderdrop / IBMTTSDictionaries

A large, community-driven pronunciation dictionary for the IBMTTS speech synthesizer in American English
Creative Commons Zero v1.0 Universal
23 stars, 10 forks

Wildcard and matching in the .dic files #45

Open Finnboy94 opened 1 year ago

Finnboy94 commented 1 year ago

Hello, I have started creating a community dictionary for Finnish locally. Is it possible to use prefix matching in the .dic files? For example, for every word starting with "salasan", IBMTTS would apply whatever I define for the "salasan" part and then append the rest of the word.

amirsol81 commented 1 year ago

@Finnboy94 First and foremost, good to hear that you've started such a project! As for your question, I'd suggest testing it. Since I don't know Finnish, I can't tell whether it does what you want. However, you can test it by creating an entry in your Finnish Root file and checking whether that entry also affects the word's derived/inflected forms. This is also what we do for US English.

Finnboy94 commented 1 year ago

Hello, hmm, well, I have an entry like this: salasana ^[.0sala.1sana] and that works. What I'd like, though, is a single entry that matches every word beginning with, say, salasan, replaces that part, and then adds the rest of the word. With a regular expression I could write something like: ^salasan ^[.0sala.1san] but that doesn't work, at least not in this form. Weird...
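
To make the requested behavior concrete, here is a rough Python sketch of "match the prefix, replace it, append the rest". It does not show anything IBMTTS itself does. It pre-generates one explicit entry per word from a word list, as a possible workaround if the .dic format has no wildcard support. The names PREFIX_RULES, expand_prefix and expand_word_list are made up for illustration. The sketch also assumes that the remaining letters of the word can be copied unchanged into the bracketed annotation, and that key and value are separated by a single space as shown in this thread; both assumptions would need checking against real dictionary files.

```python
import re

# Hypothetical rule table: word prefix -> annotation start (closing bracket omitted).
# The annotation text is copied from the entry quoted in this thread.
PREFIX_RULES = {"salasan": "^[.0sala.1san"}


def expand_prefix(word, rules=PREFIX_RULES, sep=" "):
    """Return a dictionary line for `word` if a prefix rule matches, else None.

    `sep` is the key/value separator; a space is used here because that is
    how entries appear in this thread (an assumption about the file format).
    """
    for prefix, annotation_start in rules.items():
        match = re.match(re.escape(prefix) + r"(.*)$", word)
        if match:
            rest = match.group(1)
            # Assumption: the leftover letters are valid inside the annotation as-is.
            return f"{word}{sep}{annotation_start}{rest}]"
    return None


def expand_word_list(words, rules=PREFIX_RULES, sep=" "):
    """Build explicit entries for every word in `words` that matches a rule."""
    lines = (expand_prefix(word, rules, sep) for word in words)
    return [line for line in lines if line is not None]


if __name__ == "__main__":
    for line in expand_word_list(["salasana", "salasanalla", "talo"]):
        print(line)
```

For "salasana" this reproduces the entry quoted above. Whether the generated lines for longer inflected forms actually sound right would still have to be checked by ear.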