marytts / marytts

MARY TTS -- an open-source, multilingual text-to-speech synthesis system written in pure java
https://marytts.github.io/

Confused while creating allophones.xml file #825

Closed neouyghur closed 6 years ago

neouyghur commented 6 years ago

Hi, I am creating an allophones file for the Uyghur language, which is very similar to Turkish, so I am adapting the Turkish allophones.xml to Uyghur. However, in the Turkish version I found that some symbols, such as @, are defined very differently from the official SAMPA document and from SAMPA for Turkish. Shouldn't we follow the definitions in the official SAMPA document? Thanks.

psibre commented 6 years ago

If Uyghur doesn't have a schwa (SAMPA @), then just leave it out of your allophones.ug.xml file.

Regarding the question why the schwa symbol appears in the Turkish allophone definition, I suspect it might have to do with the fact that the canonical SAMPA symbol 1 might have clashed with the Arpabet-style primary stress symbol 1 that was used in some MaryTTS modules at that time.

neouyghur commented 6 years ago

@psibre Actually, my language includes all of the Turkish phones plus 3 additional ones, so I am keeping the @ as it is. However, I am wondering whether I can choose the ph values myself, even if they are totally different from the SAMPA standard. Thanks.

psibre commented 6 years ago

The ph attributes in the allophone set definition XML should not be "totally different" from the SAMPA notation, which is just a (mostly) ASCII representation for IPA symbols. They are motivated by distinctive features in phonology and ultimately determine various aspects of linguistic processing (such as syllabification) and unit features for the synthesizer.
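
As a rough illustration of how the ph attributes and their features hang together, here is a minimal Python sketch. The XML fragment is hypothetical: its element and attribute names are modelled on the general layout of MaryTTS allophone files, and the feature values are illustrative, not copied from any shipped file. The RESERVED set is also an assumption, standing in for whatever stress and syllable marks a given module treats specially (including the Arpabet-style 1 mentioned earlier in this thread). The helper names load_allophones and reserved_clashes are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment modelled on the layout of MaryTTS allophone files.
# Feature values are illustrative only.
ALLOPHONES_XML = """
<allophones name="sampa" xml:lang="ug" features="vlng vheight vfront vrnd ctype cplace cvox">
  <silence ph="_"/>
  <vowel ph="a" vlng="s" vheight="3" vfront="3" vrnd="-"/>
  <vowel ph="@" vlng="s" vheight="2" vfront="2" vrnd="-"/>
  <vowel ph="1" vlng="s" vheight="1" vfront="3" vrnd="-"/>
  <consonant ph="p" ctype="s" cplace="l" cvox="-"/>
</allophones>
"""

# Assumed set of symbols reserved for stress and syllable marks.
RESERVED = {"1", "'", ",", "-"}


def load_allophones(xml_text):
    """Map each ph symbol to its element kind and phonological features."""
    root = ET.fromstring(xml_text)
    table = {}
    for elem in root:
        attrs = dict(elem.attrib)
        ph = attrs.pop("ph")
        table[ph] = {"kind": elem.tag, **attrs}
    return table


def reserved_clashes(table):
    """Return the ph symbols that collide with the assumed reserved marks."""
    return sorted(ph for ph in table if ph in RESERVED)


table = load_allophones(ALLOPHONES_XML)
print(table["@"])               # features attached to the schwa symbol
print(reserved_clashes(table))  # symbols that would need renaming
```

The point of the sketch is that each ph symbol is only a key; what downstream processing actually consumes is the feature bundle attached to it. That is why staying close to SAMPA keeps the symbols readable, while the features still have to describe the phone accurately.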