Open paulovasconcellos-hotmart opened 7 months ago
Hi @paulovasconcellos-hotmart
The symbol table is essentially the set of all unique phones in your dataset.
You can take a look at how to create unique_text_tokens.k2symbols in my WIP branch for training a Text 2 Semantic model for the Ukrainian language.
Good luck with your experiments!
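To make the idea above concrete, here is a minimal sketch of building a symbol table as the set of unique phones in a dataset. It is not the code from the branch mentioned above: the function names, the toy phone sequences, and the `<eps>` entry at id 0 are all assumptions for illustration, and the one-"symbol id"-pair-per-line layout should be checked against the unique_text_tokens.k2symbols file that ships with the repository.

```python
from pathlib import Path
from typing import Dict, Iterable

# Assumption: id 0 is reserved for an epsilon symbol. Check the existing
# .k2symbols file for the special tokens the pipeline actually expects.
SPECIAL_TOKENS = ["<eps>"]


def build_symbol_table(utterances: Iterable[Iterable[str]]) -> Dict[str, int]:
    """Map every unique phone seen in the dataset to an integer id."""
    unique = set()
    for phones in utterances:
        unique.update(phones)
    # Sort so that ids are reproducible between runs.
    symbols = SPECIAL_TOKENS + sorted(unique - set(SPECIAL_TOKENS))
    return {sym: idx for idx, sym in enumerate(symbols)}


def write_symbol_table(table: Dict[str, int], path: str) -> None:
    """Write one 'symbol id' pair per line (assumed file layout)."""
    ordered = sorted(table.items(), key=lambda item: item[1])
    text = "\n".join(f"{sym} {idx}" for sym, idx in ordered) + "\n"
    Path(path).write_text(text, encoding="utf-8")


# Hypothetical phonemizer output: one list of phones per utterance.
utterances = [
    ["o", "l", "a"],
    ["n", "ɐ̃", "w̃"],
    ["k", "a", "f", "ɛ"],
]
table = build_symbol_table(utterances)
```

Calling `write_symbol_table(table, "unique_text_tokens.k2symbols")` would then produce a file covering every phone in the data, including nasalised or accented ones, so they are no longer treated as unknown tokens and dropped.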
Hello everyone, I've noticed that unknown tokens are removed throughout the pipeline, and that unique_text_tokens.k2symbols doesn't contain all the phonemes needed for non-English languages, such as those with accents and other diacritics. I'm trying to train pheme in Portuguese, and I was wondering what I should do so the model can handle the accents of my language. Any tips on how to do it?
P.S.: I've also changed the phonemizer backend so it can generate phonemes in PT-BR. espeak is available in PT-BR, so it was a no-brainer.