Pallas1303 / FestPB

FestPB is a project that aims to add Brazilian Portuguese support to the Festival Speech Synthesis text-to-speech software, with voice packages available for download.
MIT License

all.desc #1

Open ddavout opened 6 months ago

ddavout commented 6 months ago

https://github.com/Pallas1303/FestPB/blob/bf29e3e39b6bf5a488ea0a5aae913b735f35326b/vox_files/all.desc#L126

or

  ( R:SylStructure.parent.syl_out 0 float 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100)

Why keep this kind of list of numbers? I don't see anything like that in cmu_us_ksp_arctic, for example, and in festvox/src/prosody/build_prosody.scm they remove it with build_fix_desc_file.

Pallas1303 commented 6 months ago

Because at build time wagon gives an error such as "bad values in R:SylStructure.parent.syl_out".

ddavout commented 6 months ago

I got bad values when I asked for a feature that I don't compute: I was asking for R:SylStructure.parent.parent.gpos instead of R:SylStructure.parent.parent.pos (I am using a poslex) with my clunits voice. By the way, I don't have this feature R:SylStructure.parent.syl_out: where does your all.desc come from? If I remember correctly, I started with unitsel.desc in src/festvox/unitsel.

With my clustergen voice I had bad values in the same circumstances. I think it may be better to put an "ignore" at the end of the culprit line, but I just removed it.
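
As far as I understand it, a wagon description entry pairs a feature name with either a type keyword (float and ignore are the two that come up in this thread) or an explicit list of allowed values. The sketch below is Python rather than Festival Scheme, and `classify_desc_entry` is a hypothetical helper, not part of festvox. It only flags entries such as the syl_out line above, where a keyword and a value list are mixed.

```python
def classify_desc_entry(tokens):
    """Classify one .desc entry given as [feature_name, value, ...].

    Assumption: an entry is either a single type keyword or a list of
    enumerated values. Only the keywords seen in this thread are handled.
    """
    values = tokens[1:]
    keywords = {"float", "ignore"}
    found = [v for v in values if v in keywords]
    if len(values) == 1 and found:
        return found[0]        # e.g. (name float) or (name ignore)
    if found:
        return "mixed"         # keyword sitting among enumerated values
    return "enumerated"        # e.g. (name 0 1 2)


print(classify_desc_entry(["R:SylStructure.parent.syl_out", "0", "float", "1", "2"]))
```

An entry reported as "mixed" would be the first place to look when wagon complains about bad values.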

Pallas1303 commented 6 months ago

Yes, these values come from the file https://github.com/festvox/festvox/blob/master/src%2Funitsel%2Funitsel.desc. I modified it with new values (numbers) for better unit selection, and with my phones.

This repository of mine is experimental; there are some errors in my scripts.

ddavout commented 6 months ago

I know it too well: making a new voice is difficult. I am currently working on a French voice, and I find it interesting to see how you handle things. I miss having a forum where we could exchange ideas (instead of raising pointless issues...). I don't want to bother you in any way. Delete this message and I will understand the meaning :) Maybe I should open my own public repository!

Pallas1303 commented 6 months ago

Yes! For me the difficult part is tokenization (numbers to words), and also that I don't have a computer for this.

In the future, I have plans for a clustering voice, POS tagging and other features.

There was a forum for this, but it no longer exists. There is also another one, but navigating it is much harder.

Pallas1303 commented 6 months ago

We can talk here about our projects if you want, of course.

ddavout commented 6 months ago

Right now my tokenizer mechanism (numbers to words) is out of order, but at one time it was quite sophisticated. I was helped by the example of the voice ims_german_1.3-os. I borrowed from them the idea of pattern-matches, their function similar to pattern matching in Perl. To give an example, with

  (pattern-matches name
              "{[1-2][0-9][0-9][0-9]}{-\\|/}{[1-2][0-9][0-9][0-9]}")

I could read year spans such as 2020/2024, and I could read numbers associated with units: currency, mass, time, etc. I also liked their debug strategy with their tokendebuglevel function.
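
For readers who don't have the ims_german code at hand, here is a rough Python equivalent of the year-span pattern. It assumes the braces in pattern-matches act as capture groups and that the whole token must match; `split_year_span` is a name made up for this sketch.

```python
import re

# Two four-digit years starting with 1 or 2, joined by "-" or "/".
YEAR_SPAN = re.compile(r"([12][0-9]{3})(-|/)([12][0-9]{3})")


def split_year_span(token):
    """Return (start_year, separator, end_year), or None if the token is not a year span."""
    match = YEAR_SPAN.fullmatch(token)
    return match.groups() if match else None


print(split_year_span("2020/2024"))  # ('2020', '/', '2024')
print(split_year_span("12/2024"))    # None
```

The three groups can then be handed separately to whatever number-to-words routine the tokenizer uses.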

Pallas1303 commented 6 months ago

Interesting, how is your tokenizer?

ddavout commented 5 months ago

In bad shape. It was too ambitious... and I was refactoring it when I had an accident (physical and material, with the loss of the hard disk holding a local repository... all my fault!). I am recovering, and I hope my tokenizer soon will too. But as it is, however degraded, it is "working" (with plain language, no numbers and not too many abbreviations). Redoing it is quite tedious, and I also have a project to make a "Flite" voice. Some years ago my daughter made one from the utts of my clustergen voice, not so elaborate but serving her purposes. It will be a challenge, as I have no example of a non-English voice using a Poslex and a pre_hook function for my lexicon.

ddavout commented 5 months ago

if you don't know the Catalan project http://festcat.talp.cat, have a look! particularly upc_catalan_tokenizer but not only...

Pallas1303 commented 5 months ago

What a pity, but I'm glad that everything is fine now. Can I help with your voice?

I also have plans for a voice in Flite: an Android application using two binaries, flite and a tokenizer for my language.

Pallas1303 commented 5 months ago

> if you don't know the Catalan project http://festcat.talp.cat, have a look! particularly upc_catalan_tokenizer but not only...

https://github.com/FestCat/festival-ca/issues/7

Pallas1303 commented 5 months ago

About clustergen, I have a script for generating F0 with REAPER. This software is very good for F0 extraction, but I have not tried the script yet.

ddavout commented 5 months ago

My daughter sent me a link for you: https://github.com/RHVoice/RHVoice. From there she found the people offering a Brazilian Portuguese voice: https://louderpages.org/our-voices/

ddavout commented 5 months ago

> > if you don't know the Catalan project http://festcat.talp.cat, have a look! particularly upc_catalan_tokenizer but not only...
>
> FestCat/festival-ca#7

I am happy you found this link interesting; Sergio Oller seems very easy-going. As for the utf8 matter, I am happily using it for our voice (I can't really say why, but I did not like the grapheme stuff).

ddavout commented 5 months ago

> What a pity, but I'm glad that everything is fine now. Can I help with your voice?

You know what, right now, to be sincere, I would be so pleased if you could tell me the name of the function to clear the terminal when using festival in interactive mode. I can't remember it!

Pallas1303 commented 5 months ago

> It will be a challenge, as I have no example of a non-English voice using a Poslex and a pre_hook function for my lexicon.

Poslex and pre_hook? I've never heard of them.

Pallas1303 commented 5 months ago

About UTF-8, there is a problem: Festival does not process uppercase letters with accent marks.

Example: Á Í i o u

becomes

i o u

A small function needs to be created to convert all uppercase letters to lowercase.

Pallas1303 commented 5 months ago

> > What a pity, but I'm glad that everything is fine now. Can I help with your voice?
>
> You know what, right now, to be sincere, I would be so pleased if you could tell me the name of the function to clear the terminal when using festival in interactive mode. I can't remember it!

I don't know, sorry. I don't use Festival in interactive mode.

ddavout commented 5 months ago

> About UTF-8, there is a problem: Festival does not process uppercase letters with accent marks.
>
> Example: Í i o u
>
> becomes
>
> i o u
>
> A small function needs to be created to convert all uppercase letters to lowercase.

Yes, you need it. I use

    (define (french_downcase_string name)
      "(french_downcase_string name)
    Downcase a word and output it as a string"
      (if (not (null? name))
          (begin
            ;(debug 100 (format nil "french_downcase_string %s\n" name))
            ;; replace each accented capital explicitly, then let downcase
            ;; handle the plain ASCII letters
            (set! name (string-replace name "À" "à"))
            (set! name (string-replace name "Á" "á"))
            (set! name (string-replace name "Â" "â"))
            (set! name (string-replace name "Ä" "ä"))
            (set! name (string-replace name "Å" "å"))
            (set! name (string-replace name "Æ" "æ")) ; or ae
            (set! name (string-replace name "Ç" "ç"))
            (set! name (string-replace name "È" "è"))
            (set! name (string-replace name "É" "é"))
            (set! name (string-replace name "Ê" "ê"))
            (set! name (string-replace name "Ë" "ë"))
            (set! name (string-replace name "Ì" "ì"))
            (set! name (string-replace name "Í" "í"))
            (set! name (string-replace name "Î" "î"))
            (set! name (string-replace name "Ï" "ï"))
            (set! name (string-replace name "Œ" "œ")) ; or oe
            (set! name (string-replace name "Ò" "ò"))
            (set! name (string-replace name "Ó" "ó"))
            (set! name (string-replace name "Ô" "ô"))
            (set! name (string-replace name "Ö" "ö"))
            (set! name (string-replace name "Ù" "ù"))
            (set! name (string-replace name "Ú" "ú"))
            (set! name (string-replace name "Û" "û"))
            (set! name (string-replace name "Ü" "ü"))
            ; (debug 100 (format nil "until now %s\n" name))
            (set! name (downcase name)))
          ""))

I always forget the names of the Festival functions dealing with utf8: utf8chr, utf8explode, utf8ord. You will need them, for example, to amend the lts script.
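
To make the idea easier to compare across tools, here is the same table-driven approach as a small Python sketch. It is only an illustration: `downcase_word` and the table names are made up here, and in Python `str.lower()` alone already handles these letters, so the explicit table is there purely to mirror the Scheme function.

```python
# Accented capitals and their lowercase forms, in the same order as the Scheme code.
UPPER = "ÀÁÂÄÅÆÇÈÉÊËÌÍÎÏŒÒÓÔÖÙÚÛÜ"
LOWER = "àáâäåæçèéêëìíîïœòóôöùúûü"
ACCENT_TABLE = str.maketrans(UPPER, LOWER)


def downcase_word(word):
    """Map accented capitals through the table, then lowercase the rest."""
    if not word:
        return ""
    return word.translate(ACCENT_TABLE).lower()


print(downcase_word("ÉCOLE"))  # école
```

Working on decoded characters rather than raw bytes is what keeps each accented letter in one piece.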

ddavout commented 5 months ago

> About clustergen, I have a script for generating F0 with REAPER. This software is very good for F0 extraction, but I have not tried the script yet.

I started with the vanilla tools (with or without SPTK), and I am quite satisfied with the script make_f0_pm (even without the sophistication it allows). It could certainly be improved, but it is not one of my priorities. To improve the intonation, work on phrasing would probably be necessary... And the talent used in the database I use (SIWIS) does not help: interrogations and exclamations are not marked (it is difficult in such short prompts). One day, maybe, I will use the prompts with emphasis. Anyway, I did not find any digestible documentation on how they arrived at festival/tobi.scm, "A CART tree for predicting ToBI accents (learned from f2b) punctuation and minimal pos".

ddavout commented 5 months ago

> > It will be a challenge, as I have no example of a non-English voice using a Poslex and a pre_hook function for my lexicon.
>
> Poslex and pre_hook? I've never heard of them.

For Poslex, have a look at the Festcat project! To put it simply, it helps to guess statistically the grammar of your sentence (at least that's how I understand and use it :). I use it a lot, and I am not sure I would be able to make a Flite voice that takes advantage of it.

Pallas1303 commented 5 months ago

> > About UTF-8, there is a problem: Festival does not process uppercase letters with accent marks. A small function needs to be created to convert all uppercase letters to lowercase.
>
> Yes, you need it. I use (define (french_downcase_string name) ....
>
> I always forget the names of the Festival functions dealing with utf8: utf8chr, utf8explode, utf8ord. You will need them, for example, to amend the lts script.

Very good. Can I use this in my project, crediting you by name?

Pallas1303 commented 5 months ago

> > About clustergen, I have a script for generating F0 with REAPER. This software is very good for F0 extraction, but I have not tried the script yet.
>
> I started with the vanilla tools (with or without SPTK), and I am quite satisfied with the script make_f0_pm (even without the sophistication it allows). It could certainly be improved, but it is not one of my priorities. To improve the intonation, work on phrasing would probably be necessary... And the talent used in the database I use (SIWIS) does not help: interrogations and exclamations are not marked (it is difficult in such short prompts). One day, maybe, I will use the prompts with emphasis. Anyway, I did not find any digestible documentation on how they arrived at festival/tobi.scm, "A CART tree for predicting ToBI accents (learned from f2b) punctuation and minimal pos".

About phrasing: https://github.com/festvox/festvox/tree/master/src%2Fphrasyn has a method for this, but it needs a tagset/POS tagging.

And for intonation there is https://github.com/festvox/festvox/tree/master/src%2Fspamf0, but I have not tried it.

Pallas1303 commented 5 months ago

> > > It will be a challenge, as I have no example of a non-English voice using a Poslex and a pre_hook function for my lexicon.
> >
> > Poslex and pre_hook? I've never heard of them.
>
> For Poslex, have a look at the Festcat project! To put it simply, it helps to guess statistically the grammar of your sentence (at least that's how I understand and use it :). I use it a lot, and I am not sure I would be able to make a Flite voice that takes advantage of it.

I understand. But I don't know how this helps in synthesis.

Pallas1303 commented 4 months ago

Hello! @ddavout. How are you?

ddavout commented 3 months ago

Already 3 weeks! I was very busy. I've decided not to wait until I have a clean project... I made 2 repos on GitHub. One is still too dirty, so I keep it private: this is the VOX part. The other one, the LANG part, is taking shape, but it is still very franco-français. In the LANG part I use what I call a model voice, so it is possible to generate a working French voice, a Clustergen one. I am using very few prompts for this (150!) and I did not choose them very carefully, so we can hear crispy noise depending on the utterance. I will change that; I need to test the generated waves, as is done in a vanilla festvox clustergen. I talk, I talk, not even asking how you are doing!

ddavout commented 3 months ago

As for French, I managed to build a robust LTS on a roughly ten-year-old computer with 2 cores and 8 GB of memory. My alphabet

(set! alphabet (list "_"
"a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" "à" "á" "â" "â" "ä" "å" "æ" "é" "ë" "è" "ê" "ì" "í" "î" "ï" "ò" "ô" "ö" "ú" "ü" "ù" "ç"))

requires utf8 encoding, which is hardly compatible with Festival's lib/lts.scm. My repository is not of professional quality, but https://ddavout.github.io/FESTIfr/#Construction%20de%20INST_LANG_lts_rules.scm, once translated into your language, may perhaps inspire you.

Pallas1303 commented 3 months ago

> As for French, I managed to build a robust LTS on a roughly ten-year-old computer with 2 cores and 8 GB of memory. My alphabet
>
> (set! alphabet (list "_"
> "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z" "à" "á" "â" "â" "ä" "å" "æ" "é" "ë" "è" "ê" "ì" "í" "î" "ï" "ò" "ô" "ö" "ú" "ü" "ù" "ç"))
>
> requires utf8 encoding, which is hardly compatible with Festival's lib/lts.scm. My repository is not of professional quality, but https://ddavout.github.io/FESTIfr/#Construction%20de%20INST_LANG_lts_rules.scm, once translated into your language, may perhaps inspire you.

Hi, I'm glad you are writing in my native language. I had seen your repository. Although it is a bit confusing for me, I liked it a lot.

About the LTS: the one I use was trained on 68,000 words, which are names. Even so, I did not find the results good. Today (18/05) I managed to modify my LTS function to use an external G2P (Grapheme to Phonemes) that is accessible from the command line. I had problems running the command line because words with accent marks had the wrong encoding.

I have a laptop with 4 GB of RAM and 2 cores. I am working on the project on it.

I listened to the sample in your repository; it sounds "good" for such an early stage. Have you tried looking at the build_cg_rfs_voice scripts? They have methods that try to improve the clustering voice.

Pallas1303 commented 3 months ago

> Already 3 weeks! I was very busy. I've decided not to wait until I have a clean project... I made 2 repos on GitHub. One is still too dirty, so I keep it private: this is the VOX part. The other one, the LANG part, is taking shape, but it is still very franco-français. In the LANG part I use what I call a model voice, so it is possible to generate a working French voice, a Clustergen one. I am using very few prompts for this (150!) and I did not choose them very carefully, so we can hear crispy noise depending on the utterance. I will change that; I need to test the generated waves, as is done in a vanilla festvox clustergen. I talk, I talk, not even asking how you are doing!

For 150 prompts it is very good. I can provide an F0 extraction script that uses REAPER. I have not tried it on a real clustering voice yet.

ddavout commented 3 months ago

In your LTS script for graphemes, do you use the wordexplode function? See my INST_LANG_lts.scm, which is UTF-8 compatible. .... I just looked at your repository: you use build_lts from FESTVOX. From _cummulate_pair onwards, you lose the accented letters. See my Tiddly here. I did not rewrite the script, but that would be possible. I even proposed it as a good first issue :). In particular, I changed lts_build.scm.

Personally, I use fewer than 50,000 entries, and not all of them are unique: to influence wagon, I did not hesitate to repeat some.

Pallas1303 commented 3 months ago

> In your LTS script for graphemes, do you use the wordexplode function? See my INST_LANG_lts.scm, which is UTF-8 compatible. .... I just looked at your repository: you use build_lts from FESTVOX. From _cummulate_pair onwards, you lose the accented letters. See my Tiddly here. I did not rewrite the script, but that would be possible. I even proposed it as a good first issue :). In particular, I changed lts_build.scm.
>
> Personally, I use fewer than 50,000 entries, and not all of them are unique: to influence wagon, I did not hesitate to repeat some.

Accented letters work normally here. Even with my LTS model, a dictionary is still needed.

Pallas1303 commented 3 months ago

@ddavout I switched from the Unit Selection method (Clunits) to Clustering (HMM), which offers greater stability in synthesis.

Samples: https://drive.google.com/drive/folders/13L8zVF2gsYzaC38jK3UyH2thrFpO23or?usp=drive_link I used all 1000 audio files of the voice whose Clunits version I have.

Oh, about the issue discussed above, the failure of the (downcase) function to turn uppercase words into lowercase: I added some code to my function (festpb_pt_lts_function word features). I used a `tr` command to convert uppercase to lowercase.

```scheme
(define (festpb_pt_lts_function word features)
  "(festpb_pt_lts_function WORD FEATURES)
Return pronunciation of word not in lexicon."
  (let (tmpword (print word) (phones) (syls) (aphones))
    ;(set! phones (lts_predict (utf8explode dword) festpb_pt_lts_rules))
    (set! tmpfile (make_tmp_filename))
    (format t "%s\n" word) ;; Debug
    ;; lowercase the word with an external tr call, writing the result to tmpfile
    (set! tr_word (format nil "echo %s | tr [:upper:] [:lower:] > %s" word tmpfile))
    (system tr_word)

    ;; read the lowercased word back from the temporary file
    (let ((fd (fopen tmpfile "r")))
      (set! dword (readfp fd))
      (fclose fd))
    (format t "%s\n" dword) ;; Debug
    (delete-file tmpfile)
    ;; ...
```

I only added the tmpword variable because I could not use the word variable directly in the function.

ddavout commented 3 months ago

> I used a tr command to convert uppercase to lowercase.

It doesn't work!

    echo "Çà" | tr [:upper:] [:lower:] > ici_çà; grep 'Ç' ici_çà

https://github.com/ddavout/FESTIfr/discussions/19 (I'm trying to improve my git skills :)
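
The behaviour above is what one would expect from a tr that works byte by byte (GNU tr does, to my knowledge): "Ç" is two bytes in UTF-8, and neither byte is an ASCII capital. The Python sketch below reproduces the effect under that assumption; both function names are made up for the illustration.

```python
def ascii_only_lower(raw: bytes) -> bytes:
    """Lowercase only ASCII letters, roughly what a byte-oriented tool can do."""
    return raw.lower()  # bytes.lower() leaves every non-ASCII byte untouched


def unicode_lower(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Decode first, lowercase whole characters, then encode again."""
    return raw.decode(encoding).lower().encode(encoding)


word = "Çà".encode("utf-8")
print(ascii_only_lower(word) == word)        # True: the Ç survives
print(unicode_lower(word).decode("utf-8"))   # çà
```

So any replacement for tr has to decode the bytes into characters before changing case.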

Pallas1303 commented 3 months ago

> > I used a tr command to convert uppercase to lowercase.
>
> It doesn't work!
>
>     echo "Çà" | tr [:upper:] [:lower:] > ici_çà; grep 'Ç' ici_çà
>
> https://github.com/ddavout/FESTIfr/discussions/19 (I'm trying to improve my git skills :)

I commented in your discussion about a possible solution.

ddavout commented 3 months ago

> > About UTF-8, there is a problem: Festival does not process uppercase letters with accent marks. Example: Í i o u becomes i o u. A small function needs to be created to convert all uppercase letters to lowercase.
>
> Yes, you need it. I use
>
>     (define (french_downcase_string name)
>       "(french_downcase_string name)
>     Downcase a word and output it as a string"
>       (if (not (null? name))
>           (begin
>             ;(debug 100 (format nil "french_downcase_string %s\n" name))
>             (set! name (string-replace name "À" "à"))
>     ....
>             (set! name (string-replace name "Û" "û"))
>             (set! name (string-replace name "Ü" "ü"))
>             ; (debug 100 (format nil "until now %s\n" name))
>             (set! name (downcase name)))
>           ""))

Hi Pallas,

In fact, the string-replace from freebsoft-utils that I use is not fully utf8-compatible :-1:

"ù" is not replaced properly in (string-replace name "ù" "u"). UTF-8 bytes shown as Latin-1 characters is what you typically see when you display a UTF-8 file with a terminal or editor that only knows about 8-bit characters.
For "ù" it's "Ã¹" (UTF-8 Tool); my token can see the character "¹".
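
The two-character view of "ù" can be reproduced in a couple of lines of Python. This only illustrates the encoding mismatch, not what string-replace does internally; `as_latin1_view` is a name invented for the sketch.

```python
def as_latin1_view(text: str) -> str:
    """Show how UTF-8 encoded text looks when each byte is read as a Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


print("ù".encode("utf-8"))   # b'\xc3\xb9'
print(as_latin1_view("ù"))   # Ã¹
```

A routine that walks the string one byte at a time sees those two bytes separately, which would explain a stray "¹" showing up in a token.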

Pallas13 commented 1 month ago

Hello, I lost access to my GitHub account :(. I had to create a new account. I am having some problems, so it will take a while before I can get back to being active in the project.