speechbrain / benchmarks

This repository contains the SpeechBrain Benchmarks
Apache License 2.0

Tokotron: Tokenized TTS for the SpeechBrain benchmark (single speaker) #37

Closed: flexthink closed this 1 month ago

poonehmousavi commented 2 months ago

Thanks @flexthink for this PR. I have some comments regarding formatting:

  1. All recipes should follow the same structure: dataset/task/architecture/. Please follow the same format as in the discriminative tasks in the DASB branch, here.
  2. Generative tasks should use the same script as the discriminative tasks. Please make sure everything is consistent and compatible with the instructions in the README file: for example, the names of the .py and YAML files and the hparams used (e.g., data_folder and output_folder). The tokenizer should be named "codec", and the same attention layer as in custom_model should be used.
  3. The kmeans repository has been transferred to speechbrain/SSL_Quantization; please update it in your recipe.
  4. I also put some comments in the code.
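
For illustration only, points 1 and 2 could be checked with a small script like the sketch below. It is not part of the benchmark code: the helper names `split_recipe_path` and `missing_hparams`, and the example path `LJSpeech/TTS/tokotron`, are hypothetical. The only things taken from the comments above are the dataset/task/architecture layout and the hparams names data_folder and output_folder.

```python
from pathlib import PurePosixPath

# hparams keys named in the review comment; any further keys are up to the recipe.
REQUIRED_HPARAMS = ("data_folder", "output_folder")


def split_recipe_path(path: str) -> dict:
    """Split a recipe path into its dataset/task/architecture components.

    Hypothetical helper: it only checks that at least three path
    components are present and labels the first three.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError(
            f"expected dataset/task/architecture, got {path!r}"
        )
    dataset, task, architecture = parts[:3]
    return {"dataset": dataset, "task": task, "architecture": architecture}


def missing_hparams(hparams: dict) -> list:
    """Return the required hparams keys that are absent from `hparams`."""
    return [key for key in REQUIRED_HPARAMS if key not in hparams]


if __name__ == "__main__":
    # Hypothetical recipe path, used only to exercise the helper.
    print(split_recipe_path("LJSpeech/TTS/tokotron"))
    print(missing_hparams({"data_folder": "/path/to/data"}))
```

A check of this kind could run before a recipe is reviewed, so that layout and naming mismatches surface early.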
poonehmousavi commented 2 months ago

@flexthink any update on this PR?

poonehmousavi commented 1 month ago

Thanks @flexthink, everything looks really neat. I have done the review and fetched the latest update from the DASB main branch. These are the few comments that I have before merging the PR:

  1. There are some pre-commit errors, mostly related to unused libraries.
  2. The utils and custom_model should be moved to the TTS folder. In particular, custom_model should be renamed, or the link to the main custom_model should be avoided, because the wrappers are not used for other tasks and this could be confusing. We could add your changes to all tasks later, but for now let's keep them only for TTS.
  3. I have added run_generative_benchmark.sh for SE and SS; please complete it for TTS and make sure that it works.
poonehmousavi commented 1 month ago

Thanks @flexthink for this PR. Everything works now. It is merged.