Open The-Mr-L opened 2 years ago
What kind of example have you tested? (maybe either TTS or VC)
Sorry, yes, TTS.
I just downloaded some of the samples and replaced the existing samples for the English female speaker with them. But this might be too naive?
Could you try the command below?
# Argument 1 => String to be written
# Argument 2 => Speaker Assets (directory)
cargo run --example tts -- "Hello Mr my yesterday" "./assets/samples/speaker_woman_english/"
Also, please note that if you replace the samples in a specific directory, the original ones should be removed.
That is what I have done, and of course I removed the original samples. The code I ran is:
pub fn speak(text: &str) -> Result<()> {
    // Load a Model
    let model_path = "./models/tts/vits";
    let speaker_encoder_path = "./models/tts/speaker_encoder.pt";
    let tts = rustts::TTS::try_default(model_path, speaker_encoder_path)?;

    // Command-line Arguments
    let speaker_emb = "./models/tts/samples/speaker_woman_english";

    // Parameters
    let text = text;
    let speaker_emb = tts.embed(&rustts::utils::audio::get_wav_files(&speaker_emb)?)?;
    let options = SynthesisOptions {
        length_scale: 1.0,
        ..Default::default()
    };

    // Forward
    let ref_wav_voc = tts.synthesis(&text, &speaker_emb, &options)?;

    // Save to .wav file
    rustts::utils::audio::save_wav(ref_wav_voc, "output-tts.wav")?;
    Ok(())
}
By the way, the model etc. is just copied from the assets folder.
Running cargo run --example tts -- "Hello Mr my yesterday" "./assets/samples/speaker_woman_english/" just gave the same male voice when using the new samples from the sample page above, and they are female, of course.
So I just tried with some samples from https://github.com/edresson/yourtts and it works, so it is probably the format of the WAV file that is not right. I don't know yet.
Thanks for the feedback! I'll post an update after experimenting in several different clean environments.
Sorry to keep you waiting so long.
In my test, the command below produced TTS output with a female voice without any problem.
cargo run --example tts -- "Hello Mr my yesterday" "./assets/samples/speaker_woman_english/"
So, unfortunately, I haven't been able to reproduce your problem. If you have some time, could you please run the commands below in order?
git clone https://github.com/ulagbulag-village/rustts.git
cd rustts
cargo run --example tts -- "Hello Mr my yesterday" "./assets/samples/speaker_woman_english/"
No problem :) Well, the example works, but as I said, if I replace the speaker samples in the example with some other female WAV samples, then I get a male voice. That said, as I mentioned in the last comment, I got it working when I used samples from https://github.com/edresson/yourtts .
But the samples from https://erogol.github.io/ddc-samples/ do not work. It might be the format, I am not sure.
We currently only support WAV files with a 16,000 Hz sample rate and 16 bits per sample on a single channel (mono).
Would you mind sending me some WAV samples to debug? It is difficult to understand the specific situation without looking at the actual samples. (Via a public download link, or e-mail me.)
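For anyone who wants to check their own samples against the format constraint above (16,000 Hz, 16 bits per sample, one channel) before placing them in the speaker directory, here is a minimal sketch that reads the `fmt ` chunk of a RIFF/WAVE header using only the Rust standard library. It is not part of the rustts API: `WavFormat`, `parse_wav_format`, `is_supported`, and `minimal_header` are hypothetical names made up for this illustration, and the thread has not confirmed that a format mismatch is the actual cause of the male voice.

```rust
/// Format fields read from the `fmt ` chunk of a RIFF/WAVE file.
/// (Hypothetical helper type, not part of rustts.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct WavFormat {
    pub channels: u16,
    pub sample_rate: u32,
    pub bits_per_sample: u16,
}

// Little-endian readers that return None instead of panicking on short input.
fn le_u16(b: &[u8], at: usize) -> Option<u16> {
    let s = b.get(at..at.checked_add(2)?)?;
    Some(u16::from_le_bytes([s[0], s[1]]))
}

fn le_u32(b: &[u8], at: usize) -> Option<u32> {
    let s = b.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

/// Walks the RIFF chunks in `bytes` and returns the fields of the `fmt ` chunk,
/// or None if the data is not a RIFF/WAVE file or has no usable `fmt ` chunk.
pub fn parse_wav_format(bytes: &[u8]) -> Option<WavFormat> {
    if bytes.get(0..4)? != b"RIFF" || bytes.get(8..12)? != b"WAVE" {
        return None;
    }
    let mut pos = 12usize;
    loop {
        let id = bytes.get(pos..pos.checked_add(4)?)?;
        let size = le_u32(bytes, pos + 4)? as usize;
        let body = pos + 8;
        if id == b"fmt " {
            if size < 16 {
                return None;
            }
            // fmt layout: format tag (2), channels (2), sample rate (4),
            // byte rate (4), block align (2), bits per sample (2).
            return Some(WavFormat {
                channels: le_u16(bytes, body + 2)?,
                sample_rate: le_u32(bytes, body + 4)?,
                bits_per_sample: le_u16(bytes, body + 14)?,
            });
        }
        // Skip this chunk; chunk bodies are padded to an even length.
        pos = body.checked_add(size)?.checked_add(size & 1)?;
    }
}

/// True if the format matches the constraint stated in this thread:
/// 16,000 Hz, 16 bits per sample, single channel.
pub fn is_supported(fmt: &WavFormat) -> bool {
    fmt.channels == 1 && fmt.sample_rate == 16_000 && fmt.bits_per_sample == 16
}

/// Builds a PCM WAV header with an empty data chunk, for demonstration only.
pub fn minimal_header(channels: u16, sample_rate: u32, bits: u16) -> Vec<u8> {
    let block_align = channels * (bits / 8);
    let byte_rate = sample_rate * u32::from(block_align);
    let mut h = Vec::with_capacity(44);
    h.extend_from_slice(b"RIFF");
    h.extend_from_slice(&36u32.to_le_bytes()); // size of everything after this field
    h.extend_from_slice(b"WAVE");
    h.extend_from_slice(b"fmt ");
    h.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    h.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    h.extend_from_slice(&channels.to_le_bytes());
    h.extend_from_slice(&sample_rate.to_le_bytes());
    h.extend_from_slice(&byte_rate.to_le_bytes());
    h.extend_from_slice(&block_align.to_le_bytes());
    h.extend_from_slice(&bits.to_le_bytes());
    h.extend_from_slice(b"data");
    h.extend_from_slice(&0u32.to_le_bytes()); // no audio samples
    h
}
```

To check a real file, read it with `std::fs::read(path)` and pass the bytes to `parse_wav_format`. A sample that fails `is_supported` would need to be converted to 16 kHz, 16-bit mono with an audio tool before use.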
Sure, all the samples I was using are at the link above :) A sample like this one: https://erogol.github.io/ddc-samples/wavs/s3.wav
Well, this might be a stupid question, but why do I get a male voice when I replace the English woman's samples with the ones from Coqui TTS - Double Decoder Consistency v2 Samples? They are just the samples downloaded from the sample page here: https://erogol.github.io/ddc-samples/
:)