-
I have another problem: I'm trying to match tacotron2 (https://github.com/begeekmyfriend/tacotron2), but the generated audio contains only noise.
The TTS params already match DiffWave; I found th…
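Pure-noise output from a vocoder usually means the mel parameters of the acoustic model and of DiffWave still disagree somewhere (sample rate, hop length, n_mels, fmin/fmax, or the log/normalization convention). A tiny hypothetical helper for diffing the two configs — the dict keys and values below are illustrative, not the actual field names in either repo:

```python
def find_mismatches(tts_params, vocoder_params):
    """Return {key: (tts_value, vocoder_value)} for every shared key
    whose values differ between the two configs."""
    shared = set(tts_params) & set(vocoder_params)
    return {k: (tts_params[k], vocoder_params[k])
            for k in sorted(shared)
            if tts_params[k] != vocoder_params[k]}

# Illustrative values only -- read the real ones from each repo's params file.
tts = {"sample_rate": 22050, "hop_length": 275, "n_mels": 80}
vocoder = {"sample_rate": 22050, "hop_length": 256, "n_mels": 80}
print(find_mismatches(tts, vocoder))  # {'hop_length': (275, 256)}
```

Note that the mel scale convention (HTK vs. Slaney) and the amplitude scaling are easy to miss because they rarely appear as a named parameter in both repos.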
-
Hi.
Thank you for your great work.
I believe there is a better way to compare versions of torchaudio. Instead of comparing strings, we can use a package like [packaging](https://pypi.org/proje…
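For example, a sketch assuming `packaging` is available (it usually is, as a dependency of pip/setuptools); the `0.7.0` threshold below is just an illustrative cutoff, not a real torchaudio API boundary:

```python
from packaging.version import parse

# Plain string comparison gets multi-digit components wrong:
assert "0.10.0" < "0.9.0"            # lexicographic order -- misleading
# packaging compares release segments numerically:
assert parse("0.10.0") > parse("0.9.0")

def has_new_api(version_string):
    """Gate a code path on a minimum version (threshold is illustrative)."""
    return parse(version_string) >= parse("0.7.0")

print(has_new_api("0.10.2"))  # True
```

In practice one would pass `torchaudio.__version__` into such a check instead of a literal string.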
-
I followed the paper's settings, using a batch size of 16 and an audio length of 16000, but one GPU is enough for me to train the model. Why does the original paper use 8 GPUs?
-
Selected Weibo content
-
Hi, I'm trying to replicate your results for applying SaShiMi in a diffusion context, and have run into some questions about implementation details along the way. It'd be awesome if you could help me …
-
I'm using a fork of https://github.com/Tomiinek/Multilingual_Text_to_Speech as the project https://github.com/CherokeeLanguage/Cherokee-TTS.
The TTS project I'm using shows the following audio para…
-
I'm doing music-related research, and the mel-spectrogram doesn't seem to be the best data representation for the task I'm working on, so I'm considering switching to CQT.
I trained DiffWave on music …
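For reference, a single-frame constant-Q transform can be sketched in plain NumPy (hop-based framing and the usual FFT-kernel speedups are omitted, and the `fmin`/`bins_per_octave` defaults are illustrative — in practice `librosa.cqt` is the easier route for a conditioner):

```python
import numpy as np

def naive_cqt_frame(y, sr, fmin=32.70, n_bins=48, bins_per_octave=12):
    """Constant-Q magnitudes for one frame starting at y[0].

    Bin k has center frequency fmin * 2**(k / bins_per_octave), and its
    window length shrinks as frequency grows so that Q = f / bandwidth
    stays constant. y must be at least Q * sr / fmin samples long.
    """
    Q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    mags = np.empty(n_bins)
    for k in range(n_bins):
        fk = fmin * 2.0 ** (k / bins_per_octave)
        Nk = int(np.ceil(Q * sr / fk))  # longer windows for low bins
        n = np.arange(Nk)
        kernel = np.hanning(Nk) * np.exp(-2j * np.pi * fk * n / sr) / Nk
        mags[k] = np.abs(np.dot(y[:Nk], kernel))
    return mags

sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 130.81 * t)               # C3, two octaves above fmin (C1)
print(int(np.argmax(naive_cqt_frame(y, sr))))    # 24
```

The per-bin `1/Nk` normalization keeps magnitudes comparable across bins despite the varying window lengths, which matters if the result feeds a network in place of a mel-spectrogram.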
-
Hey, I wanted to give S4D a quick try in my research as a drop-in replacement for S4 (which, as far as I gathered, should be a good way to start), but I'm running into some hard memory limitations. I'm…
-
Hello,
I'm considering trying your Sashimi model as the backbone of a diffusion model for audio generation. There is a detail I couldn't find in the paper, nor in the code (maybe I didn't look …
-
Hi. I noticed that the paper "DiffWave: a versatile diffusion model for speech synthesis" has the wrong author list and conference information. Also, there is no paper titled "DiffWave with Continuous-tim…