-
Hello,
I'm trying to figure out what I need to do so that my numpy array can be vocoded by the UniversalVocoder.
Attached is a sample npy file.
The output is from a modified https://github.com/…
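
When debugging a question like this, it usually helps to first report what the array actually contains. Below is a minimal sketch that only uses NumPy to summarize a `.npy` file. The file name `sample.npy` and the 80 x 200 stand-in array are hypothetical and are not taken from the attached file. The layout the UniversalVocoder expects is not stated here, so the summary is meant for comparison against that project's own documentation.

```python
import os
import tempfile

import numpy as np


def describe_array(path):
    """Load a .npy file and summarize shape, dtype and value range."""
    arr = np.load(path)
    return {
        "shape": arr.shape,
        "dtype": str(arr.dtype),
        "min": float(arr.min()),
        "max": float(arr.max()),
        "has_nan": bool(np.isnan(arr).any()),
    }


# Stand-in data (assumption): 80 feature bins x 200 frames of float32 values.
demo = np.random.default_rng(0).standard_normal((80, 200)).astype(np.float32)
demo_path = os.path.join(tempfile.mkdtemp(), "sample.npy")  # hypothetical file name
np.save(demo_path, demo)

summary = describe_array(demo_path)
print(summary)
```

A mismatch in axis order (bins x frames versus frames x bins), dtype, or value range (for example log-scaled versus linear magnitudes) is a common reason a vocoder rejects or garbles its input, so those are the fields worth checking first.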
-
Hi Chris
I'm about to recommend that my students in Deep Learning for Audio and Music use Sonic Visualiser to play around with spectrogram parameters like window size and overlap to get an idea of ho…
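
As a side illustration of the two parameters mentioned above, here is a minimal NumPy-only sketch of how window size and overlap change a spectrogram's shape and resolution. It is not how Sonic Visualiser computes its display; the Hann window, the function names, and the 16 kHz / 440 Hz test tone are assumptions chosen for the example.

```python
import numpy as np


def stft_grid(n_samples, sample_rate, window_size, overlap):
    """Return (n_frames, frequency resolution in Hz, time step in seconds)."""
    hop = window_size - overlap
    n_frames = 1 + (n_samples - window_size) // hop
    return n_frames, sample_rate / window_size, hop / sample_rate


def magnitude_spectrogram(signal, window_size, overlap):
    """Hann-windowed magnitude spectrogram, shape (n_frames, window_size // 2 + 1)."""
    hop = window_size - overlap
    n_frames = 1 + (len(signal) - window_size) // hop
    window = np.hanning(window_size)
    frames = np.stack(
        [signal[i * hop : i * hop + window_size] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


sample_rate = 16000  # assumption: 16 kHz audio
t = np.arange(sample_rate) / sample_rate  # one second of samples
tone = np.sin(2 * np.pi * 440.0 * t)  # 440 Hz test tone

# Short window: fine time steps, coarse frequency bins.
short = magnitude_spectrogram(tone, window_size=256, overlap=128)
# Long window: coarse time steps, fine frequency bins.
long_ = magnitude_spectrogram(tone, window_size=1024, overlap=512)

print(short.shape, stft_grid(len(tone), sample_rate, 256, 128))
print(long_.shape, stft_grid(len(tone), sample_rate, 1024, 512))
```

A longer window narrows each frequency bin but produces fewer frames per second, while more overlap adds frames without changing the frequency resolution.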
-
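**If fine-tuning on 12 sentences gives poor results, it is usually because the dataset is too small. We recommend enlarging it, typically to 300 ~ 600 utterances; the larger and cleaner the data, the better the synthesis quality.**
The data should be free of reverberation and noise, and recorded at a moderate distance from the microphone; see the 标贝 (Databaker) data for a reference on quality.
The fine-tuned timbre depends on how similar the target speaker is to the original speaker: the more similar the two are, the closer the fine-tuned voice is to the target speaker.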
f…
-
Hello Mr. Kumar,
I noticed that you set
```
self.inputFeatDim = 429 ## IMPORTANT: HARDCODED. Change if necessary.
```
I am wondering how I can check the inputFeatDim of my dataset.
Thank…
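
For features stored as a matrix with one row per frame, the input dimension is simply the number of columns. Here is a minimal sketch of that check, assuming the features can be loaded into a NumPy array. Note that 429 equals 39 x 11, which would be consistent with 39-dimensional frames spliced with 5 context frames on each side; that reading is a guess, not something the original code confirms, and the `splice` helper below is hypothetical.

```python
import numpy as np


def feature_dim(features):
    """Per-frame feature dimension of a (num_frames, dim) matrix."""
    if features.ndim != 2:
        raise ValueError("expected a 2-D (num_frames, dim) array")
    return features.shape[1]


def splice(features, left, right):
    """Concatenate each frame with its left/right context frames (edge-padded)."""
    padded = np.pad(features, ((left, right), (0, 0)), mode="edge")
    n = features.shape[0]
    return np.hstack([padded[i : i + n] for i in range(left + right + 1)])


# Assumption: 100 frames of 39-dimensional features (stand-in data).
base = np.zeros((100, 39), dtype=np.float32)
spliced = splice(base, left=5, right=5)

print(feature_dim(base))     # 39
print(feature_dim(spliced))  # 39 * 11 = 429
```

Whatever value `feature_dim` reports for the matrix that is actually fed to the network is the value `inputFeatDim` would need to match.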
-
I got aeneas; it works in English and French on 5-minute MP3s, but when I give it a full 12-hour one (600 MB MP3, 700 KB text file), it crashes with basically no info. I did -v -l, it sets up the whol…
-
What would be the main steps for building a real-time decoder on top of EESEN?
I read in the EESEN paper that composing the tokens, lexicon and grammar speeds up decoding a great deal, and I'd li…
-
Any idea?
[0409 18:16:28 @parallel.py:193] [MultiProcessPrefetchData] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
[0409 18:16:28 @argtools.py:146] WRN "import …
-
A list of dataset files we believe are missing. Will be updated as they're reported / found. Feel free to comment to report additional ones.
- [ ] 108, Waveform Database Generator (Version 2)
V…
-
## Problem statement
I am trying to reproduce the audio feature pre-processing for a longer time-window sequence experiment, but the only available detailed instructions were from #2. However, in t…
-
**Abstract**
Computers can tell us whether we’re happy, sad, angry or any of the several emotions we feel. Computers can understand what we’re saying and answer back. How does all this magic happen? …