CSTR-Edinburgh / magphase

MagPhase Vocoder: Speech analysis/synthesis system for TTS and related applications.
Apache License 2.0

Adding magphase to Merlin configuration.py, output dims? #1

Open dreamk73 opened 6 years ago

dreamk73 commented 6 years ago

The script that extracts features for magphase says it typically extracts 60 mag, 45 real, and 45 imag features. I am using 48 kHz audio, just like in the script, so are those numbers correct? Are delta or delta-delta features extracted as well? What should I put in configuration.py as the output dimension for these features?

felipeespic commented 6 years ago

Hi, you need to add the deltas and delta-deltas for each feature. So:

lf0: 1
dlf0: 3
mag: 60
dmag: 180
real: 45
dreal: 135
imag: 45
dimag: 135

By the way, the complete MagPhase-Merlin integration is coming soon. For now, you can follow the instructions in the MagPhase repo, and that should work.
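The dimensions listed above follow one pattern: each "d"-prefixed entry is three times the static size, which covers the static, delta, and delta-delta coefficients. A minimal sketch of that arithmetic follows. The helper name `with_deltas` and the dictionary layout are illustrative only and are not part of Merlin or MagPhase.

```python
# Static feature sizes quoted in this thread for 48 kHz audio.
STATIC_DIMS = {"lf0": 1, "mag": 60, "real": 45, "imag": 45}


def with_deltas(static_dims, n_windows=3):
    """Return static sizes plus 'd'-prefixed sizes for each stream.

    n_windows=3 stands for static + delta + delta-delta.
    (Hypothetical helper, written only to show the arithmetic.)
    """
    dims = {}
    for name, dim in static_dims.items():
        dims[name] = dim
        dims["d" + name] = dim * n_windows
    return dims


dims = with_deltas(STATIC_DIMS)
for name in ("lf0", "dlf0", "mag", "dmag", "real", "dreal", "imag", "dimag"):
    print("{}: {}".format(name, dims[name]))

# Sum of the delta-augmented sizes over the four streams in this thread.
total_with_deltas = sum(3 * dim for dim in STATIC_DIMS.values())
print("total with deltas: {}".format(total_with_deltas))
```

If the extraction settings change the static sizes, only `STATIC_DIMS` needs updating. Whether a given configuration needs further streams beyond these four is not covered in this thread.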

dreamk73 commented 6 years ago

Thanks. I appreciate you integrating it with Merlin. But I would like to work on trying it out now, if possible. I have worked on integrating other vocoders before, and it is doable.

Thanks to your info, I am now able to train acoustic models. But when I try to generate the waveforms with the script provided in demos/demo_run_for_merlin, I get this error:

ValueError: Dimension provided not compatible with file size.

I use 48000 Hz data, and the dimensions of all the features are set as you said. Could it be the framelength feature? Or something else?

felipeespic commented 6 years ago

Hi, which script and line is throwing the error?

dreamk73 commented 6 years ago

When trying to synthesize by running 2_batch_wave_generation.py. Here is the full trace:

Traceback (most recent call last):
  File "2_batch_wave_generation.py", line 70, in
    lu.run_multithreaded(synthesis, in_feats_dir, l_file_tokns, out_syn_dir, nbins_mel, nbins_phase, mvf, fs, fft_len, b_postfilter)
  File "/home/esther/merlin/tools/bin/magphase/src/libutils.py", line 61, in run_multithreaded
    results = pool.map(func_wrapper, l_iterable_args)
  File "/usr/local/anaconda/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/local/anaconda/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
ValueError: Dimension provided not compatible with file size.

Generating wavefile: sn008_sent152................................
Generating wavefile: sn008_sent156................................
Generating wavefile: sn008_sent154................................
Generating wavefile: sn008_sent158................................
Generating wavefile: sn008_sent160................................
Generating wavefile: sn008_sent162................................
Generating wavefile: sn008_sent164................................
Generating wavefile: sn008_sent166................................
Generating wavefile: sn008_sent168................................
Generating wavefile: sn008_sent170................................

felipeespic commented 6 years ago

It seems that the files that you generated (.mag, .real, or .imag) have a wrong dimension. If you do not know how to check that, you can send some samples to me, so I can check it.
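One way to check a generated file's dimension is to test whether its size is a whole number of frames. The sketch below assumes the feature files are raw, headerless float32 binaries. That is an assumption, not something confirmed in this thread. The function name `check_feature_file` is hypothetical and is not part of MagPhase or Merlin.

```python
import os
import tempfile

import numpy as np


def check_feature_file(path, dim, dtype=np.float32):
    """Return the number of frames in a raw binary feature file.

    Raises ValueError if the file does not hold a whole number of
    frames of size `dim`. Assumes headerless data of type `dtype`.
    """
    itemsize = np.dtype(dtype).itemsize
    n_values, leftover_bytes = divmod(os.path.getsize(path), itemsize)
    if leftover_bytes or n_values % dim:
        raise ValueError(
            "{}: {} values do not divide into frames of size {}".format(
                path, n_values, dim))
    return n_values // dim


# Demo on a temporary file holding 10 frames of 60 values each.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.mag")
np.zeros(10 * 60, dtype=np.float32).tofile(demo_path)

print(check_feature_file(demo_path, 60))  # frame count for the right dimension

try:
    check_feature_file(demo_path, 45)  # wrong dimension for this file
except ValueError as err:
    print("mismatch:", err)
```

Running the check on each generated .mag, .real, and .imag file with the dimension the synthesis script expects (for example 60, 45, and 45 per the numbers in this thread) should show which stream has an unexpected size.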

KnowBetterHelps commented 6 years ago

Hi, when I run the code with my own data (child voice data), I always get the bug "magphase.py:352: RuntimeWarning: invalid value encountered in divide". Is there anything I did wrong?
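For context, NumPy emits this kind of RuntimeWarning when a division produces an undefined result such as 0/0, and the affected elements become NaN. The sketch below only reproduces the general mechanism with made-up arrays. It is not a diagnosis of line 352 of magphase.py, whose content is not shown in this thread.

```python
import warnings

import numpy as np

# Made-up values: the second element gives 0/0.
num = np.array([1.0, 0.0])
den = np.array([2.0, 0.0])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ratio = num / den  # second element is undefined, so it becomes nan

for w in caught:
    print(w.category.__name__, "-", w.message)
print(ratio)

# One possible guard: divide only where the denominator is non-zero
# and leave 0.0 elsewhere.
safe = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
print(safe)
```

Whether replacing undefined values with zero is appropriate depends on what the quantity represents, so the guard is an illustration rather than a recommended fix.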

felipeespic commented 6 years ago

Hi @hyuezhi,

Just to let you know, if you want to post a new issue or question on GitHub, you need to do it by creating a "New Issue" (button in the top right of the page), not by commenting on an issue that is not related to yours :) I created the new issue for you.