leimao / Voice-Converter-CycleGAN

Voice Converter Using CycleGAN and Non-Parallel Data
https://leimao.github.io/project/Voice-Converter-CycleGAN/
MIT License

Mismatched number of frames between F0 (627), spectrogram (628) and aperiodicty (627) #17

Open Likkkez opened 5 years ago

Likkkez commented 5 years ago

Hi! I tried training the model on my own samples, but it raised an assertion error in preprocess.py, so I changed the n_frames value from 128 to 32, which seemed to work. I also changed sampling_rate from 16000 to 44100, since my audio is recorded at 44100 Hz. However, I now get this error when trying to use convert.py:

    /home/dmitriy/anaconda3/Voice_Converter_CycleGAN-master/preprocess.py:170: RuntimeWarning: divide by zero encountered in log
      f0_converted = np.exp((np.log(f0) - mean_log_src) / std_log_src * std_log_target + mean_log_target)
    Traceback (most recent call last):
      File "convert.py", line 86, in <module>
        conversion(model_dir = model_dir, model_name = model_name, data_dir = data_dir, conversion_direction = conversion_direction, output_dir = output_dir)
      File "convert.py", line 58, in conversion
        wav_transformed = world_speech_synthesis(f0 = f0_converted, decoded_sp = decoded_sp_converted, ap = ap, fs = sampling_rate, frame_period = frame_period)
      File "/home/dmitriy/anaconda3/Voice_Converter_CycleGAN-master/preprocess.py", line 88, in world_speech_synthesis
        wav = pyworld.synthesize(f0, decoded_sp, ap, fs, frame_period)
      File "pyworld/pyworld.pyx", line 425, in pyworld.pyworld.synthesize
    ValueError: Mismatched number of frames between F0 (627), spectrogram (628) and aperiodicty (627)

Have I messed something up by changing the values described above? Thank you for your time.

adityashas commented 5 years ago

Hi, I tried the pretrained model with your dataset, but it gave me this error. Please take a look. How can we overcome this problem?

    I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro K6000, pci bus id: 0000:0f:00.0, compute capability: 3.5)
    /home/ai-team/Desktop/Voice_Converter_CycleGAN-master/preprocess.py:170: RuntimeWarning: divide by zero encountered in log
      f0_converted = np.exp((np.log(f0) - mean_log_src) / std_log_src * std_log_target + mean_log_target)
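
A note on the RuntimeWarning above: WORLD-style F0 contours use 0 for unvoiced frames, so `np.log(f0)` hits `log(0)` on those frames. With positive standard deviations the result is still 0 after `np.exp`, so the warning on its own is most likely harmless and is separate from the frame-count `ValueError`. If you want to silence it, a minimal sketch is to convert only the voiced frames. The function name `pitch_conversion_masked` is hypothetical and not part of the repository; the formula is the one shown in the warning.

```python
import numpy as np

def pitch_conversion_masked(f0, mean_log_src, std_log_src,
                            mean_log_target, std_log_target):
    """Log-Gaussian normalized F0 conversion applied to voiced frames only.

    Hypothetical helper: unvoiced frames (F0 == 0) are left at 0 so that
    np.log is never called on zero.
    """
    f0 = np.asarray(f0, dtype=np.float64)
    f0_converted = np.zeros_like(f0)
    voiced = f0 > 0  # boolean mask of frames that carry a pitch value
    f0_converted[voiced] = np.exp(
        (np.log(f0[voiced]) - mean_log_src) / std_log_src * std_log_target
        + mean_log_target
    )
    return f0_converted

# Example with made-up statistics: shift the mean pitch from 100 Hz to 200 Hz.
example = pitch_conversion_masked(
    f0=[0.0, 100.0, 200.0],
    mean_log_src=np.log(100.0), std_log_src=1.0,
    mean_log_target=np.log(200.0), std_log_target=1.0,
)
```

The statistics in the example are invented for illustration; in practice they would come from the training data of the source and target speakers.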

Pydataman commented 5 years ago

@Likkkez have you solved this problem?

Likkkez commented 5 years ago

> @Likkkez have you solved this problem?

It's not a great solution, but I just removed one element from the array and that seems to have fixed it:

    decoded_sp_converted1 = world_decode_spectral_envelop(coded_sp = coded_sp_converted, fs = sampling_rate)
    if sampling_rate == 44100:
        # Drop the extra trailing frame(s) so the spectrogram matches the F0 length.
        # Slicing by length also stays correct when the two lengths already match.
        decoded_sp_converted = decoded_sp_converted1[:len(f0_converted)]
    else:
        decoded_sp_converted = decoded_sp_converted1

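
A slightly more general version of the same workaround, as a rough sketch: trim all three arrays to their shortest frame count right before synthesis, regardless of sampling rate. The helper name `align_world_frames` is hypothetical and not part of the repository or of pyworld, and the array shapes in the example are dummies chosen to mirror the 627/628 mismatch from the error message. This only hides the off-by-one; the thread does not establish where the extra frame comes from.

```python
import numpy as np

def align_world_frames(f0, sp, ap):
    """Trim F0, spectral envelope, and aperiodicity to a common frame count.

    Hypothetical helper: frames are assumed to lie along axis 0 of each array.
    """
    n_frames = min(len(f0), len(sp), len(ap))
    # ascontiguousarray keeps the C-contiguous layout that compiled
    # extensions usually expect; it is a no-op for these leading slices.
    return (
        np.ascontiguousarray(f0[:n_frames]),
        np.ascontiguousarray(sp[:n_frames]),
        np.ascontiguousarray(ap[:n_frames]),
    )

# Dummy arrays reproducing the frame counts from the error message.
f0_demo = np.zeros(627)
sp_demo = np.zeros((628, 1025))
ap_demo = np.zeros((627, 1025))

f0_demo, sp_demo, ap_demo = align_world_frames(f0_demo, sp_demo, ap_demo)
print(f0_demo.shape, sp_demo.shape, ap_demo.shape)
```

In convert.py this would be called on `f0_converted`, `decoded_sp_converted`, and `ap` just before `world_speech_synthesis`.
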
Pydataman commented 5 years ago

@Likkkez thanks