Closed: switchzts closed this issue 6 years ago
There are two ways. First, train for more iterations without a phase loss. I continued training after uploading the generated samples, and the quality got a bit better. I'll upload the new samples once training finishes.
Second, add a phase loss. Please check this paper: they use a DNN to reconstruct phase from the magnitude spectrogram, so their loss function is a good starting point.
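To make the idea concrete, here is a minimal sketch of a combined magnitude + phase loss on STFT frames. This is not the loss from either paper, just an illustration in numpy; the window/hop sizes, the `phase_weight` value, and the `1 - cos` phase term (chosen so the loss ignores 2π wrapping) are all my assumptions.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Naive STFT: Hann-windowed frames -> complex half-spectrum."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=-1)

def spectral_phase_loss(y_hat, y, phase_weight=0.1):
    """L1 magnitude loss plus a simple wrap-insensitive phase term.

    phase_weight is a made-up hyperparameter balancing the two terms.
    """
    S_hat, S = stft(y_hat), stft(y)
    mag_loss = np.mean(np.abs(np.abs(S_hat) - np.abs(S)))
    # 1 - cos(delta) is 0 for identical phase and insensitive to 2*pi jumps.
    phase_loss = np.mean(1.0 - np.cos(np.angle(S_hat) - np.angle(S)))
    return mag_loss + phase_weight * phase_loss
```

A pure magnitude loss cannot see a small time shift that leaves magnitudes nearly unchanged; the phase term does, which is the intuition for why adding it might suppress the buzzy artifacts.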
@dhgrs OK~ I just found that there is no difference between the 350k-iteration and 210k-iteration generated wavs; both still have some electrical sounds in the audio. Should I change the learning rate or something? BTW, my batch size is 2.
I tried changing the spectrogram scale from magnitude to log after 500k iterations. I uploaded new generated audio samples yesterday, so please have a listen.
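For reference, the magnitude-vs-log change amounts to swapping the loss below in for a plain L1 on linear magnitudes. This is only a sketch of the idea, not the repo's actual code; the `eps` value is an assumption to avoid `log(0)`.

```python
import numpy as np

def log_mag_loss(mag_hat, mag, eps=1e-5):
    """L1 loss on log-magnitude spectrograms.

    Compared with a linear-magnitude loss, the log compresses loud bins
    and gives quiet bins (where buzz and noise hide) relatively more
    weight, so low-level artifacts are penalized more strongly.
    """
    return np.mean(np.abs(np.log(mag_hat + eps) - np.log(mag + eps)))
```

For example, a 0.01 absolute error on a quiet bin (0.02 vs 0.01) costs far more under this loss than the same 0.01 error on a loud bin (1.01 vs 1.00), which is one plausible reason the log scale helps with the electrical sounds.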
But it's still a bit noisy. SING, FAIR's NIPS 2018 paper, also trains a waveform generator with a spectrogram loss, so it may help us. https://research.fb.com/publications/sing-symbol-to-instrument-neural-generator/
@switchzts Also check this paper, uploaded to arXiv yesterday. Same concept, using a spectral loss and a phase loss; the difference is the network architecture (they use an LSTM). https://arxiv.org/abs/1810.11945
@dhgrs This paper looks like a new vocoder method, and its demo sounds similar to WaveNet! It provides code, so I will try it first. Thanks!
Some electrical sounds exist in the generated audio, which greatly hurts listening quality. In the original paper, a phase loss is added to the loss function. Do you have any thoughts on it?