magenta / ddsp

DDSP: Differentiable Digital Signal Processing
https://magenta.tensorflow.org/ddsp
Apache License 2.0

Unsupervised Learning F0 #318

Closed ericwudayi closed 3 years ago

ericwudayi commented 3 years ago

First, thanks for this awesome work! I am trying to reproduce the results of the paper. For the "supervised" F0 experiment, it works very well. However, for the unsupervised one, the model does not seem to learn well: the predicted F0 is far from the true pitch, and the resynthesized audio is poor.

Because the unsupervised model does not seem to be included in this repo, I followed the "ResnetSinusoidalEncoder" encoder, setting n_mels=80 for the audio-to-mel-spectrogram step. Am I doing something nonsensical? Or should I adopt the ICML 2020 method to get the correct pitch?

Thank you very much.

ketan0 commented 3 years ago

@ericwudayi I could be wrong, but I don't think the ResnetSinusoidalEncoder is the same Resnet that they used in the paper. It seems to be intended to map from audio to synthesizer params, not from audio to F0.

@magenta I would also appreciate it if the architecture + gin configuration for the (fully) unsupervised autoencoder could be added to the repo 🙂

hubertsiuzdak commented 3 years ago

The F0 encoder was removed in 6deb3e35. You can browse for the files in the repo's history or just check out the v0.0.6 tag (ICLR 2020 release). The config you're looking for is named ae_abs.gin.

Just note that the authors consider it deprecated.

ketan0 commented 3 years ago

Ah, I see - thanks for that! Do you know why the authors consider it deprecated? Is there a better version?

jesseengel commented 3 years ago

Sorry for the long delay. The original ae_abs model was never really good at predicting the pitch, but it would still recreate the audio (by predicting a low pitch and a complex harmonic distribution, in essence reinventing an STFT). The follow-up work can be found here: https://openreview.net/forum?id=RlVTYWhsky7, with a demo here: https://github.com/magenta/ddsp/blob/main/ddsp/colab/demos/pitch_detection.ipynb
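
To make the "low pitch plus complex harmonic distribution" failure mode concrete, here is a minimal pure-Python sketch. It is not the ddsp synthesizer: `harmonic_synth`, its parameters, and the 440 Hz / 110 Hz values are hypothetical and chosen only for illustration. It shows that a plain additive harmonic model can reproduce the same audio from an F0 two octaves too low, simply by zeroing all but every fourth harmonic, so a reconstruction loss alone does not pin down the true pitch.

```python
import math


def harmonic_synth(f0_hz, harmonic_amps, n_samples=256, sample_rate=16000):
    """Toy additive synth: sum of sinusoids at integer multiples of f0_hz.

    Hypothetical helper for illustration only, not the ddsp API.
    harmonic_amps[k] is the amplitude of harmonic number k + 1.
    """
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        out.append(sum(a * math.sin(2.0 * math.pi * (k + 1) * f0_hz * t)
                       for k, a in enumerate(harmonic_amps)))
    return out


# "Correct" description: a 440 Hz note with three harmonics.
true_f0 = 440.0
true_amps = [1.0, 0.5, 0.25]

# Degenerate description: F0 is four times too low (110 Hz), and only
# harmonics 4, 8 and 12 are non-zero, landing on 440, 880 and 1320 Hz.
low_f0 = 110.0
low_amps = [0.0] * 12
for k, a in enumerate(true_amps):
    low_amps[4 * (k + 1) - 1] = a

audio_true = harmonic_synth(true_f0, true_amps)
audio_low = harmonic_synth(low_f0, low_amps)

# Both parameter sets describe the same waveform, so audio reconstruction
# cannot distinguish the right pitch from the too-low one.
max_diff = max(abs(x - y) for x, y in zip(audio_true, audio_low))
print(f"max sample difference: {max_diff:.2e}")
```

With a low enough F0, the harmonic grid becomes dense enough to approximate almost any spectrum, which is the sense in which the model ends up "reinventing an STFT" instead of learning pitch.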