ben-hayes / neural-waveshaping-synthesis

efficient neural audio synthesis in the waveform domain
Mozilla Public License 2.0

Gradio Demo #5

Open AK391 opened 3 years ago

AK391 commented 3 years ago

web demo on Gradio Hub

ben-hayes commented 3 years ago

Hi @AK391, this is really cool! Thanks for putting this together so quickly. Two thoughts:

  1. As much of the computation time in the Gradio demo is being spent on pitch extraction, I think it might be giving the wrong impression about the efficiency of the synthesis model, especially as it's not clear from the interface what computation is taking place. I would suggest maybe trying the extract_f0_with_pyin function in lieu of CREPE: https://github.com/ben-hayes/neural-waveshaping-synthesis/blob/cb7aaea0412c9dc0f46004e2f1656217f065d333/neural_waveshaping_synthesis/data/utils/f0_extraction.py#L61-L70
  2. I'd like to limit the content of the repo to only what is necessary to reproduce results from the paper/online supplement. With that in mind, whilst I love the demo and I'm very grateful to you for creating it, I'm not sure it makes sense to merge this to the main repo. If you're happy to host this on your account, I'd love to add a link to the demo/code to the readme, along with a credit for you. Let me know what you think of this.
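
One way to make it visible where the time goes would be to report pitch extraction and synthesis timings separately in the demo. A minimal sketch, assuming two stand-in stages: `extract_pitch` and `synthesise` are hypothetical placeholders, not functions from this repo, and the sleep only simulates a slow pitch tracker.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical stand-ins for the real stages; swap in the actual
# pitch extractor (CREPE or pYIN) and the synthesis model.
def extract_pitch(audio: List[float]) -> List[float]:
    time.sleep(0.02)  # simulate the slow stage
    return [440.0] * len(audio)


def synthesise(f0: List[float]) -> List[float]:
    return [0.0] * len(f0)


def run_with_timings(audio: List[float]) -> Tuple[List[float], Dict[str, float]]:
    """Run both stages and report how long each one took."""
    f0, t_pitch = timed(extract_pitch, audio)
    out, t_synth = timed(synthesise, f0)
    return out, {"pitch_extraction_s": t_pitch, "synthesis_s": t_synth}


output, timings = run_with_timings([0.0] * 16)
```

Showing the two numbers side by side in the interface would let users see how much of the latency belongs to the synthesis model itself.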

Best

Ben

AK391 commented 3 years ago

@ben-hayes thanks. When trying extract_f0_with_pyin I get this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.7/dist-packages/flask_cors/extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.7/dist-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/local/lib/python3.7/dist-packages/gradio/networking.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/gradio/networking.py", line 179, in predict
    prediction, durations = app.interface.process(raw_input)
  File "/usr/local/lib/python3.7/dist-packages/gradio/interface.py", line 320, in process
    predictions, durations = self.run_prediction(processed_input, return_duration=True)
  File "/usr/local/lib/python3.7/dist-packages/gradio/interface.py", line 298, in run_prediction
    raise exception
  File "/usr/local/lib/python3.7/dist-packages/gradio/interface.py", line 293, in run_prediction
    prediction = predict_fn(processed_input)
  File "", line 87, in inference
    loudness_filtered = loudness * (confidence > loudness_conf_filter)
ValueError: operands could not be broadcast together with shapes (2501,) (320000,)

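    import torch
    from neural_waveshaping_synthesis.data.utils.f0_extraction import extract_f0_with_pyin

    # `audio` and `rate` are defined earlier in the notebook.
    with torch.no_grad():
        f0, confidence = extract_f0_with_pyin(
            audio,
            sample_rate=float(rate),
            maximum_frequency=1000,
        )
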
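
The thread does not establish why the two arrays differ in length. As an illustration only, the mismatch can be reproduced with NumPy, and one possible workaround is to resample the confidence track to the loudness frame count by linear interpolation. This assumes the two arrays cover the same time span and differ only in resolution. `align_to_frames` is a hypothetical helper, not part of this repo, and the array values and threshold are placeholders.

```python
import numpy as np


def align_to_frames(values: np.ndarray, n_frames: int) -> np.ndarray:
    """Linearly resample a 1-D array to n_frames points (hypothetical helper)."""
    if values.shape[0] == n_frames:
        return values
    src = np.linspace(0.0, 1.0, num=values.shape[0])
    dst = np.linspace(0.0, 1.0, num=n_frames)
    return np.interp(dst, src, values)


# Placeholder data with the shapes reported in the traceback.
loudness = np.ones(2501)
confidence = np.full(320000, 0.9)
loudness_conf_filter = 0.5  # placeholder threshold

# Reproduce the failure: shapes (2501,) and (320000,) cannot broadcast.
try:
    loudness * (confidence > loudness_conf_filter)
    mismatch = ""
except ValueError as err:
    mismatch = str(err)

# Bring confidence to the same length as loudness before masking.
confidence_frames = align_to_frames(confidence, loudness.shape[0])
loudness_filtered = loudness * (confidence_frames > loudness_conf_filter)
```

Printing `loudness.shape` and `confidence.shape` just before the failing line would confirm which array has the unexpected length before applying any fix like this.
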
AK391 commented 3 years ago

@ben-hayes I'm also adding the extract_f0_with_pyin method to the Colab as an option alongside CREPE.