comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/
GNU General Public License v3.0

Saturated audio with Stable Audio #4933

Open Big-Onche opened 2 months ago

Big-Onche commented 2 months ago

Expected Behavior

A clear sound!

Actual Behavior

Music or loud sound effects made with Stable Audio are heavily saturated.

Steps to Reproduce

I tried every sampler, scheduler, and CFG value I could, and the issue is still there with all of them. However, I found a way to fix it.

Debug Logs

/

Other

In nodes_audio.py, in the class VAEDecodeAudio, we should normalize the audio when decoding by adding these lines or something similar:

    max_amplitude = torch.max(torch.abs(audio))
    if max_amplitude > 1.0:
        audio = audio / max_amplitude

This fixes the audio clipping.
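For anyone who wants to try the idea outside ComfyUI, here is a minimal standalone sketch of the peak normalization described above. The helper name `normalize_peak` and the `(batch, channels, samples)` tensor layout are assumptions made for illustration; this is not the actual code in VAEDecodeAudio.

```python
import torch


def normalize_peak(audio: torch.Tensor) -> torch.Tensor:
    """Scale audio down only when its peak exceeds 1.0 (hypothetical helper)."""
    # Largest absolute sample value across the whole tensor.
    max_amplitude = torch.max(torch.abs(audio))
    if max_amplitude > 1.0:
        # Divide so the loudest sample lands exactly at +/-1.0.
        audio = audio / max_amplitude
    return audio


# Assumed layout: (batch, channels, samples).
loud = torch.tensor([[[0.5, -2.0, 1.5]]])    # peak of 2.0, would clip on save
quiet = torch.tensor([[[0.25, -0.5, 0.1]]])  # already within [-1, 1]

print(normalize_peak(loud))
print(normalize_peak(quiet))
```

Note that this only rescales audio whose peak is above 1.0, so quieter clips pass through unchanged. It also uses one global peak for the whole tensor, so in a batch a single loud item would lower the volume of every other item.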

[Image: audiodebug]

comfyanonymous commented 2 months ago

https://github.com/comfyanonymous/ComfyUI/commit/56e8f5e4fd0a048811095f44d2147bce48b02457

Does this fix it?

Big-Onche commented 2 months ago

> 56e8f5e
>
> Does this fix it?

It's better, but there is still some clipping with that fix.