christian-byrne / audio-separation-nodes-comfyui

Separate stems (vocals, bass, drums, other) from audio. Recombine, tempo match, slice/crop audio
152 stars 11 forks

Question: What is the function of the audio file? #6

Closed ShmuelRonen closed 1 month ago

ShmuelRonen commented 1 month ago

Thanks for this great project!

Regarding Stable Audio: could you briefly explain the role of the audio file that goes through the VAE audio encoder and connects to the latent input, besides determining the length of the result? Does it act like an Empty Latent, or does it set additional parameters for the audio output?

christian-byrne commented 1 month ago

If you provide an existing song, the VAE encoder converts it into a latent representation that captures the underlying musical structure. The model uses that latent as its starting point, so the generated audio should share characteristics with the original song. The denoise value controls how much the output diverges from the original: lower values stay closer to the input, higher values diverge more.

To demonstrate, try using an existing song and setting the denoise value in the KSampler to something low, like 0.2.
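
As a rough illustration of why a low denoise keeps the result close to the input, here is a simplified toy sketch in Python. It is not the actual KSampler or Stable Audio code: the function names `noise_start_latent` and `distance` are hypothetical, the latent is a plain list of floats, and the linear blend with noise is an assumption chosen only to show the idea that denoise sets how much of the encoded audio survives in the sampler's starting point.

```python
import math
import random


def noise_start_latent(latent, denoise, seed=0):
    """Toy model (assumption, not ComfyUI code): blend an encoded latent with noise.

    denoise=0.0 returns the encoded audio unchanged.
    denoise=1.0 returns pure noise, so only the latent's length still matters.
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0.0 and 1.0")
    rng = random.Random(seed)
    return [(1.0 - denoise) * x + denoise * rng.gauss(0.0, 1.0) for x in latent]


def distance(a, b):
    """Euclidean distance between two latents of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Stand-in for the output of the VAE audio encoder (hypothetical values).
encoded_song = [0.8, -1.2, 0.3, 0.05, -0.6]

for d in (0.2, 0.8, 1.0):
    start = noise_start_latent(encoded_song, d, seed=42)
    print(f"denoise={d}: distance from original = {distance(start, encoded_song):.3f}")
```

In this toy, the distance from the original grows with the denoise value, which mirrors the behaviour described above: at 0.2 the starting point is still mostly the input song, while at 1.0 the input contributes little beyond its length.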

christian-byrne commented 1 month ago

Closing for now. Feel free to re-open the issue if you'd like to discuss further 👍