riffusion / riffusion-hobby

Stable diffusion for real-time music generation
MIT License
3.36k stars, 384 forks

Generate music longer than 5 seconds? #153

Open jiapei100 opened 1 year ago

jiapei100 commented 1 year ago

Does it have an argument for the length of the generated music?

Thanks...

bernland commented 1 year ago

Same question here...

justanothernguyen commented 10 months ago

If you use the playground, there should be an option called "width". The bigger this number, the longer the generated music; the default is 512. At some point, though, you'll be limited by VRAM, and the quality diminishes.
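To get a rough feel for how width maps to clip length, here is a minimal sketch. It assumes each spectrogram column covers about 10 ms of audio, which would match the roughly 5 seconds you get at the default width of 512; that figure is an assumption, not something confirmed in this thread, so check it against your own settings. The function names are hypothetical helpers, not part of riffusion.

```python
import math

# Assumption: one spectrogram column is about 10 ms of audio.
# Adjust this if your spectrogram settings differ.
MS_PER_COLUMN = 10.0


def clip_seconds(width_px: int, ms_per_column: float = MS_PER_COLUMN) -> float:
    """Approximate clip length in seconds for a spectrogram of the given width."""
    return width_px * ms_per_column / 1000.0


def width_for_seconds(seconds: float,
                      ms_per_column: float = MS_PER_COLUMN,
                      multiple: int = 8) -> int:
    """Smallest width, rounded up to a multiple of 8, that covers the duration.

    Stable Diffusion image sizes are normally multiples of 8, hence the rounding.
    """
    columns = math.ceil(seconds * 1000.0 / ms_per_column)
    return math.ceil(columns / multiple) * multiple


print(clip_seconds(512))      # 5.12
print(width_for_seconds(20))  # 2000
```

Under that assumption, a 20-second clip needs a width of about 2000, which is where the VRAM limit starts to bite.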

Another way is to load the model in a Stable Diffusion UI (A1111 or ComfyUI, for example), generate the spectrogram, then outpaint to the right for a longer file.

Here's one I did in ComfyUI, outpainting 4 times:

[image: ComfyUI_temp_azmem_00028_]

Finally, use riffusion (either the CLI or the playground) to turn the spectrogram into audio.

If the spectrogram gets too long, I think we can just split it in half, outpaint the right half, then stitch the results back together.
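The split / outpaint / stitch idea can be sketched roughly like this. The image is modelled as plain rows of pixel values so the example runs on its own; in practice you would do the same slicing on a real image array. The `outpaint` argument is a hypothetical placeholder for whatever your UI does, and `fake_outpaint` below only repeats the last column so there is something to run.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # rows of pixel values; stand-in for a real image array


def split_columns(img: Image, at: int) -> Tuple[Image, Image]:
    """Split an image into left and right parts at column index `at`."""
    return [row[:at] for row in img], [row[at:] for row in img]


def stitch(left: Image, right: Image) -> Image:
    """Join two images of equal height side by side."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def extend_right(img: Image, outpaint: Callable[[Image], Image]) -> Image:
    """Outpaint only the right half, then reattach the untouched left half."""
    width = len(img[0])
    left, right = split_columns(img, width // 2)
    return stitch(left, outpaint(right))


def fake_outpaint(img: Image, extra: int = 2) -> Image:
    """Placeholder only: pads each row by repeating its last pixel `extra` times."""
    return [row + [row[-1]] * extra for row in img]


spectrogram = [[1, 2, 3, 4],
               [5, 6, 7, 8]]
longer = extend_right(spectrogram, fake_outpaint)
print(longer)  # [[1, 2, 3, 4, 4, 4], [5, 6, 7, 8, 8, 8]]
```

Only the right half goes through the outpainting step, so the image handed to the model stays small even as the full spectrogram grows.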

gtbloody commented 2 months ago

How do I do this in ComfyUI?