facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

Audio file conditioning to continue (sliding window) #8

Open DanielShemesh opened 1 year ago

DanielShemesh commented 1 year ago

Consider implementing or providing an option to condition a model on a specific audio file, enabling the generation of audio that continues the input audio.

I read it's possible using a sliding window, but I would like to see example usage code for this in the Jupyter notebook.

Thank you guys so much!

ghost commented 1 year ago

This! As a sort of 'outpainting' feature

FurkanGozukara commented 1 year ago

Yes we need this

ghost commented 1 year ago

If you look into the demo notebook you can find the function model.generate_continuation() for music continuation and model.generate_with_chroma() for audio+text generation

FurkanGozukara commented 1 year ago

> If you look into the demo notebook you can find the function model.generate_continuation() for music continuation and model.generate_with_chroma() for audio+text generation

Why is this not available in the Gradio app? Any ideas?

So, in theory, we can feed in a previously generated song this way and keep generating more?

Gushousekai195 commented 1 year ago

This is expected to be implemented!

No ifs, ands, or buts!

30 seconds is NOT enough!

ghost commented 1 year ago

> > If you look into the demo notebook you can find the function model.generate_continuation() for music continuation and model.generate_with_chroma() for audio+text generation

> why this is not available in gradio app any ideas?

> so with this way we can feed previously generated song to continue generating more theoretically ?

Yes, you can continue a song, but expect the quality to deteriorate as the total length grows.

My guess is that the Gradio interface is a minimal demonstration. You can still load the notebook into something like Colab and run those functions.

FurkanGozukara commented 1 year ago

> > > If you look into the demo notebook you can find the function model.generate_continuation() for music continuation and model.generate_with_chroma() for audio+text generation

> > why this is not available in gradio app any ideas? so with this way we can feed previously generated song to continue generating more theoretically ?

> Yes you can continue a song. Expect deterioration over a long period of time.

> My guess is that the gradio interface is a minimal demonstration. You can still load the notebook into something like Colab and run those functions

I hope someone adds that feature to the Gradio app so people can use it directly.

rebotnix commented 1 year ago

generate_continuation() works, but the output is again limited to 30 seconds. It seems that you have to stitch the output waveforms together yourself?

Here is what I tried:

1) Generated a 15-second mix of a simple EDM song with the prompt "simple edm song", just for testing.

2) Set the description for the continuation, in the same way as for the first one:
   desc2 = ["edm light with piano mix bpm 126"]

3) Loaded the first generated WAV as the input prompt:
   prompt_waveform, prompt_sr = torchaudio.load("production/test/first_edm1.wav")
   prompt_duration = 15
   prompt_waveform = prompt_waveform[..., :int(prompt_duration * prompt_sr)]

4) Generated the continuation from the loaded waveform and the new description:
   output = model.generate_continuation(prompt_waveform, 32000, desc2, progress=True)

5) Saved the output to a WAV file.

So far so good: I can play the WAV file, but the new waveform is still at most 30 seconds long, with a sliding window over the first loaded WAV file. As I understand it, it should be at least 45 seconds long: 15 seconds for the first song plus 30 seconds for the newly generated part. But the result is always at most 30 seconds. Setting the model's duration parameter to 45 does not work, of course.
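One possible reading of this result, assuming (as the observation above suggests) that the clip returned by generate_continuation() includes the prompt and that the total is capped at 30 seconds: a 15-second prompt leaves room for only 15 seconds of new audio per call. The sketch below just does that arithmetic. The function names are hypothetical, and the 30-second cap is taken from this thread, not from the library's documentation.

```python
import math

MAX_WINDOW_S = 30.0  # assumption: per-call cap reported in this thread


def new_audio_per_call(prompt_s: float, window_s: float = MAX_WINDOW_S) -> float:
    """Seconds of genuinely new audio, if the returned clip includes the prompt."""
    if not 0 <= prompt_s < window_s:
        raise ValueError("prompt must be non-negative and shorter than the window")
    return window_s - prompt_s


def calls_needed(start_s: float, target_s: float, prompt_s: float,
                 window_s: float = MAX_WINDOW_S) -> int:
    """Continuation calls needed to grow a clip from start_s to target_s seconds."""
    missing = max(0.0, target_s - start_s)
    return math.ceil(missing / new_audio_per_call(prompt_s, window_s))
```

Under these assumptions, new_audio_per_call(15) is 15 seconds, so one call takes a 15-second clip to 30 seconds, and calls_needed(15, 45, 15) says two calls are needed to reach 45 seconds.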

Since MusicGen is not yet bar- or loop-safe, you can't simply append the new waveform to the first one, so I think the original author implemented the sliding window differently on his demo page to create the 2-minute files.

Am I overlooking something?

@marianbastiUNRN @FurkanGozukara @DanielShemesh

rkfg commented 1 year ago

Working code here: https://github.com/facebookresearch/audiocraft/issues/36#issuecomment-1586236702
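For readers who don't follow the link, the general sliding-window idea can be sketched as below. This is not the linked code, only a minimal illustration under stated assumptions: audio is a plain list of mono samples, and `continue_fn` is a hypothetical stand-in for the model call that returns the prompt followed by newly generated samples.

```python
from typing import Callable, List

Samples = List[float]  # mono audio as a plain list, for illustration only


def extend_with_sliding_window(
    audio: Samples,
    continue_fn: Callable[[Samples], Samples],
    context_len: int,
    target_len: int,
) -> Samples:
    """Grow `audio` until it has `target_len` samples, then trim to that length.

    `continue_fn(prompt)` is hypothetical; it is assumed to return the prompt
    followed by new samples, as the thread reports for generate_continuation().
    """
    if context_len <= 0:
        raise ValueError("context_len must be positive")
    result = list(audio)
    while len(result) < target_len:
        prompt = result[-context_len:]    # tail of what we have so far
        window = continue_fn(prompt)      # prompt + new material
        new_part = window[len(prompt):]   # keep only the new material
        if not new_part:
            raise RuntimeError("continue_fn produced no new samples")
        result.extend(new_part)
    return result[:target_len]
```

With the real model, `continue_fn` would presumably wrap model.generate_continuation() and convert to and from tensors, with `context_len` set to the prompt length in seconds times the sample rate. A shorter context leaves more room for new audio per call but gives the model less to go on.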

rebotnix commented 1 year ago

Wow, cool. Thanks for sharing.

robertJene commented 1 year ago

If you could set a static BPM, time signature, and key, and also set the number of measures to generate (instead of seconds), you could then import the result into a music editor and mix it. It would be even better if the generated output came as "stems".
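Until something like that exists, the conversion between measures and seconds is easy to do by hand. A small sketch, assuming a constant tempo and a fixed number of beats per measure (both function names are made up for illustration, and nothing here guarantees the model actually holds the tempo):

```python
def measures_to_seconds(measures: int, bpm: float, beats_per_measure: int = 4) -> float:
    """Duration of `measures` bars at a constant tempo."""
    if measures < 0 or bpm <= 0 or beats_per_measure <= 0:
        raise ValueError("measures must be >= 0; bpm and beats_per_measure must be > 0")
    return measures * beats_per_measure * 60.0 / bpm


def whole_measures_in(seconds: float, bpm: float, beats_per_measure: int = 4) -> int:
    """Largest number of complete bars that fit into `seconds`."""
    if seconds < 0 or bpm <= 0 or beats_per_measure <= 0:
        raise ValueError("seconds must be >= 0; bpm and beats_per_measure must be > 0")
    return int(seconds * bpm // (60.0 * beats_per_measure))
```

For example, at 126 BPM in 4/4, whole_measures_in(30, 126) gives 15 complete bars, and measures_to_seconds(15, 126) is about 28.57 seconds, so a 30-second clip would need trimming to land on a bar line.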

Gushousekai195 commented 1 year ago

> If you could set a static BPM, timing signature, and key, then also set the number of measures to generate (instead of seconds), you could then import into a music editor and mix. Would be even better if the generated output was in "stems"

This ⤴️

tonymacx86PRO commented 1 year ago

Maybe I will make my own WebUI and CLI with all the features and some additional implementations, such as long-duration music (maybe). W.I.P.

wandrzej commented 1 year ago

I know the other issue was closed, but I wanted to ask about an option to alter the prompt during the sliding window, so that the music transitions from one prompt to another.
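To make the idea concrete: one way to express it would be a schedule that maps start times to descriptions, with each window using whichever description is active when it starts. This is only a sketch of the bookkeeping. The names are hypothetical, and whether the model transitions smoothly between prompts is a separate question.

```python
from typing import List, Tuple

# (start time in seconds, description); the entries are supplied by the user
Schedule = List[Tuple[float, str]]


def description_at(schedule: Schedule, t: float) -> str:
    """Return the description with the latest start time <= t.

    If t is earlier than every start time, the earliest description is used.
    """
    if not schedule:
        raise ValueError("schedule is empty")
    ordered = sorted(schedule)
    current = ordered[0][1]
    for start, text in ordered:
        if start <= t:
            current = text
        else:
            break
    return current


def descriptions_for_windows(schedule: Schedule, window_starts: List[float]) -> List[str]:
    """Pick one description per window, based on each window's start time."""
    return [description_at(schedule, t) for t in window_starts]
```

Each window's description would then be passed to the continuation call for that window, in place of a single fixed prompt.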

GrandaddyShmax commented 1 year ago

> I know the other issue was closed, but wanted to ask about an option to alter the prompt on the sliding window? So the music would transition from one prompt to another


Here you go, have fun: https://github.com/GrandaddyShmax/audiocraft_plus/tree/plus

In the Gradio interface you'll find an explanation of how to use each feature.

There is also a Hugging Face version, but you'll need to clone the Space and use a GPU: https://huggingface.co/spaces/GrandaddyShmax/MusicGen_Plus

There is a Colab notebook as well: https://colab.research.google.com/github/camenduru/MusicGen-colab/blob/main/MusicGen_ClownOfMadness_plus_colab.ipynb