GrandaddyShmax / audiocraft_plus

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

Longer durations cause error #16

Closed Drommer-Kille closed 1 year ago

Drommer-Kille commented 1 year ago

Up to 60 sec this works fine, but 120 sec causes an error. The original Audiocraft generates 120 sec files just fine, so it's not hardware related. The GPU is an RTX 3090 with 24 GB VRAM.

This error appears after it reaches 1500 steps: TypeError: MusicGen.generate_continuation() got an unexpected keyword argument 'melody_wavs'

GrandaddyShmax commented 1 year ago

That is strange. Using the Colab with a T4 GPU with 16 GB, I managed to generate up to 200 sec. Can you check whether you are hitting the cap when generating 120 sec? How much VRAM is used?
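One way to answer the VRAM question from inside Python is sketched below. It assumes PyTorch is the framework in use and relies on `torch.cuda.mem_get_info`, which reports device-wide free and total memory (so other processes on the same GPU are counted too). The helper names `format_vram` and `current_vram` are made up for this illustration and are not part of audiocraft or audiocraft_plus.

```python
def format_vram(used_bytes, total_bytes):
    """Format a used/total pair in GiB, e.g. '15.0/24.0 GiB'."""
    gib = 1024 ** 3
    return f"{used_bytes / gib:.1f}/{total_bytes / gib:.1f} GiB"


def current_vram(device=0):
    """Return (used_bytes, total_bytes) for a CUDA device, or None if unavailable."""
    try:
        import torch  # assumption: PyTorch is installed in this environment
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info returns (free, total) in bytes for the whole device.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    return total_bytes - free_bytes, total_bytes


usage = current_vram()
print(format_vram(*usage) if usage else "No CUDA device visible from PyTorch")
```

Calling `current_vram()` right before and right after a failing generation shows whether usage is actually approaching the card's limit.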

Drommer-Kille commented 1 year ago

VRAM is 14/24 GB during generation and jumps to 15.7/24 GB when the error occurs, so it's not running out of memory.

launch.py", line 156, in predict
    next_segment = MODEL.generate_continuation(last_chunk,
TypeError: MusicGen.generate_continuation() got an unexpected keyword argument 'melody_wavs'

The bigger problem is that this is not the same model as the original Audiocraft, or at least it produces totally different results. It does not seem to follow the prompt at all.

That said, my Gradio GUI looks different from the plus screenshot. I have seed etc., but not the multiple prompt boxes.
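A TypeError of this kind generally means the calling script and the function it calls disagree about the accepted arguments. A minimal sketch of how one could check for that before making the call is shown here, using the standard library's `inspect.signature`. The two `continuation_*` functions are hypothetical stand-ins written for this example; they are not the real audiocraft signatures.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    for param in inspect.signature(func).parameters.values():
        # A **kwargs parameter accepts any keyword.
        if param.kind is inspect.Parameter.VAR_KEYWORD:
            return True
        if param.name == name and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        ):
            return True
    return False


# Hypothetical stand-ins for two versions of a method (illustration only).
def continuation_without_melody(prompt, descriptions=None):
    return prompt


def continuation_with_melody(prompt, descriptions=None, melody_wavs=None):
    return prompt


for func in (continuation_without_melody, continuation_with_melody):
    print(func.__name__, accepts_keyword(func, "melody_wavs"))
```

Running the same check against the method that a script really imports would show whether the script and the installed library belong together.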

GrandaddyShmax commented 1 year ago

Can you screenshot the GUI that you currently have?

Drommer-Kille commented 1 year ago

Here Screenshot (65)

GrandaddyShmax commented 1 year ago

It seems like you somehow cloned a very early version of this repo. Try doing a git pull.

Drommer-Kille commented 1 year ago

Git pull: Already up to date. I followed this:

# Best to make sure you have torch installed first, in particular before installing xformers.
# Don't run this if you already have PyTorch installed.
pip install 'torch>=2.0'
# Then proceed to one of the following
pip install -U audiocraft  # stable release
pip install -U git+https://git@github.com/GrandaddyShmax/audiocraft_plus#egg=audiocraft
pip install -e .  # or if you cloned the repo locally

Git cloned this: https://github.com/GrandaddyShmax/audiocraft_plus

GrandaddyShmax commented 1 year ago

Are you perhaps running launch.py?

Drommer-Kille commented 1 year ago

User error, sorry for wasting time :( Using app.py launches the correct version, and I have generated a 240 sec audio file without any problem. Since it can batch prompts, it would also be nice to be able to run a single prompt multiple times, with each run written to a separate file without overlap, or to run all prompts to separate files without overlap.

Huge thanks for creating this!
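The request above (one prompt run several times, or several prompts, each to its own file with no overlap between them) can be sketched as a simple job list. This is only an illustration of the idea, not how audiocraft_plus implements or plans to implement it; `plan_jobs`, the seed scheme, and the file naming are all assumptions made for the example.

```python
from pathlib import Path


def plan_jobs(prompts, repeats=1, out_dir="outputs"):
    """Expand prompts into independent jobs, one output file per job."""
    jobs = []
    for p_idx, prompt in enumerate(prompts):
        for r_idx in range(repeats):
            jobs.append({
                "prompt": prompt,
                # A distinct seed per job so repeated prompts give different takes.
                "seed": p_idx * repeats + r_idx,
                "path": Path(out_dir) / f"prompt{p_idx:02d}_take{r_idx:02d}.wav",
            })
    return jobs


for job in plan_jobs(["lofi beat", "orchestral theme"], repeats=2):
    print(job["seed"], job["path"].name, job["prompt"])
```

Each job would then be generated from scratch rather than as a continuation of the previous one, which is what keeps the files free of overlap.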

GrandaddyShmax commented 1 year ago

Glad you got it working 😁 I need to remove that launch.py; I forgot to remove it. Queued generation is planned, but I cannot say yet when it will be ready.

Drommer-Kille commented 1 year ago

Any idea why it did that? I'm asking because I'm now testing the new release of Audiocraft with MultiBand Diffusion, and I get the same error when trying to generate files longer than 60 sec. I can generate 5 min clips with audiocraft+, but the official repo gives a CUDA memory error for 120 sec files. Any plans to support MultiBand Diffusion in the + version?

Not that it's much better: the dataset is still the same 16 kHz mono, and MBD tries to restore some high frequencies while adding weird artifacts to the sound. Not a big improvement. But the training sounds interesting, if it could be done with 44.1 kHz files.

GrandaddyShmax commented 1 year ago

Working on implementing the new update. From my testing so far, MultiBand Diffusion requires much more VRAM.

Drommer-Kille commented 1 year ago

That seems to be the case. With MBD disabled it generates 120 sec clips. Any plans to integrate the training code into +?

GrandaddyShmax commented 1 year ago

I'll try, though I cannot promise success.