facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License
20.5k stars 2.06k forks

[REQ] intermediate results #252

Open 0xlws opened 1 year ago

0xlws commented 1 year ago

@JadeCopet @adefossez A while ago I played around with getting intermediate results by yielding from inside `lm.generate()` and then using `yield from` in the functions above it (this works).

It resulted in CUDA errors, I think because it was still generating tokens while I was also trying to turn the tokens into audio for a preview. Is there a way to prevent these errors, or can you point me in the right direction? It's possible to return the intermediate tokens as they are generated, just not to process them while the loop is still going, I guess.

In short, I can't process the tokens for a preview (compression model.generate_audio) while it's also in the while loop getting the next token. I'm looking for a workaround or solution. Any advice?
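For readers unfamiliar with the pattern described above, here is a minimal, framework-free sketch of yielding partial results from an inner loop and forwarding them with `yield from`. All names (`generate_tokens`, `generate`, `collect`) are hypothetical stand-ins, the integers are placeholders for tokens, and nothing here calls audiocraft or touches CUDA, so it illustrates only the control flow, not the cause of the error.

```python
from typing import Generator, List, Tuple

TokenGen = Generator[List[int], None, List[int]]


def generate_tokens(max_steps: int, frames_interval: int) -> TokenGen:
    """Hypothetical stand-in for the token loop inside lm.generate()."""
    tokens: List[int] = []
    for step in range(1, max_steps + 1):
        tokens.append(step)  # placeholder for sampling the next token
        if step % frames_interval == 0 and step < max_steps:
            # Yield a copy so the consumer never sees later mutations.
            yield list(tokens)
    return tokens  # delivered to the caller via StopIteration.value


def generate(max_steps: int, frames_interval: int) -> TokenGen:
    """Hypothetical outer function standing in for a model-level generate()."""
    # `yield from` forwards every intermediate yield and evaluates to the
    # inner generator's return value.
    final_tokens = yield from generate_tokens(max_steps, frames_interval)
    return final_tokens


def collect(gen: TokenGen) -> Tuple[List[List[int]], List[int]]:
    """Drain the generator, keeping the previews and the final result apart."""
    previews: List[List[int]] = []
    while True:
        try:
            previews.append(next(gen))
        except StopIteration as stop:
            return previews, stop.value


previews, final = collect(generate(max_steps=6, frames_interval=2))
```

Because a generator is suspended while its consumer handles each yielded value, the loop does not sample the next token until the consumer asks for it.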

adefossez commented 12 months ago

Strange, I don't see why this would result in a CUDA error. Can you provide more information on exactly what is going on and what error you are getting?

0xlws commented 11 months ago

Thanks for your support! I tried again yesterday and was able to solve the issue; I no longer encounter errors such as 'device side assertion (cuda)'.

    model.set_generation_params(
        use_sampling=True,
        top_k=250,
        duration=3
    )

    l = []
    final_values = None
    for result in model.generate(
        descriptions=[
            'drum and bass beat with intense percussions'
        ],
        progress=True, return_tokens=True, frames_interval=50
    ):
        if type(result) == tuple:
            final_values = result
        else:
            display_audio(result, 32000)

<img width="309" alt="Screenshot 2023-09-01 at 07 41 39" src="https://github.com/facebookresearch/audiocraft/assets/87901794/d7e354a6-0d09-4ab0-871f-8de52edf28e0">

- return values as usual (no intermediate results):
```python
from audiocraft.utils.notebook import display_audio

model.set_generation_params(
    use_sampling=True,
    top_k=250,
    duration=2
)

output = model.generate(
    descriptions=[
        'drum and bass beat with intense percussions'
    ],
    progress=True, return_tokens=True
)

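# generate() is a generator in this modified version: take the first yielded
# value or, if it finishes without yielding, the return value carried by
# StopIteration.
try:
    output = next(output)
except StopIteration as e:
    output = e.value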

display_audio(output[0], sample_rate=32000)
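# USE_DIFFUSION_DECODER and mbd are assumed to be defined earlier,
# as in the MusicGen demo notebook.
if USE_DIFFUSION_DECODER: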
    out_diffusion = mbd.tokens_to_wav(output[1])
    display_audio(out_diffusion, sample_rate=32000)
```

I was going to try wrapping it so it would behave as usual, abstracting away the generator logic.
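One way such a wrapper could look is sketched below. It is an illustration only, not the code from this thread: `as_plain_function`, `on_intermediate`, and `fake_generate` are hypothetical names, and `fake_generate` merely imitates a generator-based `generate()` using integers in place of tokens.

```python
import functools
from typing import Any, Callable, Optional


def as_plain_function(
    gen_func: Callable[..., Any],
    on_intermediate: Optional[Callable[[Any], None]] = None,
) -> Callable[..., Any]:
    """Wrap a generator function so that calling it returns its final value."""

    @functools.wraps(gen_func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        gen = gen_func(*args, **kwargs)
        while True:
            try:
                item = next(gen)
            except StopIteration as stop:
                # The generator's `return` value travels on StopIteration.
                return stop.value
            if on_intermediate is not None:
                on_intermediate(item)  # e.g. display a preview

    return wrapper


def fake_generate(n: int, frames_interval: Optional[int] = None):
    """Hypothetical stand-in for a generator-based generate()."""
    tokens = []
    for step in range(1, n + 1):
        tokens.append(step)  # placeholder for one generated token
        if frames_interval and step % frames_interval == 0:
            yield list(tokens)  # intermediate result
    return tokens


seen = []
plain_generate = as_plain_function(fake_generate, on_intermediate=seen.append)
result = plain_generate(4, frames_interval=2)
```

Callers that do not care about previews can leave out `on_intermediate` and use the wrapped function like an ordinary one, so the `next()` / `StopIteration` handling stays in a single place.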


edit: added musicgen.py code snippet

edit2: noticed I need to correct the musicgen.py return values under the yield condition: [decoded_tokens, tokens]