facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

Progressbar values are incorrect #242

Closed bartekleon closed 12 months ago

bartekleon commented 1 year ago

The current token count (number of generated tokens) and the total number of tokens do not match. (I got 1003/1000 tokens for 20 s of audio.)

1000 makes sense, as it's duration (20) * frame_rate (50).

There seems to be a counting issue somewhere in the generate function of audiocraft.models.lm.

0xlws commented 1 year ago

hey @bartekleon, i encountered this as well.

https://github.com/facebookresearch/audiocraft/blob/e96018613ac82b1afe0f0cce7861dfe08ba2b3bf/audiocraft/models/lm.py#L484

gen_sequence_len in my case was always 4 tokens more than needed for generation. i think it's the padding tokens being counted. they get trimmed away in this line:

https://github.com/facebookresearch/audiocraft/blob/e96018613ac82b1afe0f0cce7861dfe08ba2b3bf/audiocraft/models/lm.py#L527
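To illustrate how padded positions could inflate the step count and then be trimmed away, here is a toy sketch. It is not the audiocraft implementation: `PAD`, `apply_delay` and `undo_delay` are hypothetical names, and shifting codebook k by k steps is an assumption made for the example.

```python
PAD = -1  # hypothetical placeholder for positions that hold no real token


def apply_delay(codes):
    """Shift codebook k right by k steps, padding the gaps (assumed layout)."""
    n_codebooks, n_frames = len(codes), len(codes[0])
    n_steps = n_frames + n_codebooks - 1  # extra steps caused by the shifts
    delayed = [[PAD] * n_steps for _ in range(n_codebooks)]
    for k in range(n_codebooks):
        for t in range(n_frames):
            delayed[k][t + k] = codes[k][t]
    return delayed


def undo_delay(delayed, n_frames):
    """Trim the padding so every codebook is back to n_frames tokens."""
    return [row[k:k + n_frames] for k, row in enumerate(delayed)]


codes = [[k * 10 + t for t in range(5)] for k in range(4)]
delayed = apply_delay(codes)
restored = undo_delay(delayed, 5)
```

In this toy case the generation loop would run over 8 steps even though only 5 frames per codebook survive the trimming, which is the kind of mismatch a progress counter would pick up.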

adefossez commented 1 year ago

This is because of the delayed pattern, see our paper for details. In my opinion this is not really worth fixing, and would require complex interactions between the model and the API code. In the demo we now just cap that number.
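A rough sketch of the arithmetic and of capping the displayed value, for anyone who wants the same behaviour in their own UI. The function names are hypothetical, the 4 codebooks are an assumption, and the extra `+ 1` start step is a guess chosen only because it matches the "4 tokens more" reported above; none of this is taken from the audiocraft source.

```python
def expected_lengths(duration_s, frame_rate, n_codebooks=4):
    """Return (audio frames, assumed number of generation steps)."""
    n_frames = int(duration_s * frame_rate)
    max_delay = n_codebooks - 1          # assumed: codebook k is delayed by k steps
    n_steps = n_frames + max_delay + 1   # assumed: one extra special start step
    return n_frames, n_steps


def capped_progress(generated, total):
    """Clamp the displayed counter so it never exceeds the total."""
    return min(generated, total), total


n_frames, n_steps = expected_lengths(20, 50)
shown = capped_progress(1003, n_frames)
```

A function like `capped_progress` could wrap whatever progress callback the UI already uses, so the bar stops at 1000/1000 instead of showing 1003/1000.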