Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License
[research question] does the model predict the last few codes in delayed pattern #95
Hi,
Thanks for the great work and open sourcing everything!
On the delayed pattern, it seems that there should be a few more tokens at the end for the model to predict (I drew some of them out on the left of the figure). Are those predicted? Or does ignoring them not affect the result? That would be reasonable, considering that the first codebook is the most important and only a very small number of codes from the higher codebooks are dropped.
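To make the question concrete, here is a minimal sketch that counts how many codes would fall past the last step. It assumes codebook k is shifted by exactly k steps and that the sequence is cut at the original length T. The function names `delay_pattern` and `codes_in_tail` are made up for illustration and are not Audiocraft's API; whether the library actually cuts the sequence there is exactly what I am asking.

```python
def delay_pattern(num_codebooks, num_steps):
    """Build a grid where grid[k][s] is the original timestep that codebook k
    emits at sequence step s, or None where the step is padding.

    Assumption: codebook k is delayed by exactly k steps.
    """
    total_steps = num_steps + num_codebooks - 1
    grid = []
    for k in range(num_codebooks):
        row = [None] * total_steps
        for t in range(num_steps):
            row[t + k] = t  # code (k, t) lands at sequence step t + k
        grid.append(row)
    return grid


def codes_in_tail(num_codebooks, num_steps):
    """Count, per codebook, the codes that sit at steps >= num_steps,
    i.e. the ones lost if the sequence is cut at the original length."""
    grid = delay_pattern(num_codebooks, num_steps)
    return [sum(1 for t in row[num_steps:] if t is not None) for row in grid]


if __name__ == "__main__":
    K, T = 4, 8
    tail = codes_in_tail(K, T)
    print(tail)                      # [0, 1, 2, 3]
    print(sum(tail), "of", K * T)    # 6 of 32
```

Under these assumptions codebook k loses its last k codes, so K(K-1)/2 codes in total sit in the tail, and none of them belong to the first codebook.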
Also, looking forward to the training code!
Thanks