Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Why is the cond_mask ignored in the condition fuser? The mask is only used to zero out the positions where the input is None or padding, but those zeroed positions would still receive attention weights.
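To make the concern concrete, here is a minimal sketch in plain Python. It is not Audiocraft's code: the helper `attention_weights`, the toy vectors, and the boolean `cond_mask` layout (True = real token, False = padding) are all assumptions made for illustration, and the usual score scaling is omitted. It shows that a key that has been zeroed out still gets a dot-product score of 0, so softmax assigns it a nonzero weight unless the mask is also applied to the scores.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys, mask=None):
    # Hypothetical helper: unscaled dot-product attention for one query.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    if mask is not None:
        # Masked positions get -inf so that softmax gives them weight 0.
        scores = [s if keep else float("-inf") for s, keep in zip(scores, mask)]
    return softmax(scores)


query = [1.0, 0.5]
# The last key stands for a padded condition position that was zeroed out.
keys = [[0.2, 0.1], [0.4, -0.3], [0.0, 0.0]]
cond_mask = [True, True, False]  # assumed layout: False marks padding

unmasked = attention_weights(query, keys)
masked = attention_weights(query, keys, cond_mask)

print("without mask:", unmasked)  # the zeroed position still has weight > 0
print("with mask:   ", masked)    # the zeroed position has weight 0
```

In this toy setup, zeroing the conditioning alone does not remove the padded position from the softmax; only masking the scores does. Whether this matters in practice for the condition fuser is the open question here.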