descriptinc / descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5
MIT License

Seeking Guidance on Integrating DAC into LLM Training #77

Open lixucuhk opened 4 months ago

lixucuhk commented 4 months ago

I have been trying to integrate DAC codes into LLM training, but I have struggled to get satisfactory predictions from LLM-style models such as VALL-E. Has anyone, including the authors, done this successfully? I would greatly appreciate any suggestions or guidance.
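For context, one common way to feed multi-codebook codec tokens to a single-stream language model is to flatten them into one sequence, giving each codebook its own ID range. The sketch below is only an illustration of that idea and is not something confirmed by anyone in this thread. The helper names `flatten_codes` and `unflatten_codes` are hypothetical, and the layout `codes[k][t]` (codebook `k`, frame `t`) and the codebook size are assumptions to be checked against the codec configuration actually in use.

```python
from typing import List


def flatten_codes(codes: List[List[int]], codebook_size: int) -> List[int]:
    """Interleave per-codebook tokens frame by frame into one sequence.

    Assumes codes[k][t] is the token of codebook k at frame t.
    Token IDs of codebook k are shifted by k * codebook_size so that
    every codebook occupies a disjoint range of the LM vocabulary.
    """
    n_codebooks = len(codes)
    n_frames = len(codes[0])
    flat = []
    for t in range(n_frames):
        for k in range(n_codebooks):
            flat.append(codes[k][t] + k * codebook_size)
    return flat


def unflatten_codes(
    flat: List[int], n_codebooks: int, codebook_size: int
) -> List[List[int]]:
    """Invert flatten_codes, recovering the codes[k][t] layout."""
    n_frames = len(flat) // n_codebooks
    codes = [[0] * n_frames for _ in range(n_codebooks)]
    for i, token in enumerate(flat):
        t, k = divmod(i, n_codebooks)
        codes[k][t] = token - k * codebook_size
    return codes


# Toy example: 3 codebooks, 2 frames, assumed codebook size of 10.
toy_codes = [[1, 2], [3, 4], [5, 6]]
toy_flat = flatten_codes(toy_codes, codebook_size=10)
```

Flattening multiplies the sequence length by the number of codebooks, so it trades longer sequences for a simpler single-stream model. Whether it helps with the prediction quality described above is an open question.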

DBraun commented 4 months ago

DAC was used in VampNet and also as a baseline compared to EnCodec in MusicGen. I haven't seen an open-source TTS paper use DAC.

lixucuhk commented 4 months ago

Many thanks Braun!