descriptinc / descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
https://descript.notion.site/Descript-Audio-Codec-11389fce0ce2419891d6591a68f814d5
MIT License

byte count on 16kHz decoding #37

Open lonce opened 1 year ago

lonce commented 1 year ago

Hi, I am getting an error on decoding when I use "16khz". For my two-second files, the original length is 32000 samples, but the reconstruction comes up 8 samples short:

File "/home/lonce/working/descript-audio-codec/dac/model/base.py", line 289, in decompress
    recons.audio_data = recons.audio_data.reshape(
RuntimeError: shape '[-1, 1, 32000]' is invalid for input of size 31992

I can "fix" the error by hard-coding the length argument of the reshape operation (line 289 in base.py) to 31992. For a general fix, I suppose the reshape should be given the length of the reconstructed signal, not the original signal. Otherwise, the reconstruction should produce exactly the same number of samples as the original file.
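A general workaround along these lines is to trim or zero-pad the reconstruction to the original length instead of hard-coding a number. The sketch below is only an illustration: `match_length` is a hypothetical helper, not part of the dac package, and it works on a plain Python sequence of samples. The same idea would apply to the last axis of a tensor before the reshape.

```python
def match_length(samples, target_length, pad_value=0.0):
    """Return a copy of `samples` with exactly `target_length` items.

    Hypothetical helper (not part of dac): trims extra samples from the
    end, or appends `pad_value` when the signal is too short.
    """
    samples = list(samples)
    if len(samples) >= target_length:
        # Too long (or already exact): keep the first target_length samples.
        return samples[:target_length]
    # Too short: pad the tail up to the requested length.
    missing = target_length - len(samples)
    return samples + [pad_value] * missing


# A reconstruction that came back 8 samples short of the 32000-sample original.
recon = [0.5] * 31992
fixed = match_length(recon, 32000)
print(len(fixed))
```

Zero-padding the tail makes the reshape succeed, but it does not recover the missing audio, so whether this is acceptable depends on how the output is used.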

This is using code pulled from github on 2023.08.17.

p.s. Nice work on the codec - it sounds great, the compression is amazing, and the git docs easy to understand!

lonce commented 1 year ago

[Screenshot: Screen-2023-08-21_15-08-19]

Just a bit more information to illustrate the issue. You can see that the original signal is 32000 samples, but after encoding/decoding, the length is 31992.

Thanks again.

lonce commented 1 year ago

I just noticed this only happens with model.encode/decode (and with dac encode / dac decode), not with model.compress/decompress.

barneymaydance commented 11 months ago

I have encountered a similar issue when working with a 5-second audio track at 24kHz. There appears to be a consistent 8-sample loss, which leads me to suspect that the decoder in the model is not adequately padded.
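One way to check the padding hypothesis is to trace the signal length through a stack of strided convolutions and the matching transposed convolutions, using the standard output-length formulas for 1-D convolution and transposed convolution (dilation 1, no output padding). The layer settings below are assumptions made for illustration and are not confirmed anywhere in this thread: strides (2, 4, 5, 8), kernel size `2 * stride`, and padding `ceil(stride / 2)` in both encoder and decoder. All function names are hypothetical.

```python
import math

# ASSUMPTION: these strides and the kernel/padding rule below are
# illustrative settings, not values confirmed in this thread.
ASSUMED_STRIDES = (2, 4, 5, 8)


def conv_out_len(length, stride):
    """Output length of a 1-D convolution (dilation 1)."""
    kernel = 2 * stride
    padding = math.ceil(stride / 2)
    return (length + 2 * padding - (kernel - 1) - 1) // stride + 1


def conv_transpose_out_len(length, stride):
    """Output length of a 1-D transposed convolution (no output padding)."""
    kernel = 2 * stride
    padding = math.ceil(stride / 2)
    return (length - 1) * stride - 2 * padding + (kernel - 1) + 1


def round_trip_len(length, strides):
    """Length after downsampling by `strides`, then upsampling in reverse."""
    for s in strides:
        length = conv_out_len(length, s)
    for s in reversed(strides):
        length = conv_transpose_out_len(length, s)
    return length


print(round_trip_len(32000, ASSUMED_STRIDES))   # 31992
print(round_trip_len(120000, ASSUMED_STRIDES))  # 119992
```

Under these assumptions, the odd stride 5 is where the loss originates: the transposed convolution returns `5 * L - 1` samples, and the later upsampling stages (x4, then x2) turn that single missing sample into 8. With even strides only, the same formulas give back the original length, which would be consistent with a fixed 8-sample loss that does not depend on the clip duration.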

Stanwang1210 commented 10 months ago

I encountered the same issue: also an 8-sample loss, for 1-second audio. Has anyone solved this without hard-coding?

heilrahc commented 1 week ago

Any update? It's stupid that after I compress the audio, using the same model, I can't decompress it because of the length input mismatch error. @pseeth @eeishaan