slmdev / sac

state-of-the-art lossless audio compression

Segfault when decoding a sac file #1

Status: closed by T-3B 1 year ago

T-3B commented 2 years ago

Hi! I'm really interested in SAC compression and wanted to run a benchmark.

Unfortunately, one file (out of 40) produces a segfault when I try to decode it.

Here is the command used to compress/encode:

$ sac --encode --high --optimize=high --sparse-pcm "Through The Fire And Flames.wav" "Through The Fire And Flames.sac"
Sac v0.6.2 - Lossless Audio Coder (c) Sebastian Lehmann
compiled on Aug 11 2022 (64-bit) with gcc 10.2.1

Open: 'Through The Fire And Flames.wav': ok (5255158 Bytes)
  WAVE  Codec: PCM (1411 kbps)
  44100Hz 16 Bit  Stereo
  1313705 Samples [00:00:29.789]
Create: 'Through The Fire And Flames.sac': ok
  Profile: High (optimize high)
dds: 500
[0.997616 0.264084 0.0407819 0.624542 0.0832928 0.342506 0.999445 0.824473 0.591132 0.771204 0.00365054 0.935802 0.999365 0.852348 0.0945611 0.423981 0.707027 0.317633 0.998731 0.984577 0.3187 0.612117 0.00260699 0.858187 9.63344 20.7874 23.9209 6.23615 959.144 227.834 25.6977 2004.82 213.606 19.6016 0.0945232 0.787564 7.26226 22.378 6.42737 0.985907 0.985537 0.762197 0.662538 ]
mapsize: 2932 Bytes
block: normal
mapsize: 2876 Bytes
block: normal
dds: 500/1313705:  26.9%
[0.998612 0.179034 0.0631693 0.676037 0.0832928 0.144737 0.999504 0.404512 0.514961 0.49819 0.00390351 0.917745 0.999811 1.86357 0.103148 0.373901 0.790674 0.0471979 0.998485 0.756759 0.433787 0.412985 0.00342562 0.834611 9.08007 20.7874 16.7946 8.02478 981.81 218.589 26.4926 1851.08 226.902 19.6016 0.0635306 0.462137 5.62966 19.0308 6.68095 0.985033 0.980185 0.645448 0.826904 ]
mapsize: 2835 Bytes
block: normal
mapsize: 2810 Bytes
block: normal
dds: 500/1313705:  53.7%
[0.999444 1.09331 0.214619 0.594424 0.0163536 0.0653821 0.9983 0.756148 0.524155 0.411697 0.00593637 0.894272 0.998427 1.29892 0.0918146 0.536689 0.313218 0.0400396 0.999809 0.8668 0.883432 0.0289241 0.0049912 0.837639 10.8691 16.1719 14.0589 6.48535 1214.86 236.333 28.3631 1636.25 229.473 22.019 0.346125 1.06652 2.15272 7.85379 4.14267 0.985434 0.980139 0.575677 0.657806 ]
mapsize: 2794 Bytes
block: normal
mapsize: 2805 Bytes
block: normal
dds: 5000/1313705:  80.6%
[0.998623 1.15778 0.170575 0.683924 0.00858799 0.0227322 0.998635 0.743229 0.263777 0.60374 0.00354549 0.871539 0.997216 0.653776 0.284617 0.781993 0.371681 0.175186 0.997031 0.965444 0.897423 0.122481 0.00378013 0.820077 9.66455 9.50128 13.5278 4.20717 1747.78 247.405 26.8678 1512.42 234.837 18.3872 0.119361 0.339624 0.663289 16.3307 6.64132 0.988029 0.981501 0.846586 0.968596 ]
mapsize: 3058 Bytes
block: normal
mapsize: 3059 Bytes
block: normal
  Timing: pred 99.41%, enc 0.59%, misc 0.00%

  5255158->3922623=74.6% (11.944 bps)

Total time: [01:49:09]

When I try to get the WAV file back (decoding the SAC file), I get:

$ ~/Programmes/sac --decode "Through The Fire And Flames.sac" "Through The Fire And Flames_COPY.wav"
Sac v0.6.2 - Lossless Audio Coder (c) Sebastian Lehmann
compiled on Aug 11 2022 (64-bit) with gcc 10.2.1

Open: 'Through The Fire And Flames.sac': ok (3922623 Bytes)
  WAVE  Codec: PCM (1053 kbps)
  44100Hz 16 Bit  Stereo
  1313705 Samples [00:00:29.789]
  Profile: High
  Ratio:   11.944 bits per sample

Create: 'Through The Fire And Flames_COPY.wav': ok
Erreur de segmentation

Where "Erreur de segmentation" means "segmentation fault".

You can find both the original and the encoded file here: https://mega.nz/folder/xnBx2KqY#0N2zN6bjQ6A6IoIB5qe2Bg

Feel free to ask for other encodes of that WAV file (i.e. with different command-line arguments). I hope it will be easy to debug.

MartinEesmaa commented 1 year ago

Hello, @T-3B. I tested your SAC-encoded file, and it gives me a segmentation fault as well. The original audio file contains the following metadata:

General
Complete name                            : Through The Fire and Flames.wav
Format                                   : Wave
File size                                : 5.01 MiB
Duration                                 : 29 s 789 ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 1 411 kb/s
Album                                    : Inhuman Rampage
Track name                               : Through The Fire And Flames
Performer                                : DragonForce
Director                                 : DragonForce
Original source form/Name                : Inhuman Rampage
S                                        : clip // Could cause the error when decoding the SAC file

Audio
Format                                   : PCM
Format settings                          : Little / Signed
Codec ID                                 : 1
Duration                                 : 29 s 789 ms
Bit rate mode                            : Constant
Bit rate                                 : 1 411.2 kb/s
Channel(s)                               : 2 channels
Sampling rate                            : 44.1 kHz
Bit depth                                : 16 bits
Stream size                              : 5.01 MiB (100%)

One possible problem is that the SAC encoder may not support unknown or VLC-customized metadata (for example the "S : clip" field above), which could lead to the error on decode. A workaround is to remove the unknown metadata with FFmpeg or fre:ac, or to edit the WAV file with your audio tagging software. Note that with FFmpeg some other metadata may be lost along with the unknown fields:

ffmpeg -i youraudio.wav -fflags +bitexact -c:a pcm_s16le audio_clean.wav

To remove all metadata:

ffmpeg -i youraudio.wav -c:a pcm_s16le -fflags +bitexact -flags:v +bitexact -flags:a +bitexact -map_metadata -1 nometadata.wav

Please note that editing a WAV file's metadata with VLC could cause this error with SAC. Use FFmpeg to add the metadata correctly so that the SAC file can be decoded:

ffmpeg -i youraudiofile.wav -c:a pcm_s16le \
-fflags +bitexact -flags:v +bitexact -flags:a +bitexact \
-map_metadata -1 \
-metadata title="Through The Fire And Flames" \
addedmetadata.wav

Metadata keys accepted by FFmpeg: title, album, artist, track, genre, language, copyright, date and comment

If you have any questions or problems, please reply to me.

Thank you! :)

T-3B commented 1 year ago

I'm closing this, thank you for your reply :smile: Ideally the program should warn the user (or something) before encoding the file.

I wish this project was not dead, but there are other problems (as you said on Discord - i.e. most of the decoded files do not match the original ones).

When I have more time (not soon), I'll probably try to make the "codec" really lossless, which seems quite difficult to do :sweat: :laughing: .

I have already read most of the source files, and I discovered that --optimize=insane is allowed (better compression but far, far slower).

MartinEesmaa commented 1 year ago

You're welcome, @T-3B! I know it seems difficult: making a lossless audio codec requires research, prototyping and advanced programming skills.

Some SAC files end up with incorrect bits after encoding, so the MD5 of the decoded WAV may not match the MD5 of the original WAV. Always check that the MD5 of the WAV decoded from SAC matches the MD5 of the original WAV.

I agree, optimizing with insane makes encoding very slow; it takes about a couple of days.

Have a good day!

slmdev commented 1 year ago

@MartinEesmaa @T-3B Decoded WAVE files should be bit-for-bit identical to the originals, including all metadata, so there is no need to use ffmpeg; just do a direct binary file compare.
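
A direct binary compare like the one suggested above can be sketched in a few lines of Python. This is only an illustration and not part of SAC; the helper names `sha256_of` and `first_difference` are hypothetical, and the file names in the usage note are taken from the commands earlier in this thread.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first mismatch, or None if the files are identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if block_a != block_b:
                # Locate the exact mismatching byte inside this block.
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # Common prefix matches, so one file is simply shorter.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                return None  # both files ended without any mismatch
            offset += len(block_a)
```

Calling `first_difference("Through The Fire And Flames.wav", "Through The Fire And Flames_COPY.wav")` would return `None` for a truly lossless round trip; any integer result is the offset where the decoded file first deviates, which can help tell a header or metadata problem from a sample-data problem.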

I looked into the above file and the id3 tag is detected (see screenshot), but there seems to be some error on extraction. I'll have to investigate this.

image

T-3B commented 1 year ago

Thank you!

I was just wondering: is the metadata also compressed in the SAC file?

slmdev commented 1 year ago

Metadata is not compressed but copied into the SAC file and restored upon decompression. You can check the Chunks class in https://github.com/slmdev/sac/blob/master/src/file/wav.h

slmdev commented 1 year ago

Metadata extraction fails when the chunk-size is odd.

Line 41 in https://github.com/slmdev/sac/blob/master/src/file/wav.cpp

Fix is already applied locally, will push later today.
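
For readers unfamiliar with why an odd chunk size matters: in the RIFF container used by WAV files, every chunk starts on an even offset, so a chunk whose payload has an odd length is followed by one pad byte that is not counted in its size field. A parser that skips exactly `size` bytes then lands one byte early and misreads the next chunk header. The sketch below only illustrates that rule in Python; it is not SAC's code, the function name `iter_riff_chunks` is hypothetical, and it makes no claim about what the actual fix in wav.cpp looks like.

```python
import struct


def iter_riff_chunks(data):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE byte string."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # skip "RIFF", the 4-byte RIFF size and "WAVE"
    while pos + 8 <= len(data):
        # Each chunk header is a 4-byte ID plus a little-endian 32-bit size.
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        start = pos + 8
        yield chunk_id, data[start:start + size]
        # Odd-sized payloads are followed by one pad byte that the
        # size field does not include, so skip it to stay aligned.
        pos = start + size + (size & 1)
```

Dropping the `+ (size & 1)` term reproduces the class of failure described above: after an odd-sized metadata chunk, the next header is read from the wrong offset.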

slmdev commented 1 year ago

https://github.com/slmdev/sac/commit/32cc872a0508a78baf9d7bbcb1852771b8d9e5f5