beetbox / audioread

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
MIT License

gstdec backend adds delay at the start when decoding a .mp3 #105

Open MZehren opened 4 years ago

MZehren commented 4 years ago

Hello,

When I use the decode.py script to decode an MP3 to WAV, the generated .wav file has 33 ms of silence prepended at the start. As a result, the original .mp3 and the decoded .wav are not identical.

The output of the script is:

Input file: 2 channels at 44100 Hz; 452.7 seconds. Backend: gstdec

A ffprobe of the original .mp3 returns:

$ ffprobe Black\ Blood\ -\ Aiea\ Mwana\ (T.Kolai\ Special\ Edit).mp3
ffprobe version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2007-2019 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)
  configuration: --prefix=/usr --extra-version=0ubuntu0.18.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Input #0, mp3, from 'Black Blood - Aiea Mwana (T.Kolai Special Edit).mp3':
  Metadata:
    NITR            : NTKB?&?
    title           : Aiea Mwana
    artist          : Black Blood
    comment         : Afro-latino; xHD
    encoder         : Lavf57.83.100
    TKEY            : 11m
    TBPM            : 122
  Duration: 00:07:32.68, start: 0.025057, bitrate: 128 kb/s
    Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 128 kb/s
    Metadata:
      encoder         : Lavc57.10

I don't have this issue when reading files with the Madmom library, which I believe relies on an ffmpeg backend. Where does this silence come from?

sampsyo commented 4 years ago

Huh, that's interesting! It would be worth digging into more deeply, but at the moment, I don't have an intuition for why GStreamer might be doing this. It might require some deeper GStreamer knowledge than I have to help explain it…

lazka commented 2 years ago

GStreamer doesn't support gapless decoding for MP3, which means there can be (very short) extra silence at the beginning and end of streams compared to other tools.
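
For anyone who needs to work around this, here is a rough sketch of how the leading padding could be measured and dropped after decoding. It assumes the decoded data is raw 16-bit little-endian signed PCM with interleaved channels, which is what audioread documents for its buffers. The helper names (`ms_to_frames`, `leading_silent_frames`, `trim_leading_frames`) are made up for this example and are not part of audioread or GStreamer. It also cannot tell decoder padding apart from digital silence that is genuinely part of the track, so treat the result as an estimate.

```python
import array
import sys

BYTES_PER_SAMPLE = 2  # assumption: 16-bit PCM, as audioread documents


def ms_to_frames(ms, samplerate):
    """Convert a duration in milliseconds to a whole number of frames."""
    return round(ms * samplerate / 1000)


def leading_silent_frames(pcm, channels, threshold=0):
    """Count frames at the start whose samples all have |value| <= threshold.

    `pcm` is interleaved 16-bit little-endian signed PCM as bytes.
    """
    samples = array.array("h")
    assert samples.itemsize == BYTES_PER_SAMPLE
    # Ignore a trailing odd byte, if any, so frombytes() does not raise.
    samples.frombytes(pcm[: len(pcm) - len(pcm) % BYTES_PER_SAMPLE])
    if sys.byteorder == "big":
        samples.byteswap()  # the data is little-endian
    frame_count = len(samples) // channels
    for i in range(frame_count):
        frame = samples[i * channels:(i + 1) * channels]
        if any(abs(s) > threshold for s in frame):
            return i
    return frame_count


def trim_leading_frames(pcm, channels, frames):
    """Return `pcm` with the first `frames` frames removed."""
    return pcm[frames * channels * BYTES_PER_SAMPLE:]
```

With audioread, `pcm` could be built by joining the buffers yielded from `audioread.audio_open(path)`, using the file object's `channels` and `samplerate` attributes. At 44100 Hz, the 33 ms reported above corresponds to `ms_to_frames(33, 44100)`, about 1455 frames. Whether trimming exactly that many frames reproduces the output of other decoders is something to verify against your own files.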