nypublicradio / audiogram

Turn audio into a shareable video.
MIT License

Audiograms include 81m11s long "Apple Text" component #89

Open iankevinmcdonald opened 7 years ago

iankevinmcdonald commented 7 years ago

After I had used Audiogram successfully for months to produce half-minute videos, it has increasingly often produced videos that are 81m11s long.

Interrogating the MP4 files, I see that the 81m files have these two components, and the files which are the correct length don't. You'll notice that some chapter marks that only make sense for my full podcast have been exported from my audio editor (Hindenburg) into the clip MP3 file, and then imported again by Audiogram; but the strange thing is that this is intermittent. At the end of this, I'll paste the full Mediainfo of two Audiograms created from the same MP3 file, one that works and one that doesn't.

Curiously, the "long" Audiograms are accepted by Facebook, but not by Instagram.

Text
ID                               : 3
Format                           : Apple text
Codec ID                         : text
Duration                         : 1h 21mn
Bit rate mode                    : Variable
Bit rate                         : 0
Stream size                      : 113 Bytes (0%)
Language                         : English

Menu
00:01:07.316                     : USA vs Slavery
00:11:35.148                     : India
00:19:06.857                     : China
00:21:46.280                     : Europe
00:37:56.852                     : International
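A check like the one above can be scripted instead of read off MediaInfo by eye. This is a minimal sketch, assuming `ffprobe` is installed and on the PATH; the helper names `probe` and `overlong_streams`, the one-second tolerance, and the `SAMPLE_REPORT` values (including the `data` codec type for the text track) are illustrative assumptions modelled on the durations reported in this issue, not output captured from the actual files.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe on a file and return its JSON report as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-show_chapters",
         "-of", "json", path],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def overlong_streams(report, tolerance=1.0):
    """List (index, codec_type, duration) for streams that outlast the video.

    A stream counts as overlong when its duration exceeds the longest
    video stream's duration by more than `tolerance` seconds.
    """
    streams = report.get("streams", [])
    video = [float(s["duration"]) for s in streams
             if s.get("codec_type") == "video" and "duration" in s]
    if not video:
        return []
    limit = max(video) + tolerance
    return [(s["index"], s.get("codec_type"), float(s["duration"]))
            for s in streams
            if "duration" in s and float(s["duration"]) > limit]


# Hypothetical report shaped like ffprobe's JSON; 4871 s is 81m11s.
SAMPLE_REPORT = {
    "streams": [
        {"index": 0, "codec_type": "video", "duration": "25.650000"},
        {"index": 1, "codec_type": "audio", "duration": "25.685000"},
        {"index": 2, "codec_type": "data", "duration": "4871.000000"},
    ]
}

if __name__ == "__main__":
    print(overlong_streams(SAMPLE_REPORT))
```

On a real file, `overlong_streams(probe("audiogram.mp4"))` (file name hypothetical) would be the call to try; an empty list would suggest that no stream runs past the video.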

Version: Audiogram 0.9.5
Uname -a: Linux orwell 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:41:14 UTC 2012 i686 i686 i386 GNU/Linux

Full Mediainfo Output for two Audiograms created from same clip & bg image.

Expected:

General
Complete name                    : C:\Users\Ian\Downloads\VegHist Sq v 01 - 2017-06-15 at 9.44pm - LDN.mp4
Format                           : MPEG-4
Format profile                   : Base Media
Codec ID                         : isom
File size                        : 3.50 MiB
Duration                         : 25s 225ms
Overall bit rate                 : 1 163 Kbps
Writing application              : Lavf57.60.100

Video
ID                               : 1
Format                           : AVC
Format/Info                      : Advanced Video Codec
Format profile                   : High@L3.1
Format settings, CABAC           : Yes
Format settings, ReFrames        : 4 frames
Codec ID                         : avc1
Codec ID/Info                    : Advanced Video Coding
Duration                         : 25s 200ms
Duration_LastFrame               : -12ms
Bit rate mode                    : Variable
Bit rate                         : 1 029 Kbps
Width                            : 800 pixels
Height                           : 800 pixels
Display aspect ratio             : 1.000
Frame rate mode                  : Constant
Frame rate                       : 20.000 fps
Color space                      : YUV
Chroma subsampling               : 4:2:0
Bit depth                        : 8 bits
Scan type                        : Progressive
Bits/(Pixel*Frame)               : 0.080
Stream size                      : 3.09 MiB (88%)
Writing library                  : x264 core 120 r2151 a3f4407
Encoding settings                : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=3 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=20 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=23.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00

Audio
ID                               : 2
Format                           : AAC
Format/Info                      : Advanced Audio Codec
Format profile                   : LC

Buggy:

General
Complete name                    : C:\Users\Ian\Downloads\VegHist Sq v 01 - EP13 map - 2017-05-19 at 11.07am.mp4
Format                           : MPEG-4
Format profile                   : Base Media
Codec ID                         : isom
File size                        : 3.42 MiB
Duration                         : 1h 21mn
Overall bit rate                 : 5 892 bps
Writing application              : Lavf57.60.100

Video
ID                               : 1
Format                           : AVC
Format/Info                      : Advanced Video Codec
Format profile                   : High@L3.1
Format settings, CABAC           : Yes
Format settings, ReFrames        : 4 frames
Codec ID                         : avc1
Codec ID/Info                    : Advanced Video Coding
Duration                         : 25s 650ms
Bit rate mode                    : Variable
Bit rate                         : 985 Kbps
Width                            : 800 pixels
Height                           : 800 pixels
Display aspect ratio             : 1.000
Frame rate mode                  : Constant
Frame rate                       : 20.000 fps
Color space                      : YUV
Chroma subsampling               : 4:2:0
Bit depth                        : 8 bits
Scan type                        : Progressive
Bits/(Pixel*Frame)               : 0.077
Stream size                      : 3.01 MiB (88%)
Writing library                  : x264 core 120 r2151 a3f4407
Encoding settings                : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x3:0x113 / me=hex / subme=7 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=3 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=250 / keyint_min=20 / scenecut=40 / intra_refresh=0 / rc_lookahead=40 / rc=crf / mbtree=1 / crf=23.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / ip_ratio=1.40 / aq=1:1.00

Audio
ID                               : 2
Format                           : AAC
Format/Info                      : Advanced Audio Codec
Format profile                   : LC
Codec ID                         : 40
Duration                         : 25s 685ms
Bit rate mode                    : Constant
Bit rate                         : 127 Kbps
Channel(s)                       : 2 channels
Channel positions                : Front: L R
Sampling rate                    : 48.0 KHz
Compression mode                 : Lossy
Stream size                      : 400 KiB (11%)

Text
ID                               : 3
Format                           : Apple text
Codec ID                         : text
Duration                         : 1h 21mn
Bit rate mode                    : Variable
Bit rate                         : 0
Stream size                      : 113 Bytes (0%)
Language                         : English

Menu
00:01:07.316                     : USA vs Slavery
00:11:35.148                     : India
00:19:06.857                     : China
00:21:46.280                     : Europe
00:37:56.852                     : International

veltman commented 7 years ago

Hmm, I'm not sure where to even begin debugging this. It certainly seems like the Hindenburg export settings would be the culprit. Does this ever occur with an MP3 exported from a different source?

In terms of it being "intermittent," are you sure the conditions are identical? Can you take the same file, run it twice, and get different results? Or is it two different files exported from Hindenburg? My best guess would be that whether your export includes that metadata depends on exactly how you export it or what time range you have selected.

Can you attach one of the mp3 files that produces the bad result?

This may be related:

https://trac.ffmpeg.org/ticket/476

iankevinmcdonald commented 7 years ago

I'll test a different source (and may simply re-export the same MP3 from Audacity).

The conditions were as identical as I could make them - those two MediaInfo outputs were from the same audio file and used the same pic. Here is the MP3 file I used for them:

ep13clip_tourStart.zip

iankevinmcdonald commented 7 years ago

I tried with a different MP3 file (again, from Hindenburg, but taken from an edit which didn't ISTR use chapter headings), and that gave me a correct 28s length Audiogram. Also, putting the offending audio files through Audacity (which strips the extraneous information whilst also decoding & recoding it) seems to fix it too.

And I can predictably reproduce the bug with the offending MP3 now. So I have a workaround, at least.
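The Audacity round trip decodes and re-encodes the audio. A lighter alternative might be to drop the chapter marks with ffmpeg before uploading the clip, since `-map_chapters -1` tells ffmpeg not to copy chapters to the output and `-c copy` avoids re-encoding. This is only a sketch: it assumes `ffmpeg` is on the PATH, the helper names and file names are hypothetical, and it has not been tested against the MP3 attached to this issue.

```python
import subprocess


def strip_chapters_cmd(src, dst):
    """Build an ffmpeg command that copies the audio but drops chapter marks."""
    return [
        "ffmpeg",
        "-i", src,
        "-map_chapters", "-1",  # do not carry chapters over to the output
        "-c", "copy",           # copy the stream as-is, no re-encode
        dst,
    ]


def strip_chapters(src, dst):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(strip_chapters_cmd(src, dst), check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the command line.
    print(" ".join(strip_chapters_cmd("clip.mp3", "clip_nochapters.mp3")))
```

If the cleaned MP3 then produces an Audiogram of the right length, that would support the theory that the imported chapter marks are what stretch the file to 1h 21mn.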