sandreas / m4b-tool

m4b-tool is a command line utility to merge, split and chapterize audiobook files such as mp3, ogg, flac, m4a or m4b
MIT License

Add support for chapter thumbnails #21

Closed mperkh closed 5 years ago

mperkh commented 5 years ago

I don't think this is currently possible with m4b-tool, but it would be a nice feature if it can be integrated. I guess it could be done using FFmpeg or mp4v2. Tested with VLC (iOS and Android) and the iTunes/iOS Books app: these apps can display the thumbnail images. VLC on Android only offers chapter navigation if such a thumbnail track is present, I guess because it then treats the file as a video and handles it differently.

Proposal:

  1. If an image is embedded in the source audio file, it should be included as a thumbnail in the chapter thumbnail track.
  2. If no image is embedded in the source file, a search for filename.[jpg, png]* should be done, and if an image is found, it should be used as the chapter thumbnail.
  3. If creation of chapter thumbnails is activated and no image for a track is found via 1. or 2., a default image or a black image should be used. (Not sure about this; perhaps creating this slideshow track via FFmpeg offers a way to specify the time range during which an image is shown.)
  4. Since most players expect these thumbnail and cover images to be square, it should be possible to crop them or to fit them into an empty image with a given background color.
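
A minimal sketch of how steps 2 to 4 of the proposal might look, written in Python purely for illustration (m4b-tool does not implement this). The function names, the list of extensions and the centre-crop strategy are assumptions, not part of the proposal; step 1 (reading an embedded image) is left out.

```python
from pathlib import Path
from typing import Optional, Tuple

# Assumption: the proposal only names jpg and png; .jpeg is added for convenience.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def find_chapter_image(audio_file: Path,
                       default_image: Optional[Path] = None) -> Optional[Path]:
    """Steps 2 and 3: look for an image next to the audio file, else fall back."""
    for ext in IMAGE_EXTENSIONS:
        candidate = audio_file.with_suffix(ext)  # e.g. 01.mp3 -> 01.jpg
        if candidate.is_file():
            return candidate
    # No sidecar image found: use the default (may be None if none is configured).
    return default_image


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Step 4: (left, top, right, bottom) of the largest centred square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)
```

The crop box uses the common (left, top, right, bottom) convention, so it could be handed to whatever image library does the actual cropping. Fitting the image into a square canvas with a background color, the other option named in step 4, would need the padding offsets instead.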

Perhaps it can be done like this: https://trac.ffmpeg.org/wiki/Slideshow (edit: this seems to create just an h264 video track, not a JPEG track). Somehow it seems to be necessary to tell the mp4 tracks/container that this is a menu track for the main track. I haven't found any info on that topic yet and was not able to produce anything useful with FFmpeg by hand.

mediainfo output of the chapter thumbnail track (created via macOS app Audiobook Builder):

Video
ID                                       : 2
Format                                   : JPEG
Codec ID                                 : jpeg
Duration                                 : 7 h 15 min
Bit rate mode                            : Variable
Bit rate                                 : 204 b/s
Width                                    : 512 pixels
Height                                   : 512 pixels
Display aspect ratio                     : 1.000
Frame rate mode                          : Variable
Frame rate                               : 0.001 FPS
Minimum frame rate                       : 0.001 FPS
Maximum frame rate                       : 0.005 FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Compression mode                         : Lossy
Bits/(Pixel*Frame)                       : 0.778
Stream size                              : 822 KiB (1%)
Title                                    : Apple Video Mediensteuerung / Apple Alias-Datensteuerung
Language                                 : English
Encoded date                             : UTC 2018-11-19 20:30:36
Tagged date                              : UTC 2018-11-19 20:30:36
Menu For                                 : 1

mp4info output:

2   video   jpeg, 26132.996 secs, 0 kbps, 512x512 @ 0.001263 fps
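
As a rough sanity check, the duration and frame rate in the mp4info line above imply how many still images the track holds. The snippet below is only that arithmetic; reading the result as "one image per chapter" is an assumption, since the chapter count of the file is not given in this thread.

```python
# Values copied from the mp4info line above.
duration_secs = 26132.996   # about 7 h 15 min, matching the mediainfo output
frames_per_sec = 0.001263

# Frames = duration * frame rate; for a still-image track this is the image count.
estimated_frames = duration_secs * frames_per_sec
print(round(estimated_frames))  # prints 33
```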

ffmpeg -i output (Stream #0:1 seems to be the chapter thumb track. The others seem to be metainfo text and the cover image):

    Metadata:
      creation_time   : 2018-11-19T20:30:03.000000Z
      handler_name    : Apple Ton Mediensteuerung
    Stream #0:1(eng): Video: mjpeg (jpeg / 0x6765706A), yuvj420p(pc, bt470bg/unknown/unknown), 512x512 [SAR 72:72 DAR 1:1], 0 kb/s, 0.0013 fps, 600 tbr, 600 tbn, 600 tbc (default)
    Metadata:
      creation_time   : 2018-11-19T20:30:36.000000Z
      handler_name    : Apple Video Mediensteuerung
      encoder         : Foto - JPEG
    Stream #0:2(eng): Data: bin_data (text / 0x74786574)
    Metadata:
      creation_time   : 2018-11-19T20:30:36.000000Z
      handler_name    : Apple Text-Mediensteuerung
    Stream #0:3: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 1022x1022 [SAR 144:144 DAR 1:1], 90k tbr, 90k tbn, 90k tbc

sandreas commented 5 years ago

This could be done via mp4art, since it can add more than one image. I'll take a look at this, but it could take a while, since I've got a big project going on, which you might like more ;)

andreas:homebrew-tap andreas$ mp4art -h
Usage: mp4art [OPTION]... ACTION file...

For each mp4 (m4a) file specified, perform the specified ACTION. An action
must be specified. Some options are not applicable for some actions.

ACTIONS
     --list           list all covr-boxes
     --add IMG        add covr-box from IMG file
     --replace IMG    replace covr-box with IMG file
     --remove         remove covr-box
     --extract        extract covr-box

ACTION PARAMETERS
     --art-any        act on all covr-boxes (default)
     --art-index IDX  act on covr-box index IDX

OPTIONS
 -z, --optimize       optimize mp4 file after modification
 -y, --dryrun         do not actually create or modify any files
 -k, --keepgoing      continue batch processing even after errors
 -o, --overwrite      overwrite existing files when creating
 -f, --force          force overwrite even if file is read-only
 -q, --quiet          equivalent to --verbose 0
 -d, --debug NUM      increase debug or long-option to set NUM
 -v, --verbose NUM    increase verbosity or long-option to set NUM
 -h, --help           print brief help or long-option for extended help
     --version        print version information and exit

mperkh commented 5 years ago

Thanks for the reply. Tried to find more info on that topic. It's not possible to do that with mp4art.

This seems to be a feature introduced by Apple, called "Enhanced podcast": https://en.wikipedia.org/wiki/Enhanced_podcast

I was able to create an mjpeg track using this FFmpeg command:

ffmpeg -f concat -i files.txt -vcodec mjpeg -f mov -q:v 15 -huffman optimal -r 0.5 output.mp4

with files.txt

file '001.jpg'
duration 5
file '002.jpg'
duration 5
file '003.jpg'
duration 5
file '003.jpg'

The above command creates an mp4 file with a track whose codec ID is jpeg. But it is still unclear how, and with which tool, this track can be muxed into the final m4b with the appropriate "Menu" and "Menu For" tags set.
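
For more than a handful of images, the files.txt listing above could be generated instead of written by hand. This is a small sketch, assuming one (image, display time) pair per chapter; the function name is hypothetical. It repeats the last image without a duration, as in the listing above, which the FFmpeg slideshow wiki describes as a workaround for a quirk of the concat demuxer. File names containing a single quote would need escaping, which this sketch does not handle.

```python
from typing import Iterable, List, Tuple


def build_concat_list(images: Iterable[Tuple[str, float]]) -> str:
    """Return the text of a concat list for (filename, seconds) pairs."""
    items = list(images)
    lines: List[str] = []
    for filename, seconds in items:
        # Format with millisecond precision, then drop trailing zeros (5.000 -> 5).
        duration = f"{seconds:.3f}".rstrip("0").rstrip(".")
        lines.append(f"file '{filename}'")
        lines.append(f"duration {duration}")
    if items:
        # Last image is listed a second time, without a duration.
        lines.append(f"file '{items[-1][0]}'")
    return "".join(line + "\n" for line in lines)


print(build_concat_list([("001.jpg", 5), ("002.jpg", 5), ("003.jpg", 5)]), end="")
```

With the three example entries this reproduces the files.txt shown above; real chapter lengths would simply replace the 5-second durations.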

sandreas commented 5 years ago

Well, when merging e.g. the files 01.mp3, 02.mp3 and 03.mp3, this could be possible:

Merge the following list:

01-cover.m4b
01.m4b
02-cover.m4b
02.m4b
03-cover.m4b
03.m4b

But I'm not sure what ffmpeg produces here. Besides, a combination of fdkaac and chapter covers could be difficult to handle.

sandreas commented 5 years ago

I'm really sorry to close this one unresolved, but it has been open for a very long time and I have not found a way to solve it, so it makes no sense to keep it open indefinitely. If there is new information, it will be reopened.