leandromoreira / ffmpeg-libav-tutorial

FFmpeg libav tutorial - learn how media works from basic to transmuxing, transcoding and more. Translations: 🇺🇸 🇨🇳 🇰🇷 🇪🇸 🇻🇳 🇧🇷
https://github.com/leandromoreira/ffmpeg-libav-tutorial
BSD 3-Clause "New" or "Revised" License
9.94k stars 956 forks

[WIP] Add the transcoding chapter #52

Closed leandromoreira closed 4 years ago

leandromoreira commented 4 years ago

tasks/issues to be done

The chapter will demonstrate how to transcode using libav/ffmpeg. Things that need to be done/fixed first:

long explanation

While writing this chapter I'm also trying to understand some issues I found, although the main examples seem to be okay (VLC plays the output fine):

# if you want to run them on your computer: make sure you have docker and
# have run the make fetch_small_bunny_video command to download the bunny video

make run_transcoding_gop

make run_transcoding_h265

But anyway here are my doubts:

leandromoreira commented 4 years ago

Vassilis helped me through the mailing list to fix the wrong time / frame rate transcoding problem (ffprobe was showing 60.11): you must explicitly set the frame->duration.
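
As a rough illustration of what "the duration" means here, the value is one frame period expressed in ticks of the stream time base. The sketch below is not the fix that was applied; `Rational` and `frame_duration_ticks` are hypothetical names standing in for libav's AVRational and for whatever assignment the real code performs. The input values (time base 1/15360, 60 fps) come from the stream information quoted later in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for libav's AVRational so the sketch builds without FFmpeg. */
typedef struct Rational {
    int num;
    int den;
} Rational;

/*
 * Duration of one frame in ticks of the stream time base:
 *   (1 / frame_rate) / time_base = (tb.den * fr.den) / (tb.num * fr.num)
 * The division is integer division, so a non-integral result is truncated.
 */
int64_t frame_duration_ticks(Rational time_base, Rational frame_rate)
{
    assert(time_base.num > 0 && time_base.den > 0);
    assert(frame_rate.num > 0 && frame_rate.den > 0);
    return ((int64_t)time_base.den * frame_rate.den) /
           ((int64_t)time_base.num * frame_rate.num);
}
```

With a 1/15360 time base and 60 fps this gives 256 ticks per frame, and a stream duration of 153600 ticks then corresponds to 600 frames, which matches the frame count ffmpeg reports for this file.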

leandromoreira commented 4 years ago

To resolve the "MB rate (125337600) > level limit (2073600)" message, I'm trying to explicitly configure the rate control as suggested by the great people at doom9, but without success so far.

+  av_opt_set(encoder_context->codec_context[index]->priv_data, "x264opts", "vbv-maxrate=3000000:vbv-bufsize=3000000:keyint=60:min-keyint=60:scenecut=-1", 0);
+
+  encoder_codec_context->rc_min_rate = 3000000;
+  encoder_codec_context->rc_max_rate = 3000000;
+  encoder_codec_context->rc_buffer_size = 3000000;
+  encoder_codec_context->rc_max_available_vbv_use = 3000000;
+  encoder_codec_context->rc_initial_buffer_occupancy = 3000000;
+
+  encoder_codec_context->profile = FF_PROFILE_H264_HIGH_422;
+  encoder_codec_context->level = 51;
leandromoreira commented 4 years ago

It seems that the x264 message "forced frame type (3) at 141 was changed to frame type (1)" is just a warning, and so is the x265 message "specified frame type (5) at 120 is not compatible with keyframe interval".
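
To make the numeric frame types in those two messages easier to read, here is a small lookup sketch. The table reflects the X264_TYPE_* values as I recall them from x264.h (x265.h uses the same values for 0 to 5); treat it as an assumption and check it against the headers you build with. `frame_type_name` is a hypothetical helper, not part of either library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Frame type codes, assumed to match X264_TYPE_* in x264.h.
 * Returns "UNKNOWN" for codes beyond the table.
 */
const char *frame_type_name(int type)
{
    static const char *const names[] = {
        "AUTO",     /* 0: let the encoder decide */
        "IDR",      /* 1 */
        "I",        /* 2 */
        "P",        /* 3 */
        "BREF",     /* 4: B-frame that is also used as a reference */
        "B",        /* 5 */
        "KEYFRAME", /* 6: x264 only */
    };
    size_t count = sizeof(names) / sizeof(names[0]);

    assert(type >= 0); /* the codes in the log messages are never negative */
    if ((size_t)type >= count)
        return "UNKNOWN";
    return names[type];
}
```

Under that assumption, the x264 message reads as "a forced P-frame at 141 was changed to an IDR frame", and the x265 message as "a B-frame was requested at 120, where the keyframe interval calls for a keyframe".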

leandromoreira commented 4 years ago

The message "MB rate (125337600) > level limit (2073600)" seems to be related to the profile/level setup.

leandromoreira commented 4 years ago

People at doom9 continue to help, and now I'm digging further to understand whether it's a timing issue; in fact, some timing parameters differ between the decoder and the encoder.


LOG: Video
LOG: =================================================
LOG: decoder
LOG:    AVFormatContext
LOG:        start_time=0 duration=10022000 bit_rate=2631572 start_time_realtime=0
LOG:    AVCodecContext
LOG:        bit_rate=2228686 ticks_per_frame=2 width=1920 height=1080 gop_size=12 keyint_min=25 sample_rate=0 profile=100 level=42
LOG:        avc->time_base=num/den 0/2
LOG:        avc->framerate=num/den 0/1
LOG:        avc->pkt_timebase=num/den 0/1
LOG:    AVStream
LOG:        index=0 start_time=0 duration=153600
LOG:        avs->time_base=num/den 1/15360
LOG:        avs->sample_aspect_ratio=num/den 1/1
LOG:        avs->avg_frame_rate=num/den 60/1
LOG:        avs->r_frame_rate=num/den 60/1
LOG: =================================================
LOG: Video
LOG: =================================================
LOG: encoder
LOG:    AVFormatContext
LOG:        start_time=0 duration=0 bit_rate=0 start_time_realtime=0
LOG:    AVCodecContext
LOG:        bit_rate=0 ticks_per_frame=1 width=1920 height=1080 gop_size=-1 keyint_min=-1 sample_rate=0 profile=-99 level=-99
LOG:        avc->time_base=num/den 1/15360
LOG:        avc->framerate=num/den 0/1
LOG:        avc->pkt_timebase=num/den 0/1
LOG:    AVStream
LOG:        index=0 start_time=0 duration=0
LOG:        avs->time_base=num/den 1/15360
LOG:        avs->sample_aspect_ratio=num/den 0/1
LOG:        avs->avg_frame_rate=num/den 0/0
LOG:        avs->r_frame_rate=num/den 0/0
LOG: =================================================

LOG: =================================================
LOG: Audio
LOG: =================================================
LOG: decoder
LOG:    AVFormatContext
LOG:        start_time=0 duration=10022000 bit_rate=2631572 start_time_realtime=0
LOG:    AVCodecContext
LOG:        bit_rate=393736 ticks_per_frame=1 width=0 height=0 gop_size=0 keyint_min=0 sample_rate=48000 profile=1 level=-99
LOG:        avc->time_base=num/den 1/48000
LOG:        avc->framerate=num/den 0/1
LOG:        avc->pkt_timebase=num/den 0/1
LOG:    AVStream
LOG:        index=1 start_time=0 duration=480000
LOG:        avs->time_base=num/den 1/48000
LOG:        avs->sample_aspect_ratio=num/den 0/1
LOG:        avs->avg_frame_rate=num/den 0/0
LOG:        avs->r_frame_rate=num/den 0/0
LOG: =================================================

LOG: =================================================
LOG: Audio
LOG: =================================================
LOG: encoder
LOG:    AVFormatContext
LOG:        start_time=0 duration=0 bit_rate=0 start_time_realtime=0
LOG:    AVCodecContext
LOG:        ->NULL
LOG:    AVStream
LOG:        index=1 start_time=0 duration=0
LOG:        avs->time_base=num/den 0/0
LOG:        avs->sample_aspect_ratio=num/den 0/1
LOG:        avs->avg_frame_rate=num/den 0/0
LOG:        avs->r_frame_rate=num/den 0/0
LOG: =================================================
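
The numbers in the "MB rate" warning can be checked against the log above with simple arithmetic. The sketch below counts 16x16 macroblocks per frame and multiplies by a frame rate; the function names are hypothetical. For 1920x1080 it gives 8160 macroblocks per frame, and 8160 * 15360 is exactly the 125337600 reported in the warning, while 8160 * 60 is only 489600. One possible reading (an assumption, not something confirmed in this thread) is that the encoder derived its frame rate from the 1/15360 time base instead of from 60 fps.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 16x16 macroblocks needed to cover a frame (dimensions rounded up). */
int64_t macroblocks_per_frame(int width, int height)
{
    assert(width > 0 && height > 0);
    return (int64_t)((width + 15) / 16) * ((height + 15) / 16);
}

/* Macroblocks per second for a frame rate given as fps_num / fps_den. */
int64_t macroblock_rate(int width, int height, int fps_num, int fps_den)
{
    assert(fps_num > 0 && fps_den > 0);
    return macroblocks_per_frame(width, height) * fps_num / fps_den;
}
```

If that reading is right, the message would point to the timing setup rather than to the profile/level values themselves, since the 60 fps rate stays well below the 2073600 limit quoted in the warning.
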
leandromoreira commented 4 years ago

keyint_min stayed at 31 even when I ran the command line (setting min-keyint to 60):

ffmpeg -y -i small_bunny_1080p_60fps.mp4 -c:a copy -c:v libx264 -x264opts keyint=60:min-keyint=60:scenecut=-1 -preset fast b_1s_gop.mp4

explanation of why keyint_min is 31 here
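
One explanation that fits the numbers: as far as I recall from the x264 source, the requested min-keyint is clamped to the range [1, keyint / 2 + 1] when the encoder validates its parameters. This is an assumption about x264 internals, and the helper names below are hypothetical; the sketch only reproduces the arithmetic.

```c
#include <assert.h>

/* Clamp v into the closed range [lo, hi]. */
int clip3(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Assumed x264 behavior: min-keyint is limited to keyint / 2 + 1,
 * so it can never be made equal to a keyint larger than 2.
 */
int effective_keyint_min(int requested_min, int keyint_max)
{
    assert(keyint_max >= 1);
    return clip3(requested_min, 1, keyint_max / 2 + 1);
}
```

With keyint=60 and min-keyint=60 this yields 60 / 2 + 1 = 31, the same value that appears as keyint_min in the x264 options line of the command output.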

But the message "forced frame type (3) at 141 was changed to frame type (1)" didn't show up. I'm looking at the x264 source code to try to understand this issue.

leandromoreira commented 4 years ago

Command line output:

$ ffmpeg -y -i small_bunny_1080p_60fps.mp4 -c:a copy -c:v libx264 -x264opts keyint=60:min-keyint=60:scenecut=-1 -preset fast b_1s_gop.mp4
ffmpeg version 4.2.1 Copyright (c) 2000-2019 the FFmpeg developers
  built with Apple clang version 11.0.0 (clang-1100.0.33.8)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.2.1_2 --enable-shared --enable-pthreads --enable-version3 --enable-avresample --cc=clang --host-cflags='-I/Library/Java/JavaVirtualMachines/adoptopenjdk-13.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/adoptopenjdk-13.jdk/Contents/Home/include/darwin -fno-stack-check' --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libmp3lame --enable-libopus --enable-librubberband --enable-libsnappy --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librtmp --enable-libspeex --enable-libsoxr --enable-videotoolbox --disable-libjack --disable-indev=jack
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
Guessed Channel Layout for Input Stream #0.1 : 5.1
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'small_bunny_1080p_60fps.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    title           : Big Buck Bunny, Sunflower version
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    composer        : Sacha Goedegebure
    encoder         : Lavf58.29.100
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    genre           : Animation
  Duration: 00:00:10.02, start: 0.000000, bitrate: 2631 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], 2228 kb/s, 60 fps, 60 tbr, 15360 tbn, 120 tbc (default)
    Metadata:
      handler_name    : GPAC ISO Video Handler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 393 kb/s (default)
    Metadata:
      handler_name    : GPAC ISO Audio Handler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
[libx264 @ 0x7ffa78804800] using SAR=1/1
[libx264 @ 0x7ffa78804800] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
[libx264 @ 0x7ffa78804800] profile High, level 4.2
[libx264 @ 0x7ffa78804800] 264 - core 155 r2917 0a84d98 - H.264/MPEG-4 AVC codec - Copyleft 2003-2018 - http://www.videolan.org/x264.html - options: cabac=1 ref=2 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=6 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=1 keyint=60 keyint_min=31 scenecut=0 intra_refresh=0 rc_lookahead=30 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'b_1s_gop.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    title           : Big Buck Bunny, Sunflower version
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    composer        : Sacha Goedegebure
    genre           : Animation
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    encoder         : Lavf58.29.100
    Stream #0:0(und): Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 60 fps, 15360 tbn, 60 tbc (default)
    Metadata:
      handler_name    : GPAC ISO Video Handler
      encoder         : Lavc58.54.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 393 kb/s (default)
    Metadata:
      handler_name    : GPAC ISO Audio Handler
frame=  600 fps= 20 q=-1.0 Lsize=    4039kB time=00:00:09.98 bitrate=3314.2kbits/s speed=0.335x
video:3540kB audio:482kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.427825%
[libx264 @ 0x7ffa78804800] frame I:10    Avg QP:18.63  size:132437
[libx264 @ 0x7ffa78804800] frame P:158   Avg QP:21.70  size:  9399
[libx264 @ 0x7ffa78804800] frame B:432   Avg QP:25.79  size:  1887
[libx264 @ 0x7ffa78804800] consecutive B-frames:  1.8%  3.7%  8.5% 86.0%
[libx264 @ 0x7ffa78804800] mb I  I16..4: 17.9% 70.6% 11.5%
[libx264 @ 0x7ffa78804800] mb P  I16..4:  1.3%  2.3%  0.2%  P16..4: 20.1%  3.6%  1.5%  0.0%  0.0%    skip:71.0%
[libx264 @ 0x7ffa78804800] mb B  I16..4:  0.3%  0.3%  0.0%  B16..8:  7.9%  0.6%  0.0%  direct: 1.7%  skip:89.2%  L0:47.2% L1:49.6% BI: 3.2%
[libx264 @ 0x7ffa78804800] 8x8 transform intra:64.9% inter:82.8%
[libx264 @ 0x7ffa78804800] coded y,uvDC,uvAC intra: 48.8% 69.4% 32.7% inter: 2.1% 3.7% 0.2%
[libx264 @ 0x7ffa78804800] i16 v,h,dc,p: 13% 41%  9% 38%
[libx264 @ 0x7ffa78804800] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 18% 21% 19%  6%  6%  6%  7%  9%  8%
[libx264 @ 0x7ffa78804800] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 24% 19% 11%  7%  8%  7%  7% 10%  6%
[libx264 @ 0x7ffa78804800] i8c dc,h,v,p: 47% 30% 14%  9%
[libx264 @ 0x7ffa78804800] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7ffa78804800] ref P L0: 73.4% 26.6%
[libx264 @ 0x7ffa78804800] ref B L0: 87.6% 12.4%
[libx264 @ 0x7ffa78804800] ref B L1: 96.4%  3.6%
[libx264 @ 0x7ffa78804800] kb/s:2899.