lisamelton / video_transcoding

Tools to transcode, inspect and convert videos.
MIT License

Proposal to add an average variable bitrate (AVBR) ratecontrol system for x264 #248

Closed lisamelton closed 5 years ago

lisamelton commented 5 years ago

Proposal to add an average variable bitrate (AVBR) ratecontrol system for x264

In my never-ending quest to explore video transcoding, I've decided to revisit a disparaged ratecontrol system. In fact, it's the same system that I used in my original Bash-based scripts published back in 2014. So I'm the one who disparaged and then abandoned it. :)

That old ratecontrol system was based on the average bitrate (ABR) system built into the x264 encoder, just like the --abr option in transcode-video.

However, instead of constraining the maximum bitrate to reduce fluctuations like my --abr implementation does, it raised tolerance of missing the average bitrate to the maximum amount, disabling overflow detection completely.

So while the final bitrate would still be somewhat near the target, it allowed ABR much more variability to render complex scenes, making it behave more like a CRF-based encode. Unfortunately, that variability could also starve other quieter passages of needed bitrate, causing blockiness, color banding and other artifacts. Yikes!

But I've now figured out a very simple fix for all those quality problems. And the results are quite stunning.

The bad news is that both this old ratecontrol system and the new fix require the x264 encoder. Which means it's not available for HEVC/H.265 output.

The good news is that it works really well with the --quick or even --target small options and arguments.

Here's how I would describe the option to enable this new ratecontrol system within the built-in --help output of transcode-video:

    --x264-avbr     use average variable bitrate (AVBR) ratecontrol
                      (size near target with different quality than default)
                      (only available with x264 and x264_10bit encoders)

Why --x264-avbr? Well, I had to call it something. :) And I wanted to make sure it's not confused with --abr or the deprecated --vbr, and that its being x264-only is very clear.

But I'm open to a better name! So suggestions are welcome.

Since this new implementation is still a proposal, you can try it now using a wrapper script:

https://gist.github.com/donmelton/faf0fc60932fdcce8338bad7e909788f

Download the script and make sure it's executable and in your $PATH. Then use it like this:

experimental-x264-avbr-transcode-video.sh "/path/to/Movie.mkv"

You can also try it without the script by passing these options and arguments on the command line like this:

transcode-video \
    --abr \
    --encoder-option '_vbv-maxrate' \
    --encoder-option '_vbv-bufsize' \
    --encoder-option '_nal-hrd' \
    --encoder-option 'ratetol=inf' \
    --encoder-option 'mbtree=0' \
    "/path/to/Movie.mkv"

And to document the voodoo that I'm using to make "AVBR" work, here are the option-by-option details:

It's crazy, but it works!
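For anyone who would rather not retype the long command above, the option list can be kept in a small shell function. This is only a minimal sketch and is not the contents of the gist linked earlier; the function name `avbr_transcode` and the `DRY_RUN` switch are made up for illustration.

```shell
#!/bin/sh
# avbr_transcode FILE...
# Hypothetical helper: runs transcode-video with the experimental "AVBR"
# options from the command above. When DRY_RUN is set to a non-empty value,
# the command line is printed instead of being executed.
avbr_transcode() {
    ${DRY_RUN:+echo} transcode-video \
        --abr \
        --encoder-option '_vbv-maxrate' \
        --encoder-option '_vbv-bufsize' \
        --encoder-option '_nal-hrd' \
        --encoder-option 'ratetol=inf' \
        --encoder-option 'mbtree=0' \
        "$@"
}
```

Calling `DRY_RUN=1 avbr_transcode "/path/to/Movie.mkv"` shows what would be run without starting a transcode.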

Anyway, let me know what you think. Thanks.

samhutchins commented 5 years ago

I've not had a chance to do more testing since I saw this before, but I thought I'd put that feedback here as well:

I've done some comparisons with Prometheus, specifically looking at the dream sequence and the minute or so just after it. I've been comparing 3 ratecontrol systems: your default special one, --abr, and this new experimental one.

In terms of quality, I think this is somewhere between default and --abr. The dream sequence itself looks pretty damn similar to my eye with all 3 systems, but the dark scene immediately following is where the differences show themselves. The default system handles it well, as expected. This experimental one has a little banding, but is fine otherwise. --abr is the worst though, with fairly obvious blocking and banding all over the frame. I think 2 pass is needed in this film for --abr to do an acceptable job.

One of the things I'm conscious of is peak bitrate, because I predominantly target a streaming environment through Plex. With that in mind, I wanted to try and measure the peak bitrate, but I'm not really aware of any tools to do this. My (far from perfect) approach has been to transcode the film using each of the 3 systems in turn, then use ffmpeg to extract the h264 bitstream and split it into 1 second chunks named by their timestamp. I do this with the following command:

ffmpeg -i input.mkv -map 0:v -c copy -f segment -segment_time 1 -break_non_keyframes 1 segments\seg%d.264

I can then sort by filesize and get a rough sense of where the bits get allocated. As measured by this approach, the default system has the highest peak bitrate, this experimental one is in second place, and abr has the lowest peak. All as expected really.

What I found interesting is where that peak is in the film: for the default system it's about 5 minutes in, as expected. It's also consistently high in that sequence, i.e. all the largest chunks are in that sequence. The experimental one is similar, but far less consistent. A lot of bits are also used in the first few seconds of the film, as well as a holographic scene about 30 minutes in. --abr is all over the place, loads of bits are used at the beginning and end, and there's not really much of a peak around 5 minutes.
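The sort-by-filesize step described here could be sketched in a few lines of shell. This is just an illustration of the idea, not the exact procedure used for these measurements: the function name `largest_segments` is made up, and the kbps figure relies on the assumption that every chunk holds about one second of video, so bytes × 8 / 1000 approximates the bitrate for that second.

```shell
#!/bin/sh
# largest_segments DIR [COUNT]
# Hypothetical helper: lists the COUNT largest .264 segment files in DIR
# (default 10), each with an approximate bitrate in kbps.
largest_segments() {
    dir=$1
    count=${2:-10}
    for f in "$dir"/*.264; do
        [ -f "$f" ] || continue
        bytes=$(wc -c < "$f")
        # bytes * 8 bits / 1000 = kilobits; one second per chunk gives kbps
        echo "$((bytes * 8 / 1000)) $(basename "$f")"
    done | sort -rn | head -n "$count"
}
```

For example, `largest_segments segments 20` would print the twenty biggest chunks, largest first. Since the chunk number roughly matches the second it starts at, the file names hint at where in the film the peaks are.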

So far I've not had any problems with using --abr though, and I've got the CPU power and patience to do 2 pass on the sources that need it and it's all worked out fine so far. I think I've reached an "if it ain't broke, don't fix it" stage in my transcoding journey, so I'm unlikely to start using this avbr system. But hey, I'm happy to be convinced anyway. I'll try and analyse at least one more film this week, but I'd say go ahead and put it in.

lisamelton commented 5 years ago

@samhutchins Thanks, Sam! And that is a clever peak bitrate detection scheme!

As I mentioned in our email, "Prometheus (2012)" was one of my test cases for this since it's so problematic with single-pass --abr. And my goal with these ratecontrol schemes is for them to be used in a single pass.

With the new hardware-based transcoders behaving just like --abr and being almost as good (or even better in a few cases), I thought it was time to rethink things and see what could be improved.

For example, the confetti scene in the third chapter of "The Great Gatsby (2013)" is much better with AVBR. As is the beginning of any episode of "Game of Thrones" with that damn HBO logo. Those were also test cases.

But I don't expect everyone to switch over to it. :)

samhutchins commented 5 years ago

Yeah, the newer generation of the hardware encoders seem pretty damn good. I've been very impressed with the quality of (what I presume is) Intel's QSV in the 2016 macbook pro I have access to. Perhaps I'll get a new GPU soon, with a faster and better hardware encoder than my GTX660.

I think if I had a PC with QSV (or if my GPU were faster than x264) I'd probably use that at this point. The speed is nice.

lisamelton commented 5 years ago

@samhutchins Be warned that the big problem with the QSV hardware encoder is that it's prone to color banding in some cases. For example, there are scenes in "Blade Runner 2049 (2017)" and "Arrival (2016)" where that's easily noticeable.

Also, neither the FFmpeg library nor VideoToolbox on macOS seems to allow access to two-pass modes, if there are any.

Speaking of two passes... I think I mentioned in our email thread that disabling Macroblock-tree ratecontrol built into x264 makes unconstrained ABR work very similarly to the constrained ABR I implemented with the --abr option. Meaning you can do something like this:

transcode-video \
    --abr \
    --encoder-option '_vbv-maxrate' \
    --encoder-option '_vbv-bufsize' \
    --encoder-option '_nal-hrd' \
    --encoder-option 'mbtree=0' \
    "/path/to/Movie.mkv"

...and get very good results. (Notice there's no ratetol=inf in that command.)

Why does this work?

Well, using mbtree=0 actually reduces compression efficiency and lowers the quality of I-frames, i.e. the complete-picture keyframes. But it raises the quality of P-frames, the forward-predicted partial frames which normally outnumber I-frames 20-to-1.

And because P-frames are higher quality, with less color banding and blockiness, they're larger, i.e. they have a higher bitrate. Because they're larger, they lower bitrate spikes elsewhere.

So, basically I'm lowering quality in order to raise it. :)

But what does this have to do with two passes?

Well, remember when I told you last year that using two passes with --abr is not a guarantee of quality? Especially with very dynamic content? The example I used was "Saving Private Ryan (1998)," particularly the dark "church scene" in Chapter 9. Because so much bitrate is used elsewhere in the movie, that scene is starved when you encode it with two passes and, really, it looks like ass. Even if you raise the bitrate target.

But if you disable the bitrate constraints and, more importantly, Macroblock-tree ratecontrol like this:

transcode-video \
    --abr \
    --encoder-option '_vbv-maxrate' \
    --encoder-option '_vbv-bufsize' \
    --encoder-option '_nal-hrd' \
    --encoder-option 'mbtree=0' \
    --handbrake-option 'two-pass' \
    --handbrake-option 'turbo' \
    "/path/to/Movie.mkv"

... then the "church scene" looks fine. Even using the default target of 6000 Kbps.

BTW, I wouldn't advise leaving the lowered VBV constraints if you also disable Macroblock-tree ratecontrol. Which is why I disabled them in that command.

Anyway, give it a try and let me know what you think.

elliotclowes commented 5 years ago

@donmelton As someone who always does a two-pass --abr this is all fascinating and I look forward to experimenting over the coming days with these settings.

@samhutchins By the way, if you play a file in VLC and click Window > Media Information > Statistics it will show you the 'Stream bitrate' as it plays. It doesn't seem to be super accurate, but it can give you a rough idea of the peaks and troughs.

lisamelton commented 5 years ago

@samhutchins BTW, since I'm not using the ratetol=inf trick in those single- and two-pass commands, that means the output is HRD compliant. So I could still signal Hypothetical Reference Decoder (HRD) information in metadata. Which makes --encoder-option '_nal-hrd' unnecessary.

Of course, the maximum bitrate will still be 25 Mbps, because transcode-video passes --encoder-level=4.0 to HandBrakeCLI for all 1080p content. Which means there is still a default VBV of 25000 for maxrate and 31250 for bufsize.
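Those two numbers are consistent with the H.264 level 4.0 limits scaled up for High profile. As a rough check, and assuming the base level 4.0 figures of 20000 kbps for maximum bitrate and 25000 kbit for buffer size along with a High profile factor of 1.25, the arithmetic works out like this (the function name is made up for the example):

```shell
#!/bin/sh
# Assumed base limits for H.264 level 4.0 (the Baseline/Main profile figures)
level_maxrate=20000   # kbps
level_bufsize=25000   # kbit

# Assumed High profile scaling factor of 1.25, written as 5/4 so the
# shell's integer arithmetic stays exact.
high_profile_vbv() {
    echo "$(( $1 * 5 / 4 ))"
}

echo "maxrate=$(high_profile_vbv "$level_maxrate") bufsize=$(high_profile_vbv "$level_bufsize")"
```

Run as-is, this prints `maxrate=25000 bufsize=31250`, matching the default VBV values mentioned above.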

lisamelton commented 5 years ago

@elliotclowes Great! Let me know what you think. You can spam me, @samhutchins and everyone else here with your results. Seriously, I love this kind of interaction!

lisamelton commented 5 years ago

Apparently Intel QSV, the hardware encoder on most Macs, has a ratecontrol method called "AVBR" (unfortunately not available through VideoToolbox). But, based on the description, it works damn near identical to what I'm doing with this proposal.

Which means the name I picked is a good choice. :) But I'm seriously thinking about just shortening the option from "--x264-avbr" to "--avbr" because it's possible that the x265 team may one day restore the rate tolerance API that they removed (for no reason that I can determine), thus allowing me to implement this for HEVC output.

So, what does everyone think about "--avbr" as an option name?

lisamelton commented 5 years ago

@elliotclowes BTW, that "Statistics" feature of VLC is great! Thanks for the tip.

lisamelton commented 5 years ago

OK, after using the bitrate detection mechanism suggested by @samhutchins and the VLC "Statistics" feature suggested by @elliotclowes, plus a little of my own hacking, I've come to the conclusion that even with strict HRD compliance, the x264 encoder seems to exceed the VBV maximum bitrate on occasion. But in all these cases, the encoded output was still playable on underpowered devices like the Roku. Of course, even non-HRD compliant output from my "AVBR" system is still playable on those devices.

But not applying any VBV at all can make x264 encoded output unplayable or stutter during playback.

The good news is that transcode-video always applies the VBV to x264 output whether maxrate is constrained or not. Again, this is because transcode-video sets the encoder level and HandBrakeCLI applies a default VBV anyway.

lisamelton commented 5 years ago

@elliotclowes and @samhutchins So, after some testing of my own with other movies besides "Saving Private Ryan (1998)," I've concluded that using two-pass unconstrained ABR ratecontrol with the Macroblock tree disabled... is not a good idea.

For some strange reason, the two-pass algorithm in x264 requires Macroblock-tree ratecontrol to handle certain textures during bitrate spikes.

But, oddly enough, single-pass unconstrained ABR with Macroblock-tree ratecontrol disabled works fine in all those cases where two-pass ABR glitches. Go figure.

What surprised me the most, though, was when I accidentally did a few transcodings using single-pass unconstrained ABR with Macroblock-tree ratecontrol enabled, and they looked fine. WTF!?

In other words, I used a command like this:

transcode-video \
    --abr \
    --encoder-option '_vbv-maxrate' \
    --encoder-option '_vbv-bufsize' \
    --encoder-option '_nal-hrd' \
    "/path/to/Movie.mkv"

It seems the x264 development team has significantly improved the raw ABR algorithm over the last few years. It's not perfect, of course, but it looks damn good for about 90% of what I've tried so far.

In fact, I'm probably going to add a double-secret, hidden --raw option to transcode-video just so I don't have to type all that crap in the future when I do more testing. :)
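To keep the three unconstrained-ABR variants tried in this thread straight, they can be written down as a small lookup. This is only a sketch for comparison and says nothing about how a --raw option might actually be implemented in transcode-video; the function name and the variant names `avbr`, `no-mbtree` and `raw` are made up here.

```shell
#!/bin/sh
# variant_options NAME
# Hypothetical helper: prints the transcode-video arguments used in this
# thread for one of three unconstrained-ABR variants.
variant_options() {
    # Common to all three: --abr with the VBV and HRD settings disabled
    common="--abr --encoder-option _vbv-maxrate --encoder-option _vbv-bufsize --encoder-option _nal-hrd"
    case $1 in
        # Proposed AVBR: maximum rate tolerance, Macroblock-tree off
        avbr)      echo "$common --encoder-option ratetol=inf --encoder-option mbtree=0" ;;
        # Unconstrained ABR with Macroblock-tree off
        no-mbtree) echo "$common --encoder-option mbtree=0" ;;
        # Raw ABR: nothing beyond removing the constraints
        raw)       echo "$common" ;;
        *)         echo "unknown variant: $1" >&2; return 1 ;;
    esac
}
```

None of the options contain spaces, so something like `transcode-video $(variant_options raw) "/path/to/Movie.mkv"` would work with ordinary word splitting.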

samhutchins commented 5 years ago

@donmelton undocumented options for the win!

I'm sorry I've not tested more films with --avbr, I've been busy all damn week. I'm glad you're able to do all this testing for us.

All this discussion has also shown me that I need to do some reading on vbv and hrd. In any case, as I said above, I'mma stick with 2-pass abr for now, open to switching to --avbr or default if I can convince myself it won't break anything in my streaming environment.

--avbr as an option name seems fine to me, --x264-avbr seems a little long. In any case, it needs more explaining than --abr, I think it's similar to --simple in that respect.

lisamelton commented 5 years ago

@samhutchins You are forgiven for having a real life. :)

As for my unreal life centered around this particular obsession... I've managed to transcode around 250 Blu-rays and a half dozen DVDs using --avbr so far. And I'm happy with the results. Nothing stands out as problematic yet. But you are wise to stick with ratecontrol methods you like and trust.

There is much to learn about the VBV and HRD compliance for me, too. After observing what x264 looks like it is doing, I need to spend more time digging into the actual source code of x264 to find out for sure.

And before I add --avbr (or even --raw) I think I may restructure the code to better switch between ratecontrol systems instead of adding another shitload of boolean flags. That needs a little cleanup.

khaosx commented 5 years ago

@donmelton We all appreciate your obsessiveness. I mean, we'd all be up that well-known creek if your obsession was golf.

My primary transcoding machine has been down for a week, but I expect to have it back running tomorrow. I've got a stack of files ready to run, so I'll start testing the --avbr settings and report back.

lisamelton commented 5 years ago

@khaosx LOL! :)

But yikes! I hope there's nothing seriously wrong with your hardware.

And thanks for testing!

khaosx commented 5 years ago

Nah, it's all in the name of science. I bought an Nvidia Quadro P2000 to start playing with hardware transcoding in Plex. I needed to reconfigure my server to build out the lab, so I had to use the transcoder as a backup. I'm just waiting on the Plex/Linux stars to align on that one, so I can now bring up the transcoder in its proper role.

tl;dr - I'm a big dumb dummy head who is trying to run too many projects in the home lab, and not accomplishing anything :)

lisamelton commented 5 years ago

@khaosx Then you are not alone with your strategy, sir. I've done similar things many times to my own setup. :)

It's funny you mention hardware transcoding with Plex. I just recently enabled that on one of my Plex servers (an older macOS box) and it worked extremely well. Now I'm kicking myself for not realizing I could have done that months ago.

lisamelton commented 5 years ago

OK, the changes are in to support this proposal. But first I did that housekeeping I described to @samhutchins earlier:

Then it was time to actually implement AVBR:

And finally a little undocumented testing API for me and people in the know, like all of you:

I'll probably release all this and Sam's nifty --mixdown option sometime next week after more people get a chance to test AVBR. Let's not be too hasty, after all.

khaosx commented 5 years ago

@donmelton Damn hoss...this looks GREAT. I've done a few test runs so far, the most impressive being chapter 13 of Red State. Lots of different action levels, jumpy camera work and cuts. I'll post back if I see anything out of whack (got another batch of 15 to do today), but it looks like a big improvement so far.

lisamelton commented 5 years ago

@khaosx Thanks for testing and I'm so glad you like it!

lisamelton commented 5 years ago

Here's the text I plan on appending to the "Explanation" section of the README document:


How my average variable bitrate (AVBR) ratecontrol system works

My average variable bitrate (AVBR) ratecontrol system, selected via the --avbr option, is also based on the ABR algorithm already within x264 which targets a specific bitrate.

But the maximum bitrate is not constrained like it is in my ABR system.

Instead, the tolerance of missing the average bitrate is raised to the maximum amount, disabling overflow detection completely. This makes the ABR algorithm behave much more like a CRF-based encode, so final bitrates can be 10-15% higher or lower than the target.

And to prevent bitrates from getting too low, the Macroblock-tree ratecontrol system built into x264 is disabled. While this does lower compression efficiency somewhat, it significantly reduces blockiness, color banding and other artifacts.

Unfortunately, these modifications to implement AVBR are not possible when using x265.
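To put the 10-15% figure in concrete terms, here is a quick back-of-the-envelope calculation. It is only a sketch: the function name `avbr_range` is made up, and it simply applies the stated percentage to a target rather than predicting what any particular encode will do.

```shell
#!/bin/sh
# avbr_range TARGET_KBPS PERCENT
# Hypothetical helper: prints the low-high range of final bitrates implied
# by a given target and a plus-or-minus percentage.
avbr_range() {
    target=$1
    pct=$2
    low=$(( target * (100 - pct) / 100 ))
    high=$(( target * (100 + pct) / 100 ))
    echo "$low-$high"
}

avbr_range 6000 15
```

With the 6000 Kbps target used for 1080p content and the 15% end of the range, this prints `5100-6900`, so a final bitrate anywhere in that window would be unsurprising.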


Let me know what you think.

lisamelton commented 5 years ago

Although I don't have a copy of "Red State (2011)" like @khaosx, I've now transcoded about 300 Blu-ray rips with AVBR and the --quick option. Including some really difficult videos like:

And I'm not seeing any issues!

But I haven't transcoded many DVD rips yet since I don't keep many around anymore. Has anyone else tried AVBR on DVD-sized content?

Anyway, I'm getting real close to releasing this. Maybe later today or tomorrow.

Last call to @samhutchins, @elliotclowes, @khaosx, @Mattrak, @ttyso or anyone else for additional comments. And thanks for all the feedback so far!

lisamelton commented 5 years ago

Can't believe I forgot to do this:

Plus, transcode-video will now fail with a useful error message when using x265 or its brethren.

khaosx commented 5 years ago

@donmelton I haven't tested any DVDs, but for BluRay content, thumbs up from me. I've tested it across my entire workflow - video quality is improved, no unintended consequences with sound or subtitles. Thunderbirds are GO!

lisamelton commented 5 years ago

@khaosx LOL! I was so hoping one of you would use that exact quote to endorse it. :)

samhutchins commented 5 years ago

I don't have much in the way of DVD content any more, but what little I have looks good. Thumbs up from me

lisamelton commented 5 years ago

@samhutchins Thanks! I need to start ripping some old DVDs again. :)

lisamelton commented 5 years ago

OK, gang, this feature has been released in version 0.23.0. Thanks again for all the feedback and testing!

khaosx commented 5 years ago

Da-da de-de da-da da-da da-HOCUS CADABRA!!!! I've necro'd the thread!

@donmelton and everyone else:

I just re-encoded seasons 1 and 2 of Game of Thrones, and they are flawless with the new system. I've done probably 60 movies so far, and they're great, but as noted before, the opening static fade for all HBO shows is usually a dead giveaway for cluster-fuckery in the system. Not anymore! Nice work, Big Daddy Don!

lisamelton commented 5 years ago

@khaosx You are very welcome, Kristopher! I'm so glad you like AVBR and thanks for that feedback!!!

I've redone my whole video collection with it so I agree with your evaluation.

Now, you're gonna hate this, but... :)

I found a way to actually improve AVBR, specifically for the case of the video immediately following that damn HBO logo (and for that dark scene in "Prometheus (2012)" that @samhutchins loves to test).

And I was in the middle of writing an issue describing that "one weird trick" (my name for it :) to ask for testing and feedback on it when I got your notification.

Seriously.

I can tell you that technique now or, if you want, you can wait for the whole writeup. Because you were definitely on my list of folks to "cc" when it's ready.

khaosx commented 5 years ago

> I can tell you that technique now or, if you want, you can wait for the whole writeup. Because you were definitely on my list of folks to "cc" when it's ready.

I'll wait for the whole write up, but only because I have family obligations all weekend. Looking forward to whatever voodoo you've got cooked up!