Closed IanButterworth closed 3 years ago
Looking at each crf=32 video two differences stand out.
250251 kb/s
-> 237978 kb/s
1200k tbn
-> 12288 tbn
Any idea what's changed @galenlynch? The default encoding settings going from 12s to 68s is a bit strong, so good if we can avoid it
v0.8.4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/ian/video_v08.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.45.100
Duration: 00:00:04.17, start: 0.000000, bitrate: 250237 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1536x2048, 250251 kb/s, 24 fps, 24 tbr, 1200k tbn, 48 tbc (default)
Metadata:
handler_name : VideoHandler
At least one output file must be specified
v0.9.0-DEV
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/ian/video_v09.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.45.100
Duration: 00:00:04.17, start: 0.000000, bitrate: 237962 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1536x2048, 237978 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
Metadata:
handler_name : VideoHandler
oof that's unfortunate. I haven't noticed anything similar to this. I normally use lossy encoding.
Oh I might have an idea of what's happening, though I haven't done much testing myself so it could be wrong.
I haven't tracked it down, but I think this is related to #283. In your lossy example for both versions you didn't specify the color range of the video, so the ffmpeg default is to use the mpeg color range (limited) and not the jpeg (full). Previously, any input values outside of the limited range would simply be clipped (which I think is "wrong"). In the new version of the code, if you don't specify the input color range to VIdeoIO it will assume you're using the full color range, and rescale your inputs to the limited color range to avoid clipping. If you specify either input_colorspace_details
or color_range
in the new version you can avoid this automatic rescaling.
Some simple testing suggests this is the case. Both of the following are on the new version:
@benchmark VideoIO.encode_mux_video("video_new.mp4", imgstack; encoder_settings = (crf = 23, preset = "medium"))
BenchmarkTools.Trial:
memory estimate: 21.88 GiB
allocs estimate: 1153638937
--------------
minimum time: 57.333 s (6.07% GC)
median time: 57.333 s (6.07% GC)
mean time: 57.333 s (6.07% GC)
maximum time: 57.333 s (6.07% GC)
--------------
samples: 1
evals/sample: 1
Here I specify the output color_range to be 2 (aka jpeg, full) to avoid the automatic rescaling:
@benchmark VideoIO.encode_mux_video("video_new.mp4", imgstack; encoder_settings = (color_range = 2, crf = 23, preset = "medium"))
BenchmarkTools.Trial:
memory estimate: 1.19 KiB
allocs estimate: 38
--------------
minimum time: 14.814 s (0.00% GC)
median time: 14.814 s (0.00% GC)
mean time: 14.814 s (0.00% GC)
maximum time: 14.814 s (0.00% GC)
--------------
samples: 1
evals/sample: 1
So I think this is the unfortunate consequence of not "damaging" the input during encoding more than you need to. You can always opt out by either using the expanded color range in the encoding, or telling VideoIO that your inputs are already in the limited color range (and if they're not that you're ok with just clipping the values outside of that range).
A side note on performance: it's best to use scanline_major = true
where possible, assuming you can generate the data that way.
Just to clarify, the new version will rescale only if there's a mismatch between the input color range (matrices that the user supplies) and the encoding color_range
. In your example there was, so everything had to be rescaled prior to encoding, causing the slowdown.
There's clearly something wrong with the rescaling code's performance. The allocations are ridiculous. I can track down the problem: very likely type instability.
Oh yeah, I'm remembering thinking about this when I wrote it. The way that gray values are currently rescaled has some inherent type instability in it. Fixing it won't be low-hanging fruit, unfortunately.
The relevant function is here: https://github.com/JuliaIO/VideoIO.jl/blob/b6790fa391526563c8f3678b3dd921057cefe569/src/frame_graph.jl#L153-L159
Right now the source and destination types for the rescaling function are inherently unstable since they're based on enums from ffmpeg. We could lock them in when the writer object is made to reduce the type instability, but that would require a bit of "doing," and reduce flexibility. Alternatively we could get rid of this whole gray-rescaling stuff and try to use ffmpeg's sws_scale for this. I was having a fair amount of difficulty getting sws_scale to accurately rescale monochrome input, so wrote this gray recsaling stuff out of frustration. I figured at the time that getting the "right" results was worth the slowdown, and didn't want to spend a lot of time making it hyper-optimized or coercing sws_scale to do what I wanted since I had already put so much time into the PR, and figured it could be left for later.
Don't know what to do here. It would take some effort to make this faster, and I probably won't be able to get to it that soon. I personally still feel like the accuracy outweighs the slowdown, but it's a subjective tradeoff.
Actually, it wouldn't be that hard to make the existing rescaling code that performant by locking in the types when the object is created, but I remember thinking it was a bad idea, though I can't remember why...
Quick thought. Are there existing enums we can provide for color_range
, or shall we create some, to make it clearer what the default is/options are?
That's down to the way ffmpeg is wrapped with Clang.jl... color_range
is either 1 or 2, but it's defined in ffmpeg and I don't think we should redefine it. However, more recent versions of Clang.jl make C enums into Julia cenum
s, which IMO are much easier to use. I spent some time trying to generate new bindings for FFMPEG, since the current gen code no longer works, but ended up getting too busy to see it through to completion.
Also I don't think that would solve the type instability, even though it would be easier to use. At the end of the day the source and destination types are determined by looking at integers stored in FFMPEG's AVFrame structs, and seeing if it's a 1 or 2.
Whoops, I got mixed up... the relevant enum for the source and destination types is not the color_range
, but instead the format
field which is a AV_PIX_FMT
enum.
One more thing... here color_range
is just like any other option passed to ffmpeg on the command line. You could equivalently used encoder_options = (color_range = "jpeg",)
and it should still work, since ffmpeg is parsing the encoder options and not VideoIO. I think that maximizes flexibility of VideoIO and makes it easier to use for people familiar with ffmpeg, and also allows us to lean on ffmpeg's documentation more. The internal color_range
enum isn't actually exposed to the user anywhere.
Given the default for input_colorspace_details
is to assume full color range
https://github.com/JuliaIO/VideoIO.jl/blob/76a0f64db04693392b1f48bd8f491a5a62124513/src/encoding.jl#L442-L449
I think it's ok for us to switch to a default of color_range = "jpeg"
and make that clear in the docs/changelog. i.e. something like "VideoIO now defaults to full color range, and assumes the input is full color range"
Basically it would make VideoIO more focused on numerical accuracy, than perceptual compression efficiency.
But that requires a little namedtuple/dict fanangaling as it's not simply a kwarg that we can set a default for.
That would buy us time to optimize the scaling function, by making it not invoked by default/most use cases.
A lot of video players don't accept jpeg color range videos.
ok. I think a narrative of this for the changelog would be ok:
The default encoder settings got a little (x%) slower, because we now make the assumption that 1) the input data is full color range, 2) you want a video that will play in most video players. Therefore a color space transform will take place to compress the full color range input, to the limited "mpeg" color space.
encoder_options=(color_range=2,)
, but note that your video may not play in some video playersAs long as we can make the default only a little slower. I haven't delved into the code for the scaling yet, but may have time this weekend to do so.
By the way, do you use the julia slack? It might be good for us to chat more informally on there about this stuff
I... might have an account? I'll try to dig it up.
Guidance now added to CHANGELOG.md on how to handle/avoid this
@galenlynch Am I doing something wrong, or do we need to look into this before v0.9?
Lossless goes from 4s to 2s 👍🏻 Default lossy goes from 12s to 68s 🤔, and file size is repeatably smaller.. is v0.9 trying harder, and some hidden setting is different?
For
imgstack = map(x-> rand(UInt8, 2048, 1536), 1:100)
v0.8.4
v0.9.0-dev
Note that I see the same when setting
encoder_private_options
insteadSide note.. I love how quiet v0.9 is