Slow lossy encoding - Githubissues

IanButterworth commented 3 years ago

@galenlynch Am I doing something wrong, or do we need to look into this before v0.9?

Lossless goes from 4s to 2s 👍🏻. Default lossy goes from 12s to 68s 🤔, and the file size is repeatably smaller. Is v0.9 trying harder, or is some hidden setting different?

For imgstack = map(x-> rand(UInt8, 2048, 1536), 1:100)

v0.8.4

julia> @time encodevideo("video.mp4", imgstack, AVCodecContextProperties = [:color_range=>2, :priv_data => ("crf"=>"0","preset"=>"ultrafast")])
Progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:01
[ Info: Video file saved: /Users/ian/video.mp4
frame=  100 fps= 40 q=-1.0 Lsize=  480574kB time=00:00:04.12 bitrate=954382.6kbits/s speed=1.66x    81x    
[ Info: video:480572kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000459%
  4.006610 seconds (2.21 k allocations: 340.594 KiB)
"video.mp4"

julia> Base.format_bytes(stat("video.mp4").size)
"469.310 MiB"

julia> @time encodevideo("video.mp4", imgstack, AVCodecContextProperties = [:priv_data => ("crf"=>"23","preset"=>"medium")])
Progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| Time: 0:00:05
[ Info: Video file saved: /Users/ian/video.mp4
[ Info: frame=  100 fps=0.0 q=-1.0 Lsize=  127983kB time=00:00:04.04 bitrate=259405.5kbits/s speed=15.7x    
[ Info: video:127981kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.001720%
 11.679009 seconds (2.38 k allocations: 354.469 KiB)
"video.mp4"

julia> Base.format_bytes(stat("video.mp4").size)
"124.984 MiB"

v0.9.0-dev

julia> @time VideoIO.save("video.mp4", imgstack, encoder_options = (color_range=2,crf=0,preset="ultrafast"))
  1.998095 seconds (445 allocations: 7.219 KiB)

julia> Base.format_bytes(stat("video.mp4").size)
"469.316 MiB"

julia> @time VideoIO.save("video.mp4", imgstack, encoder_options = (crf=23, preset="medium"))
 68.076069 seconds (1.15 G allocations: 21.878 GiB, 4.07% gc time)

julia> Base.format_bytes(stat("video.mp4").size)
"118.097 MiB"

Note that I see the same when setting encoder_private_options instead.

Side note.. I love how quiet v0.9 is

IanButterworth commented 3 years ago

Looking at each crf=23 video, two differences stand out.

250251 kb/s -> 237978 kb/s
1200k tbn -> 12288 tbn

Any idea what's changed @galenlynch? The default encoding settings going from 12s to 68s is a big regression, so it would be good if we can avoid it.

v0.8.4

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/ian/video_v08.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.45.100
  Duration: 00:00:04.17, start: 0.000000, bitrate: 250237 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1536x2048, 250251 kb/s, 24 fps, 24 tbr, 1200k tbn, 48 tbc (default)
    Metadata:
      handler_name    : VideoHandler
At least one output file must be specified

v0.9.0-DEV

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/ian/video_v09.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.45.100
  Duration: 00:00:04.17, start: 0.000000, bitrate: 237962 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1536x2048, 237978 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
    Metadata:
      handler_name    : VideoHandler

galenlynch commented 3 years ago

oof that's unfortunate. I haven't noticed anything similar to this. I normally use lossy encoding.

galenlynch commented 3 years ago

Oh I might have an idea of what's happening, though I haven't done much testing myself so it could be wrong.

I haven't tracked it down, but I think this is related to #283. In your lossy example for both versions you didn't specify the color range of the video, so the ffmpeg default is to use the mpeg color range (limited) and not the jpeg color range (full). Previously, any input values outside of the limited range would simply be clipped (which I think is "wrong"). In the new version of the code, if you don't specify the input color range to VideoIO it will assume you're using the full color range, and rescale your inputs to the limited color range to avoid clipping. If you specify either input_colorspace_details or color_range in the new version you can avoid this automatic rescaling.
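
To make the clipping versus rescaling distinction concrete, here is a minimal sketch for 8-bit gray values. It assumes the conventional limited ("mpeg") luma range of 16 to 235. The function names are hypothetical, and this is not VideoIO's actual implementation.

```julia
# Hypothetical helpers for illustration; not VideoIO's code.
# The limited ("mpeg") range for 8-bit luma is conventionally 16..235.
const LIMITED_MIN = 16
const LIMITED_MAX = 235

# Old behavior as described above: values outside the limited range are clipped.
clip_to_limited(x::UInt8) = UInt8(clamp(Int(x), LIMITED_MIN, LIMITED_MAX))

# New behavior: the full 0..255 range is mapped linearly onto 16..235.
function rescale_to_limited(x::UInt8)
    scaled = LIMITED_MIN + Int(x) * (LIMITED_MAX - LIMITED_MIN) / 255
    return round(UInt8, scaled)
end

frame = UInt8[0x00, 0x10, 0x80, 0xff]
clipped = clip_to_limited.(frame)      # distinct dark and bright inputs collapse at the ends
rescaled = rescale_to_limited.(frame)  # every input keeps a distinct output
```

With clipping, the inputs 0 and 16 both end up at 16, so detail at the extremes is lost. With rescaling they stay distinct, at the cost of touching every pixel.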

Some simple testing suggests this is the case. Both of the following are on the new version:

@benchmark VideoIO.encode_mux_video("video_new.mp4", imgstack; encoder_settings = (crf = 23, preset = "medium"))
BenchmarkTools.Trial: 
  memory estimate:  21.88 GiB
  allocs estimate:  1153638937
  --------------
  minimum time:     57.333 s (6.07% GC)
  median time:      57.333 s (6.07% GC)
  mean time:        57.333 s (6.07% GC)
  maximum time:     57.333 s (6.07% GC)
  --------------
  samples:          1
  evals/sample:     1

Here I specify the output color_range to be 2 (aka jpeg, full) to avoid the automatic rescaling:

@benchmark VideoIO.encode_mux_video("video_new.mp4", imgstack; encoder_settings = (color_range = 2, crf = 23, preset = "medium"))
BenchmarkTools.Trial: 
  memory estimate:  1.19 KiB
  allocs estimate:  38
  --------------
  minimum time:     14.814 s (0.00% GC)
  median time:      14.814 s (0.00% GC)
  mean time:        14.814 s (0.00% GC)
  maximum time:     14.814 s (0.00% GC)
  --------------
  samples:          1
  evals/sample:     1

So I think this is the unfortunate consequence of not "damaging" the input during encoding more than you need to. You can always opt out by either using the expanded color range in the encoding, or telling VideoIO that your inputs are already in the limited color range (and if they're not that you're ok with just clipping the values outside of that range).

A side note on performance: it's best to use scanline_major = true where possible, assuming you can generate the data that way.

galenlynch commented 3 years ago

Just to clarify, the new version will rescale only if there's a mismatch between the input color range (matrices that the user supplies) and the encoding color_range. In your example there was, so everything had to be rescaled prior to encoding, causing the slowdown.
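
As a rough sketch of that rule (hypothetical names, not the actual VideoIO logic), using the 1 = limited/mpeg and 2 = full/jpeg values mentioned in this thread:

```julia
# Hypothetical constants and helper; the values follow the thread
# (1 = limited "mpeg" range, 2 = full "jpeg" range).
const RANGE_MPEG = 1
const RANGE_JPEG = 2

# Rescaling is only needed when the input range and the encoding range differ.
needs_rescale(input_range::Integer, encoding_range::Integer) =
    input_range != encoding_range

# Full-range input encoded with the default limited range: slow path.
default_case = needs_rescale(RANGE_JPEG, RANGE_MPEG)
# Full-range input encoded as full range (color_range = 2): no rescaling.
full_to_full = needs_rescale(RANGE_JPEG, RANGE_JPEG)
# Input declared as already limited, encoded as limited: no rescaling.
limited_to_limited = needs_rescale(RANGE_MPEG, RANGE_MPEG)
```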

galenlynch commented 3 years ago

There's clearly something wrong with the rescaling code's performance. The allocations are ridiculous. I can track down the problem: very likely type instability.

galenlynch commented 3 years ago

Oh yeah, I'm remembering thinking about this when I wrote it. The way that gray values are currently rescaled has some inherent type instability in it. Fixing it won't be low-hanging fruit, unfortunately.

The relevant function is here: https://github.com/JuliaIO/VideoIO.jl/blob/b6790fa391526563c8f3678b3dd921057cefe569/src/frame_graph.jl#L153-L159

Right now the source and destination types for the rescaling function are inherently unstable since they're based on enums from ffmpeg. We could lock them in when the writer object is made to reduce the type instability, but that would require a bit of "doing," and reduce flexibility. Alternatively we could get rid of this whole gray-rescaling stuff and try to use ffmpeg's sws_scale for this. I was having a fair amount of difficulty getting sws_scale to accurately rescale monochrome input, so wrote this gray rescaling stuff out of frustration. I figured at the time that getting the "right" results was worth the slowdown, and didn't want to spend a lot of time making it hyper-optimized or coercing sws_scale to do what I wanted since I had already put so much time into the PR, and figured it could be left for later.
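
A minimal sketch of what "locking in" the types could look like, assuming hypothetical format codes and names (this is not the VideoIO code, and the real enum values differ). The idea is to do the integer-to-type lookup once, when the object is built, and store the result as a type parameter so the per-frame call is type-stable.

```julia
# Hypothetical pixel-format codes, standing in for integers read from a C struct.
const GRAY8 = 1
const GRAY16 = 2

# The result type depends on a runtime value, so the compiler cannot infer it
# from the argument types alone.
pixel_type(code::Integer) = code == GRAY8 ? UInt8 : UInt16

# Unstable version: the destination type is looked up again on every call,
# so the return type is a Union of two array types.
function convert_unstable(src::AbstractArray, dst_code::Integer)
    T = pixel_type(dst_code)
    return T.(src)
end

# "Locked in" version: the destination type becomes a type parameter.
struct Converter{D} end

# The only type-unstable step; it runs once, at construction time.
Converter(dst_code::Integer) = Converter{pixel_type(dst_code)}()

# Per-frame call: D is known at compile time, so this method is type-stable.
(c::Converter{D})(src::AbstractArray) where {D} = D.(src)

to_gray16 = Converter(GRAY16)
widened = to_gray16(UInt8[0, 128, 255])
```

The tradeoff mentioned above still applies: once the converter is built, its destination type cannot change without building a new one.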

Don't know what to do here. It would take some effort to make this faster, and I probably won't be able to get to it that soon. I personally still feel like the accuracy outweighs the slowdown, but it's a subjective tradeoff.

galenlynch commented 3 years ago

Actually, it wouldn't be that hard to make the existing rescaling code performant by locking in the types when the object is created, but I remember thinking it was a bad idea, though I can't remember why...

IanButterworth commented 3 years ago

Quick thought. Are there existing enums we can provide for color_range, or shall we create some, to make it clearer what the default is/options are?

galenlynch commented 3 years ago

That's down to the way ffmpeg is wrapped with Clang.jl... color_range is either 1 or 2, but it's defined in ffmpeg and I don't think we should redefine it. However, more recent versions of Clang.jl make C enums into Julia cenums, which IMO are much easier to use. I spent some time trying to generate new bindings for FFMPEG, since the current gen code no longer works, but ended up getting too busy to see it through to completion.

galenlynch commented 3 years ago

Also I don't think that would solve the type instability, even though it would be easier to use. At the end of the day the source and destination types are determined by looking at integers stored in FFMPEG's AVFrame structs, and seeing if it's a 1 or 2.

galenlynch commented 3 years ago

Whoops, I got mixed up... the relevant enum for the source and destination types is not the color_range, but instead the format field, which is an AV_PIX_FMT enum.

galenlynch commented 3 years ago

One more thing... here color_range is just like any other option passed to ffmpeg on the command line. You could equivalently use encoder_options = (color_range = "jpeg",) and it should still work, since ffmpeg is parsing the encoder options and not VideoIO. I think that maximizes flexibility of VideoIO and makes it easier to use for people familiar with ffmpeg, and also allows us to lean on ffmpeg's documentation more. The internal color_range enum isn't actually exposed to the user anywhere.

IanButterworth commented 3 years ago

Given the default for input_colorspace_details is to assume full color range https://github.com/JuliaIO/VideoIO.jl/blob/76a0f64db04693392b1f48bd8f491a5a62124513/src/encoding.jl#L442-L449

I think it's ok for us to switch to a default of color_range = "jpeg" and make that clear in the docs/changelog. i.e. something like "VideoIO now defaults to full color range, and assumes the input is full color range"

Basically it would make VideoIO more focused on numerical accuracy than on perceptual compression efficiency.

But that requires a little namedtuple/dict finagling as it's not simply a kwarg that we can set a default for.
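
One way that finagling could look, as a sketch only: Base's `merge` on NamedTuples lets later arguments override earlier ones, so a default can be placed first. The constant and helper names here are hypothetical, and the default value is just the proposal from this comment.

```julia
# Hypothetical default; it is only proposed in this thread, not adopted.
const DEFAULT_ENCODER_OPTIONS = (color_range = 2,)

# `merge` keeps fields from the first NamedTuple unless a later one
# supplies the same field, so user options override the defaults.
with_defaults(user_options::NamedTuple) =
    merge(DEFAULT_ENCODER_OPTIONS, user_options)

# The user did not mention color_range: the default is filled in.
filled = with_defaults((crf = 23, preset = "medium"))

# The user asked for the limited range explicitly: their value wins.
overridden = with_defaults((color_range = 1, crf = 23))
```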

That would buy us time to optimize the scaling function, by making it not invoked by default/most use cases.

galenlynch commented 3 years ago

A lot of video players don't accept jpeg color range videos.

IanButterworth commented 3 years ago

ok. I think a narrative of this for the changelog would be ok:

The default encoder settings got a little (x%) slower, because we now make the assumption that 1) the input data is full color range, 2) you want a video that will play in most video players. Therefore a color space transform will take place to compress the full color range input to the limited "mpeg" color space.

You have a few options to get faster encoding: 1) Specify that the input data is already scaled to the limited range (any values outside the range will be clipped) by setting... 2) Specify that you want to generate a full color range video by setting `encoder_options=(color_range=2,)`, but note that your video may not play in some video players

As long as we can make the default only a little slower. I haven't delved into the code for the scaling yet, but may have time this weekend to do so.

By the way, do you use the Julia Slack? It might be good for us to chat more informally on there about this stuff.

galenlynch commented 3 years ago

I... might have an account? I'll try to dig it up.

IanButterworth commented 3 years ago

Guidance now added to CHANGELOG.md on how to handle/avoid this

JuliaIO / VideoIO.jl

Slow lossy encoding #308
