in03 / proxima

Transcode source media directly from DaVinci Resolve using multiple machines for encoding. Great for creating proxies quickly.
MIT License

fix: Data levels issues #209

Closed github-actions[bot] closed 2 years ago

github-actions[bot] commented 2 years ago

Problem

I'm a little confused about this. Not sure what's going on.

If data levels are set to auto (interpreted as "full" I guess), proxies are still being exported with limited range and they appear to have much more contrast; crushed blacks and blown-out whites. It doesn't happen for proxy media generated internally.

Add support

If we can get ffmpeg to encode with the correct data levels, we still need to know what the input levels are. Since clip properties returned through the api can have levels set to auto, we'll have to alert the user that they should probably set the clips explicitly to "full" or "video".

ProRes support hurdles

There's also a concern about using ProRes. ProRes doesn't have a flag to tell the player how the colour data should be interpreted. I believe Resolve 18 allows for ProRes proxies, so maybe we should see how they're handling it? Some additional custom metadata? Some internal forced level interpretation? A new implementation for 18 altogether?

If special handling is required for some codecs like ProRes, we may need to add some data-levels settings to the proposed render-presets. Something like:

- proxy_settings:
    - data_levels: ["inherit", "video", "full"] 
    - filters:
        - data_levels_whitelist: ["full", "video", "auto"]

Additional Resources

closes #200

in03 commented 2 years ago

A lot of research going into this issue when I have the time!

Check out: https://www.thepostprocess.com/2019/09/24/how-to-deal-with-levels-full-vs-video/

So far it seems like the issue is inconsistent across projects. It's not an 8-bit / 10-bit issue like I originally thought. My bad. Internally generated 8-bit footage has its data levels interpreted correctly, but superwhites and superblacks are clamped. It appears that proxies generated through ffmpeg are being given video levels when they should be full. Likely one of our cameras has had its levels set to full (0-255) instead of video (16-235), so the ranges actually written differ from the metadata, and that's why it's inconsistent? Just a guess. Will continue testing and see.

in03 commented 2 years ago

TL;DR

Alright! Finally figured it out. It looks like Resolve is interpreting input files correctly. If a camera is set to use full-range levels (0-255) with 10-bit, Resolve interprets the colours correctly; if the levels are set to limited (16-235), the colours are interpreted incorrectly. This isn't Resolve's fault: it's reading the metadata correctly, the metadata is just not helping.

Ffmpeg, on the other hand, is not interpreting the input levels correctly and sets them to limited every time. What also doesn't help is that Resolve will only interpret the levels of proxies as limited. To get around this, we can set the input levels manually using color_range as reported by ffprobe.

Finally, out_range should always be set to limited to match Resolve's expectations of proxies. So for 10-bit full-levels H.265 we can set -vf scale=in_range=full:out_range=limited. If the source file's levels were set incorrectly in metadata, or interpreted incorrectly by Resolve for some reason, we can get Proxima to render proxies that match the data levels set in Resolve's clip attributes.

The inconsistency I was experiencing was due to two cameras shooting with different levels settings. The one set to limited levels always matched the limited proxies. Bit obvi in hindsight.

Testing

I tested so many codecs and encoders: Resolve, Adobe Media Encoder, raw ffmpeg, Shutter Encoder, etc. I grepped and diffed the metadata, and the colour metadata was identical between files that Resolve interpreted differently. A very confusing time!

Encode

This is essentially the command Proxima was running under the hood in my tests. I used ProRes 422 Proxy instead of DNxHR SQ to ensure 10-bit at a low bitrate. I was concerned that ProRes's QuickTime container can't carry colour metadata and that this might explain the incorrect interpretation, but the 10/12-bit DNxHR variants were just as bad:

ffmpeg -y -hide_banner -stats -i "Desktop\infile.mov" -c:v dnxhd -profile:v dnxhr_sq -vf "scale=-2:720,scale=in_range=full:out_range=full,format=yuv422p" -c:a pcm_s16le -ar 48000 -movflags +write_colr "Desktop\outfile.mov"
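One thing worth noting about commands like this: when ffmpeg is given several -vf options for the same stream, only the last one takes effect, so the filters need to be chained with commas inside a single -vf. Below is a minimal sketch of assembling such a command from Python. The function name build_ffmpeg_args and its parameters are hypothetical and not taken from Proxima's code; the codec arguments simply mirror the command above.

```python
def build_ffmpeg_args(infile, outfile, in_range="full", out_range="limited", height=720):
    """Return an ffmpeg argument list that uses ONE chained -vf filtergraph.

    Hypothetical helper for illustration; not Proxima's actual implementation.
    """
    filters = [
        # Resize and convert data levels in a single scale filter.
        f"scale=-2:{height}:in_range={in_range}:out_range={out_range}",
        # Force the pixel format afterwards.
        "format=yuv422p",
    ]
    return [
        "ffmpeg", "-y", "-hide_banner", "-stats",
        "-i", infile,
        "-c:v", "dnxhd", "-profile:v", "dnxhr_sq",
        "-vf", ",".join(filters),  # comma-chained, so no filter is dropped
        "-c:a", "pcm_s16le", "-ar", "48000",
        "-movflags", "+write_colr",
        outfile,
    ]


args = build_ffmpeg_args("infile.mov", "outfile.mov")
print(args[args.index("-vf") + 1])
```

Passing the list to subprocess.run(args) would avoid shell quoting issues with paths that contain spaces.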

Check bits per Raw Sample

Just to ensure the bit-depth was being set correctly I tested files with this command:

ffprobe -loglevel panic -show_entries stream=bits_per_raw_sample -select_streams v "videofile.mov"

Check Y Min Max

This checks the minimum and maximum luminance levels for YUV. This was when I realised real-world levels were different between files that otherwise had identical metadata.


ffprobe -f lavfi movie="Desktop/infile.mov",signalstats -show_entries frame_tags=lavfi.signalstats.YMAX,lavfi.signalstats.YMIN
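The command above prints YMIN and YMAX for every frame, which gets long quickly. Here is a rough sketch of collapsing that output to a single overall range. It assumes ffprobe's default output format, where frame tags appear as lines like TAG:lavfi.signalstats.YMIN=16; the function name luma_extremes and the sample text are made up for illustration.

```python
import re

# Matches lines such as "TAG:lavfi.signalstats.YMIN=16" (default ffprobe writer).
TAG_PATTERN = re.compile(r"^TAG:lavfi\.signalstats\.(YMIN|YMAX)=(\d+)$")


def luma_extremes(ffprobe_output):
    """Return (lowest YMIN, highest YMAX) across all frames, or None if no tags are found."""
    ymin_values, ymax_values = [], []
    for line in ffprobe_output.splitlines():
        match = TAG_PATTERN.match(line.strip())
        if not match:
            continue  # skip [FRAME] markers and unrelated lines
        name, value = match.group(1), int(match.group(2))
        if name == "YMIN":
            ymin_values.append(value)
        else:
            ymax_values.append(value)
    if not ymin_values or not ymax_values:
        return None
    return min(ymin_values), max(ymax_values)


# Illustrative sample only, not real measurements.
sample = """[FRAME]
TAG:lavfi.signalstats.YMIN=4
TAG:lavfi.signalstats.YMAX=1002
[/FRAME]
[FRAME]
TAG:lavfi.signalstats.YMIN=9
TAG:lavfi.signalstats.YMAX=1019
[/FRAME]
"""
print(luma_extremes(sample))  # (4, 1019)
```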

Check Colour

I ran this on each file to check color_range, color_space and color_transfer were as expected.

ffprobe -v error -show_streams "Desktop/outfile.mov" | grep color_
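For scripting, grep is a bit fragile. A sketch of the same check in Python follows, assuming ffprobe is run with -of json -show_streams so the output can be parsed as JSON. The function name video_color_tags and the sample JSON are illustrative only. ffprobe can omit a field it cannot determine, so missing keys come back as None.

```python
import json

COLOR_KEYS = ("color_range", "color_space", "color_transfer")


def video_color_tags(ffprobe_json):
    """Return the colour tags of the first video stream, or None if there is no video stream."""
    data = json.loads(ffprobe_json)
    for stream in data.get("streams", []):
        if stream.get("codec_type") == "video":
            # Missing keys are reported as None rather than raising.
            return {key: stream.get(key) for key in COLOR_KEYS}
    return None


# Illustrative sample shaped like `ffprobe -v error -of json -show_streams` output.
sample_json = """
{
  "streams": [
    {"codec_type": "audio", "codec_name": "pcm_s16le"},
    {"codec_type": "video", "codec_name": "prores",
     "color_range": "tv", "color_space": "bt709"}
  ]
}
"""
print(video_color_tags(sample_json))
```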

Results

| Variant | Additional args | Bit Depth | Chroma SS | Observed Colour Range | Colour space, transfer | Colour range | Display Correct? |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Resolve DNxHR SQ | None | 8-bits | 4:2:2 | 15-245 | bt709 | TV | Yes |
| Ffmpeg ProRes 422 Proxy | None | 10-bits | 4:2:2 | 4-1019 | bt709 | TV | No |
| Ffmpeg DNxHR SQ | None | 8-bits | 4:2:2 | 0-255 | bt709 | PC | No |
| Adobe ProRes 422 Proxy | None | 10-bits | 4:2:2 | 4-1019 | bt709 | TV | Yes |
| Dan's Panasonic H.265 | None | 10-bits | 4:2:0 | 0-1023 | bt709 | PC | Yes |
| Dave's Panasonic H.265 | None | 10-bits | 4:2:0 | 0-1023 | bt709 | TV | No |
| Ffmpeg ProRes 422 Proxy (full, limited) | in_range=full, out_range=limited | 10-bits | 4:2:2 | 5-1013 | bt709 | TV | Yes |

After adding that last entry to the table, I figured out how the ffmpeg video filter settings in_range and out_range relate to the color_range metadata that gets reported. Bit depth and chroma subsampling have no pull on levels interpretation, and bt709 is consistent across the board. So I started another two tables:

Dan's Camera - colour_level = PC

| Variant | In Range | Out Range | Display Correct? | Group |
| --- | --- | --- | --- | --- |
| Ffmpeg ProRes 422 Proxy | Auto | Auto | No | 1 |
| Ffmpeg ProRes 422 Proxy | Auto | Limited | No | 1 |
| Ffmpeg ProRes 422 Proxy | Full | Limited | Yes | 0 |
| Ffmpeg ProRes 422 Proxy | Full | Full | No | 1 |
| Ffmpeg ProRes 422 Proxy | Limited | Auto | No | 1 |
| Ffmpeg ProRes 422 Proxy | Limited | Full | No | 2 |
| Ffmpeg ProRes 422 Proxy | Limited | Limited | No | 1 |

Dave's Camera - colour_level = TV

| Variant | In Range | Out Range | Display Correct? | Group |
| --- | --- | --- | --- | --- |
| Ffmpeg ProRes 422 Proxy | Auto | Auto | No | 1 |
| Ffmpeg ProRes 422 Proxy | Auto | Limited | No | 1 |
| Ffmpeg ProRes 422 Proxy | Full | Limited | Yes | 0 |
| Ffmpeg ProRes 422 Proxy | Full | Full | No | 1 |
| Ffmpeg ProRes 422 Proxy | Limited | Auto | No | 1 |
| Ffmpeg ProRes 422 Proxy | Limited | Full | No | 2 |
| Ffmpeg ProRes 422 Proxy | Limited | Limited | No | 1 |

I overlaid each shot on the others with the blending mode set to difference. Any interpretation that matched an existing one joined its group, hence the Group column. The only incorrect variant that didn't share the majority's interpretation was in_range=limited:out_range=full, which compounded the inaccurate interpretation by squashing the ranges even more.

Interestingly, setting levels to auto in these tests seemed to be equivalent to limited. I'm not sure what logic ffmpeg uses to determine levels.

Our only correct ffmpeg interpretation was in_range=full:out_range=limited. Setting in_range=limited for Dave's camera yielded matching, yet inaccurate results. After a little more testing I developed my proposal and started an implementation.

Proposal

If data levels are set to auto in Resolve, we ffprobe the file and find the color_range. If pc, set in_range=full. If tv, set in_range=limited.

If data levels are set specifically to full or video, we pull that from the clip and treat it like the color_range. If full, set in_range=full. If video, set in_range=limited.

Always set out_range=limited, unless codec preset specifies otherwise.
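The three rules above can be sketched as one small function. This is an illustration of the proposal, not Proxima's actual implementation: the names pick_scale_ranges and scale_filter are hypothetical, the clip data-level strings ("auto", "full", "video") are assumed from the discussion above, and raising an error when ffprobe reports no usable color_range is my own assumption, since the proposal doesn't cover that case.

```python
def pick_scale_ranges(clip_data_levels, probed_color_range=None, preset_out_range=None):
    """Choose (in_range, out_range) for ffmpeg's scale filter.

    clip_data_levels:   "auto", "full" or "video" (assumed Resolve clip values)
    probed_color_range: "pc" or "tv" from ffprobe, only consulted for "auto"
    preset_out_range:   optional override from a codec preset
    """
    levels = clip_data_levels.strip().lower()
    if levels == "full":
        in_range = "full"
    elif levels == "video":
        in_range = "limited"
    elif levels == "auto":
        # Fall back to what ffprobe reports for the source file.
        if probed_color_range == "pc":
            in_range = "full"
        elif probed_color_range == "tv":
            in_range = "limited"
        else:
            # Assumption: the proposal does not say what to do here.
            raise ValueError(f"unusable color_range: {probed_color_range!r}")
    else:
        raise ValueError(f"unknown data levels: {clip_data_levels!r}")

    # Always limited unless the codec preset specifies otherwise.
    out_range = preset_out_range or "limited"
    return in_range, out_range


def scale_filter(in_range, out_range):
    """Format the ranges as a scale filter string for -vf."""
    return f"scale=in_range={in_range}:out_range={out_range}"


print(scale_filter(*pick_scale_ranges("Auto", probed_color_range="pc")))
# scale=in_range=full:out_range=limited
```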

Conclusion

Confirmed data-level matching is working in the latest tests. Proxima now attempts to match Resolve's interpretation as closely as possible, even if the metadata is set incorrectly and Resolve's interpretation will be wrong. Incorrect interpretations can be fixed by setting data levels manually and re-encoding as needed to match the new settings (just as clip-attribute level orientation settings work).