lisamelton / other_video_transcoding

Other tools to transcode videos.

Wrong coloration using videotoolbox with Dolby Vision #203

Open zcutlip opened 3 months ago

zcutlip commented 3 months ago

I'm having a heck of a problem with certain types of input that have Dolby Vision, specifically profile 8.1.

I don't think this is a bug with other-transcode, but more likely either ffmpeg, or videotoolbox. I apologize if this forum isn't appropriate, but I'm hoping we have enough expertise here that someone has some insights.

When I transcode the video, which is UHD and has DV profile 8.1, and I use --vt, the reds are washed out and appear orange. Additionally, many scenes flicker between a washed-out orange and a really-washed-out yellowish.

If I instead do software encoding using --x265, things look as expected (to my untrained eye).

I have other Dolby Vision source material using other DV profiles that doesn't appear to have any problems (again, to my untrained eye).

I've reproduced the problem using some of Dolby's sample videos, such as this one: https://media.developer.dolby.com/DolbyVision_Atmos/mp4/P81_GlassBlowing2_3840x2160%4059.94fps_15200kbps_fmp4.mp4

If I use ffprobe to compare the video stream of the hardware- and software-encoded outputs, here's the diff:

diff --git a/glassblowing_stream0_hardware.json b/glassblowing_stream0_software.json
index 1ca8e15..4c4a950 100644
--- a/glassblowing_stream0_hardware.json
+++ b/glassblowing_stream0_software.json
@@ -12,7 +12,7 @@
   "coded_height": 2160,
   "closed_captions": 0,
   "film_grain": 0,
-  "has_b_frames": 0,
+  "has_b_frames": 2,
   "sample_aspect_ratio": "1:1",
   "display_aspect_ratio": "16:9",
   "pix_fmt": "yuv420p10le",
@@ -21,7 +21,7 @@
   "color_space": "bt2020nc",
   "color_transfer": "smpte2084",
   "color_primaries": "bt2020",
-  "chroma_location": "left",
+  "chroma_location": "topleft",
   "field_order": "progressive",
   "refs": 1,
   "r_frame_rate": "19001/317",
@@ -29,7 +29,7 @@
   "time_base": "1/1000",
   "start_pts": 50,
   "start_time": "0.050000",
-  "extradata_size": 112,
+  "extradata_size": 2552,
   "disposition": {
     "default": 1,
     "dub": 0,
@@ -53,13 +53,13 @@
   "tags": {
     "HANDLER_NAME": "Bento4 Video Handler",
     "VENDOR_ID": "[0][0][0][0]",
-    "ENCODER": "Lavc60.31.102 hevc_videotoolbox",
-    "BPS": "8132415",
+    "ENCODER": "Lavc60.31.102 libx265",
+    "BPS": "8099559",
     "DURATION": "00:02:55.008683333",
     "NUMBER_OF_FRAMES": "10490",
-    "NUMBER_OF_BYTES": "177904713",
+    "NUMBER_OF_BYTES": "177185965",
     "_STATISTICS_WRITING_APP": "mkvpropedit v83.0 ('Circle Of Friends') 64-bit",
-    "_STATISTICS_WRITING_DATE_UTC": "2024-03-13 17:21:17",
+    "_STATISTICS_WRITING_DATE_UTC": "2024-03-13 17:35:30",
     "_STATISTICS_TAGS": "BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES"
   },
   "side_data_list": [

Here's ffprobe's report of the input video stream:

{
  "index": 0,
  "codec_name": "hevc",
  "codec_long_name": "H.265 / HEVC (High Efficiency Video Coding)",
  "profile": "Main 10",
  "codec_type": "video",
  "codec_tag_string": "hev1",
  "codec_tag": "0x31766568",
  "width": 3840,
  "height": 2160,
  "coded_width": 3840,
  "coded_height": 2160,
  "closed_captions": 0,
  "film_grain": 0,
  "has_b_frames": 3,
  "sample_aspect_ratio": "1:1",
  "display_aspect_ratio": "16:9",
  "pix_fmt": "yuv420p10le",
  "level": 153,
  "color_range": "tv",
  "color_space": "bt2020nc",
  "color_transfer": "smpte2084",
  "color_primaries": "bt2020",
  "chroma_location": "topleft",
  "refs": 1,
  "id": "0x1",
  "r_frame_rate": "60000/1001",
  "avg_frame_rate": "60000/1001",
  "time_base": "1/60000",
  "start_pts": 3003,
  "start_time": "0.050050",
  "duration_ts": 10500490,
  "duration": "175.008167",
  "bit_rate": "15138826",
  "extradata_size": 23,
  "disposition": {
    "default": 1,
    "dub": 0,
    "original": 0,
    "comment": 0,
    "lyrics": 0,
    "karaoke": 0,
    "forced": 0,
    "hearing_impaired": 0,
    "visual_impaired": 0,
    "clean_effects": 0,
    "attached_pic": 0,
    "timed_thumbnails": 0,
    "non_diegetic": 0,
    "captions": 0,
    "descriptions": 0,
    "metadata": 0,
    "dependent": 0,
    "still_image": 0
  },
  "tags": {
    "language": "und",
    "handler_name": "Bento4 Video Handler",
    "vendor_id": "[0][0][0][0]",
    "encoder": "HEVC Coding"
  },
  "side_data_list": [
    {
      "side_data_type": "DOVI configuration record",
      "dv_version_major": 1,
      "dv_version_minor": 0,
      "dv_profile": 8,
      "dv_level": 9,
      "rpu_present_flag": 1,
      "el_present_flag": 0,
      "bl_present_flag": 1,
      "dv_bl_signal_compatibility_id": 1
    }
  ]
}

Like I said, I feel like this is probably a bug in either ffmpeg or videotoolbox itself, but I'm hoping someone has some insights here and possibly ideas for workarounds.
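The ffprobe comparison earlier in this post can also be scripted, which helps when checking several encodes. This is a minimal Python sketch, assuming each stream was saved as JSON from ffprobe; the helper names and the list of ignored tags are made up for illustration, and the sample values are copied from the diff above.

```python
# Tags that differ between any two encodes and say nothing about color handling.
# This ignore list is an assumption chosen for illustration.
NOISY_TAGS = {"BPS", "NUMBER_OF_BYTES", "_STATISTICS_WRITING_DATE_UTC"}


def flatten(node, prefix=""):
    """Flatten nested dicts and lists into {"a.b.0.c": value} pairs."""
    items = {}
    if isinstance(node, dict):
        for key, value in node.items():
            items.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            items.update(flatten(value, f"{prefix}{index}."))
    else:
        items[prefix.rstrip(".")] = node
    return items


def diff_streams(first, second, ignore=NOISY_TAGS):
    """Return {path: (first_value, second_value)} for fields that differ."""
    a, b = flatten(first), flatten(second)
    changed = {}
    for path in sorted(set(a) | set(b)):
        if path.rsplit(".", 1)[-1] in ignore:
            continue
        if a.get(path) != b.get(path):
            changed[path] = (a.get(path), b.get(path))
    return changed


# Only the fields that appear in the diff above are reproduced here.
hardware = {
    "has_b_frames": 0,
    "chroma_location": "left",
    "extradata_size": 112,
    "tags": {"ENCODER": "Lavc60.31.102 hevc_videotoolbox", "BPS": "8132415"},
}
software = {
    "has_b_frames": 2,
    "chroma_location": "topleft",
    "extradata_size": 2552,
    "tags": {"ENCODER": "Lavc60.31.102 libx265", "BPS": "8099559"},
}

for path, (hw, sw) in diff_streams(hardware, software).items():
    print(f"{path}: {hw!r} -> {sw!r}")
```

With real files, the two dictionaries would come from `json.load` on the saved ffprobe output instead of being typed in by hand.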

ffmpeg version: 6.1.1
other-transcode version: 0.12.0
macOS version:
ProductName: macOS
ProductVersion: 14.2
BuildVersion: 23C64

loshlee commented 3 months ago

I wonder whether your ffmpeg version is too old or whether you aren't using the latest version (12 or "0.12.0") of other_video_transcoding. Anyway, I ran the following on your linked sample (with name shortened to "GlassBlowing.mp4"). The output looked splendid and the file size was reduced from 345 to 142 megabytes.

other-transcode --mp4 --target 4000 ~/Downloads/GlassBlowing.mp4

zcutlip commented 3 months ago

Oh sorry, I should have put my version information earlier in the post. Here it is again:

ffmpeg version: 6.1.1
other-transcode version: 0.12.0
macOS version:
ProductName: macOS
ProductVersion: 14.2
BuildVersion: 23C64

Did you encode it with videotoolbox, using the --vt option?

Here's the command I'm using:

other-transcode --crop auto --debug --vt --hevc --10-bit glassblowing.mp4

edit: If I use x265 with the following command, it looks great

other-transcode --crop auto --debug --x265 --10-bit glassblowing.mp4

zcutlip commented 3 months ago

I wonder whether your ffmpeg version is too old or whether you aren't using the latest version (12 or "0.12.0") of other_video_transcoding. Anyway, I ran the following on your linked sample (with name shortened to "GlassBlowing.mp4"). The output looked splendid and the file size was reduced from 345 to 142 megabytes.

other-transcode --mp4 --target 4000 ~/Downloads/GlassBlowing.mp4

Ah, I see what happened in your case. The above command uses h264_videotoolbox and resizes down to 1080p, so no wonder it looks good for you. Unfortunately, this means the output isn't UHD and has no HDR or Dolby Vision.
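A quick way to tell the two kinds of output apart is to look at a few fields in ffprobe's stream report. This is a rough Python sketch, not part of other-transcode: the function name and the thresholds are assumptions, and the second dictionary holds hypothetical values for a 1080p H.264 encode rather than real ffprobe output.

```python
def summarize_stream(stream):
    """Report basic UHD and HDR10-style signaling from an ffprobe stream dict."""
    return {
        # Width is used rather than height because cropping can reduce height.
        "uhd": stream.get("width", 0) >= 3840,
        "ten_bit": "p10" in stream.get("pix_fmt", ""),
        "pq_transfer": stream.get("color_transfer") == "smpte2084",
        "bt2020_primaries": stream.get("color_primaries") == "bt2020",
    }


# Values taken from the input report earlier in the thread.
uhd_input = {
    "codec_name": "hevc",
    "width": 3840,
    "pix_fmt": "yuv420p10le",
    "color_transfer": "smpte2084",
    "color_primaries": "bt2020",
}

# Hypothetical values for a downscaled 8-bit H.264 encode (illustration only).
hd_output = {
    "codec_name": "h264",
    "width": 1920,
    "pix_fmt": "yuv420p",
    "color_transfer": "bt709",
    "color_primaries": "bt709",
}

for name, stream in (("input", uhd_input), ("1080p output", hd_output)):
    print(name, summarize_stream(stream))
```

These flags only describe what the file signals. They do not show whether the picture was tone-mapped correctly, so a visual check is still needed.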

loshlee commented 3 months ago

I was thinking that you wanted .mp4, but I must have imagined that. When I use mpv (installed using the --cask option for Homebrew) for playback, the mkv output of your "--crop auto --debug --vt --hevc --10-bit" other-transcode command looks just like the .mp4 original. The latest version of IINA on my machine, however, just displays a black window, although it does show a minimal preview in the transport window when I manipulate the slider, which strikes me as very odd. Usually IINA and mpv can play everything including the kitchen sink, but IINA gives me the same black window for the mp4 original, and in that case it does not show the preview in the transport window either. Sorry I couldn't provide more help. Maybe Sean or Lisa or someone else can.

samhutchins commented 3 months ago

To the best of my knowledge, ffmpeg doesn't support passing through the Dolby Vision metadata with videotoolbox. :-(
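One way to check this on a given file is to look for the DOVI configuration record in ffprobe's JSON output (for example from `ffprobe -show_streams -of json`). This is a minimal Python sketch; the function name is made up for illustration, and writing profile 8 as "8.x" with the base-layer compatibility ID as the suffix is the common naming convention, not something ffprobe prints itself.

```python
def dolby_vision_label(stream):
    """Return a label such as "8.1" for an ffprobe stream dict, or None."""
    for side_data in stream.get("side_data_list", []):
        if side_data.get("side_data_type") != "DOVI configuration record":
            continue
        profile = side_data["dv_profile"]
        if profile == 8:
            # Assumed convention: profile 8 variants are named after the
            # base-layer signal compatibility ID (1 is the HDR10-compatible one).
            compat = side_data.get("dv_bl_signal_compatibility_id", 0)
            return f"{profile}.{compat}"
        return str(profile)
    return None  # no Dolby Vision configuration record found


# Side data copied from the input report earlier in the thread.
source_stream = {
    "side_data_list": [
        {
            "side_data_type": "DOVI configuration record",
            "dv_profile": 8,
            "dv_level": 9,
            "rpu_present_flag": 1,
            "el_present_flag": 0,
            "bl_present_flag": 1,
            "dv_bl_signal_compatibility_id": 1,
        }
    ]
}

print(dolby_vision_label(source_stream))
print(dolby_vision_label({"codec_name": "hevc"}))
```

Running the same function on the stream reports of the --vt and --x265 outputs would show whether either encode still carries the record.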

zcutlip commented 3 months ago

To the best of my knowledge, ffmpeg doesn't support passing through the Dolby Vision metadata with videotoolbox. :-(

I wonder if this is documented somewhere that you know of (ffmpeg's documentation can be hit or miss)? The reason I ask is that it raises even more questions.

As far as I know, videotoolbox has handled Dolby Vision just fine with my profile 7 content. I'm only encountering issues with profile 8.1.

If true, I'm curious whether it's a shortcoming in ffmpeg or in videotoolbox itself. And if the issue is in ffmpeg, I wonder if there's an issue tracking it.

For now I can stick with x265. I'm fortunate enough to have a 2019 Intel Mac Pro but even with 32 threads, it only transcodes at around 0.45x for UHD :-(
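To put the 0.45x figure in perspective, encode time is simply source duration divided by speed. A small Python sketch of that arithmetic follows; the function name is made up, the clip duration is the one ffprobe reported for the sample, and the two-hour film is a hypothetical example.

```python
def estimated_encode_seconds(duration_seconds, speed):
    """Estimate wall-clock encode time from source duration and encode speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return duration_seconds / speed


sample_clip = 175.008167   # seconds, from the ffprobe report of the sample
feature_film = 2 * 60 * 60  # seconds, hypothetical two-hour film

for label, duration in (("sample clip", sample_clip), ("two-hour film", feature_film)):
    seconds = estimated_encode_seconds(duration, 0.45)
    print(f"{label}: about {seconds / 60:.1f} minutes")
```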