quietvoid / dovi_tool

dovi_tool is a CLI tool combining multiple utilities for working with Dolby Vision.
MIT License

Convert P8 to P7 MEL #44

Closed ZimuQin closed 3 years ago

ZimuQin commented 3 years ago

Hi! Is there any way to convert from Dolby Vision Profile 8 to Profile 7 MEL currently? I didn't find a way to do this.

quietvoid commented 3 years ago

There isn't a way. I'm not sure what purpose this would serve: profile 8 should be the same, except that the enhancement layer is not required.

ZimuQin commented 3 years ago

Thanks very much for your reply! I was just experimenting to make a dual-track P7 MEL mp4 file to see if it can play on my device using Dolby's mp4muxer. However, it seems like it was still recognized as P8. I don't know if this makes any sense to you.

quietvoid commented 3 years ago

Well if you used a profile 8 RPU then the flags for the presence of an enhancement layer are not enabled. Did you encode an enhancement layer yourself? It would have to be encoded the same as the base layer, which gets complicated since you have to make sure the frame types and order match.

ZimuQin commented 3 years ago

Well, I tried to encode EL, but maybe there's something wrong with the encoder parameters as you said. I just have one more question: Is there any data in the rpu file to state which profile it is?

quietvoid commented 3 years ago

If you use dovi_tool info it will show the profile. The RPU data doesn't state the profile explicitly; the profile is recognized from a combination of flags.

You can find more here: https://github.com/quietvoid/dovi_tool/blob/main/profiles.md
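As a rough illustration of the "combination of flags" idea, here is a minimal sketch that only separates P7 from P8 using the two header flags named later in this thread (el_spatial_resampling_filter_flag and disable_residual_flag). The `HeaderFlags` class and both function names are hypothetical, and the rule is a simplification rather than dovi_tool's actual logic, which considers more fields and more profiles (see profiles.md).

```python
from dataclasses import dataclass


@dataclass
class HeaderFlags:
    """Hypothetical container for two RPU header flags (not dovi_tool's type)."""
    el_spatial_resampling_filter_flag: bool
    disable_residual_flag: bool


def signals_enhancement_layer(header: HeaderFlags) -> bool:
    # Assumption: an EL is signalled when resampling is enabled
    # and the residual is not disabled.
    return (header.el_spatial_resampling_filter_flag
            and not header.disable_residual_flag)


def rough_profile_guess(header: HeaderFlags) -> int:
    # Simplified: only distinguishes profile 7 (EL signalled) from profile 8.
    return 7 if signals_enhancement_layer(header) else 8
```

With this simplification, a profile 8 RPU is the case where the EL flags are not enabled, so flipping the flags alone changes the guess but does not recreate the NLQ data an EL needs.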

ZimuQin commented 3 years ago

Thank you so much. I'll keep experimenting with your wonderful tool by myself.

ZimuQin commented 3 years ago

I managed to modify "disable_residual_flag" and "el_spatial_resampling_filter_flag" in the header, but when I use dovi_tool info, I get this error: "thread 'main' panicked at 'assertion failed: !reader.get()', src/dovi/rpu/vdr_dm_data.rs:172:17". So I guess some other parameters need to be altered as well. Apparently, it is much more complicated than I thought.

quietvoid commented 3 years ago

Yes, there is also the NLQ data to recreate.

When converting to P8: https://github.com/quietvoid/dovi_tool/blob/main/src/dovi/rpu/rpu_data.rs#L105
NLQ struct: https://github.com/quietvoid/dovi_tool/blob/main/src/dovi/rpu/vdr_rpu_data.rs#L24

There are profile 7 RPUs in the assets folder, maybe recreating from the MEL would work.

ZimuQin commented 3 years ago

Thank you! I'll keep experimenting.

quietvoid commented 3 years ago

Oh yeah, some of the files in assets are old and not parsed properly by info. You need to prepend two bytes to the file and edit it so that it starts with 00 00 00 01 19 08.
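A minimal sketch of one literal reading of that advice: prepend two bytes, then overwrite the first six bytes with the expected prefix. The function names are hypothetical, and the sketch assumes the original leading bytes of an old asset are only header bytes that are safe to overwrite, which is not confirmed here; work on a copy of the file.

```python
EXPECTED_PREFIX = bytes.fromhex("000000011908")  # 00 00 00 01 19 08


def has_expected_prefix(data: bytes) -> bool:
    """Return True if the RPU file already starts with the expected bytes."""
    return data.startswith(EXPECTED_PREFIX)


def patch_prefix(data: bytes) -> bytes:
    """Prepend two bytes and force the first six bytes to the expected prefix.

    Assumption: the bytes being overwritten are header bytes only.
    """
    if has_expected_prefix(data):
        return data  # nothing to do
    padded = b"\x00\x00" + data
    return EXPECTED_PREFIX + padded[len(EXPECTED_PREFIX):]
```

For example, data beginning with 00 01 19 08 becomes 00 00 00 01 19 08 followed by the unchanged remainder, and running the function a second time leaves the data as it is.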

ZimuQin commented 3 years ago

Following your advice, I've successfully recreated the NLQ data from an existing MEL to create a P7 RPU. However, the final file is not playable on my device. It shows no video, only audio, so I guess the EL is encoded incorrectly. (It may be too difficult for me to do this with x265.) Again, thank you so much for your great tool!

One small question: I'm a little confused about the "active_area" parameter, because on my device (Oppo UDP-203) it doesn't seem to be used to crop the video. If "active_area" is not set on an uncropped video with black bars, would there be any artifacts on the black bars or in the content area? Please tell me what is expected, as I didn't actually see anything wrong without the parameter on my device. Thank you!

quietvoid commented 3 years ago

For Blu-ray players, Dolby mentions that the active area metadata is ignored because the letterbox bars are used to display subtitles. Other players might show artifacts but only if there are non zero offsets and the video itself was cropped, as far as I know.

ZimuQin commented 3 years ago

That makes sense! Thank you for your patience.

ZimuQin commented 3 years ago

After a lot of playing with different parameters, I found that for P7 to work on my Oppo player, I have to set num_x_partitions_minus1 in the header to 2046 (I found the value in another MEL UHD Blu-ray) instead of the default value 0, for some reason. If it is set to zero, the player shows no video at all. I have no idea what is happening, though.

quietvoid commented 2 years ago

Someone else asked for this so I implemented it in 5ca27765d8f6a5011bc9bd5ff9ab25251c10b9cc

ZimuQin commented 2 years ago

Great work as always!

shroomM2 commented 2 years ago

@ZimuQin I'm also trying to create my own MEL and play it on an Oppo 203 clone, but I'm getting corruption upon playback. Were there any special settings you had to use when encoding the MEL?

ZimuQin commented 2 years ago

@shroomM2 Are you trying to encode EL by yourself or just inject the rpu into the BL? I found encoding EL very tricky without the use of commercial solutions, because you need to make sure the frame types of each frame in BL and EL are the same. I did somehow make it work on Oppo 203, but the compatibility is in question. I think it makes more sense to just use P8 without EL.

shroomM2 commented 2 years ago

@ZimuQin I want to encode the EL by myself, yeah.

I don't have a device that can play Profile 8.1 with TrueHD Atmos. Since I want to play this file on an Oppo 203 clone, P8 is (as far as I know) not supported.

My current source is a P8.1 in an MKV. With the help of quietvoid, I managed to track down the corruption - it happened because I used ffmpeg to extract the HEVC stream from this MKV. Now I used mkvextract and the decoding problems went away.

My current approach is:

When I encode the EL, I parse the BL, get the frame types, then generate a qpfile and feed it to x265 to encode a 1920x1080 all-black EL.
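As a minimal sketch of the all-black EL source described above, the following writes raw 10-bit 4:2:0 frames (yuv420p10le layout) with limited-range black, Y=64 and Cb=Cr=512. The function names and the output path are hypothetical, and this only covers producing the raw input, not the x265 settings needed to match the BL.

```python
def black_frame_yuv420p10le(width: int = 1920, height: int = 1080) -> bytes:
    """Build one limited-range black frame, planar Y/U/V, 16-bit little-endian."""
    if width % 2 or height % 2:
        raise ValueError("4:2:0 needs even dimensions")
    luma = (64).to_bytes(2, "little") * (width * height)
    chroma = (512).to_bytes(2, "little") * ((width // 2) * (height // 2))
    return luma + chroma + chroma  # Y plane, then U, then V


def write_black_clip(path: str, frame_count: int,
                     width: int = 1920, height: int = 1080) -> None:
    """Write frame_count identical black frames to a raw .yuv file."""
    frame = black_frame_yuv420p10le(width, height)
    with open(path, "wb") as out:
        for _ in range(frame_count):
            out.write(frame)
```

A file like this could then be passed to x265 as raw input (its --input-res, --input-depth and --fps options describe the raw stream) together with --qpfile. Each 1920x1080 frame is about 6 MB, so for full-length content it is probably better to pipe the frames than to store them.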

I actually produced a working PoC file, but I need to do more testing with full-length content.

ZimuQin commented 2 years ago

@shroomM2 As far as I know, Profile 8.1 in a TS container is supported on the Oppo 203. Re-muxing it to a TS file with tsMuxeR should be fine; that's what I've been doing. BTW, my approach to encoding the EL is similar to yours, but the problem is that I can't get certain frames to come out with the type stated in the qpfile. Maybe the way I got the frame types was wrong. I used ffprobe. How do you get all the frame types?

shroomM2 commented 2 years ago

Oh, I did not know that about the Profile 8.1 support, will check it out, thanks!

My current approach for getting the frame types is using ffmpeg and the showinfo filter. The problem is that it's slow, I'm looking into optimizing this.

The CLI is something like...

 ffmpeg -i input.mkv -filter:v "showinfo" -f null - 2>&1 | grep iskey > showinfo.txt

Which gets me this in the file...

[Parsed_showinfo_0 @ 0x7695840] n:   0 pts:      0 pts_time:0       pos:     9694 fmt:yuv420p10le sar:1/1 s:3840x2160 i:P iskey:1 type:I checksum:8819896B plane_checksum:[CB24DAF0 444562BB C0084BB1] mean:[64 512 512] stdev:[0.0 0.0 0.0]
[Parsed_showinfo_0 @ 0x7695840] n:   1 pts:     42 pts_time:0.042   pos:   107238 fmt:yuv420p10le sar:1/1 s:3840x2160 i:P iskey:0 type:B checksum:4086779D plane_checksum:[BB33DAF0 050D50ED C0084BB1] mean:[64 512 512] stdev:[0.0 0.0 0.0]
[Parsed_showinfo_0 @ 0x7695840] n:   2 pts:     83 pts_time:0.083   pos:   106674 fmt:yuv420p10le sar:1/1 s:3840x2160 i:P iskey:0 type:B checksum:5E86779D plane_checksum:[D933DAF0 050D50ED C0084BB1] mean:[64 512 512] stdev:[0.0 0.0 0.0]
[Parsed_showinfo_0 @ 0x7695840] n:   3 pts:    125 pts_time:0.125   pos:   106099 fmt:yuv420p10le sar:1/1 s:3840x2160 i:P iskey:0 type:B checksum:4086779D plane_checksum:[BB33DAF0 050D50ED C0084BB1] mean:[64 512 512] stdev:[0.0 0.0 0.0]
[Parsed_showinfo_0 @ 0x7695840] n:   4 pts:    167 pts_time:0.167   pos:   107804 fmt:yuv420p10le sar:1/1 s:3840x2160 i:P iskey:0 type:B checksum:8D79876D plane_checksum:[D933DAF0 0F0D60BD C0084BB1] mean:[64 512 512] stdev:[0.0 0.0 0.0]
[Parsed_showinfo_0 @ 0x7695840] n:   5 pts:    209 pts_time:0.209   pos:   105521 fmt:yuv420p10le sar:1/1 s:3840x2160 i:P iskey:0 type:P checksum:A7D9896B plane_checksum:[CB24DAF0 640562BB C0084BB1] mean:[64 512 512] stdev:[0.0 0.0 0.0]
...

I parse this with a Python script, making sure to write lowercase b for B-frames, since an uppercase B in the qpfile represents a referenced B-frame. So something like this...

0 I
1 b
2 b
3 b
4 b
5 P
6 b
7 b
8 b
9 b
10 P

It's possible my way is not OK either, since I only tested on ~5 minute files, will do more testing.
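A minimal sketch of such a parsing step, assuming the showinfo line format shown above. The function name is hypothetical and this is not the actual script used here; it only maps each `n:`/`type:` pair to a qpfile-style line, writing lowercase b for B-frames.

```python
import re

# Matches the frame index and picture type in a showinfo log line.
SHOWINFO_RE = re.compile(r"\bn:\s*(\d+)\b.*?\btype:([IPB])\b")


def showinfo_to_qpfile(lines):
    """Convert showinfo log lines to 'frame_number type' qpfile lines."""
    out = []
    for line in lines:
        match = SHOWINFO_RE.search(line)
        if match is None:
            continue  # skip lines that are not per-frame showinfo output
        frame_no = int(match.group(1))
        frame_type = match.group(2)
        if frame_type == "B":
            frame_type = "b"  # uppercase B would mean a referenced B-frame
        # Caveat: showinfo reports every intra picture as I, so an I here
        # does not tell IDR and CRA pictures apart.
        out.append(f"{frame_no} {frame_type}")
    return out
```

Usage would be along the lines of `open("qpfile.txt", "w").write("\n".join(showinfo_to_qpfile(open("showinfo.txt"))) + "\n")`.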

ZimuQin commented 2 years ago

@shroomM2 My way is similar. It did work on the Oppo 203, but I doubt it would work on other machines. You can use the Dolby Vision Verification Toolkit (https://developer.dolby.com/dolby-professional/professional-products/dolby-vision-verification-toolkit/) to check whether your streams are valid. I got a lot of frame type mismatches with my file.

Travillion commented 2 years ago

> @shroomM2 As far as I know, profile 8.1 in TS container is supported on Oppo 203. Just use tsmuxer to re-mux it to TS file would be fine. That's what I've been doing. BTW, my approach to encode EL is similar to yours, but the problem is that I can't get certain frames the same type as you stated in the qpfile. Maybe the way I got frame types was wrong. I used ffprobe. How do you get all the frame types?

I saw on another thread that ffprobe identifies multiple frame types as IDR frames, leading to many "false positives." Using the HEVC decoder through ffmpeg produces a more accurate frame identification, thus the method demonstrated by shroomM2 should work better than ffprobe.

ZimuQin commented 2 years ago

@Travillion ffmpeg also marks both IDR and CRA as type I keyframes, so it won't work as you expect. There is a very complicated way to get the correct frame types of all frames. However, the problem is that the BL stream was very likely encoded by a commercial encoder, and you are not able to create a compliant EL stream with x265 because of its limitations. Thus, I finally gave up on this, and now I am happy with Profile 8.1 playback.
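One possible direction for telling IDR and CRA apart, not necessarily the method referred to above, is to read the NAL unit types straight from the raw Annex B HEVC stream. A minimal sketch, assuming an elementary stream with start codes; the function names are hypothetical. It only classifies IRAP pictures, and separating P from B frames would additionally require parsing slice headers.

```python
# HEVC nal_unit_type values for IRAP pictures.
IRAP_NAMES = {
    16: "BLA_W_LP", 17: "BLA_W_RADL", 18: "BLA_N_LP",
    19: "IDR_W_RADL", 20: "IDR_N_LP", 21: "CRA_NUT",
}


def iter_nal_types(data: bytes):
    """Yield nal_unit_type for every NAL unit found after a start code."""
    pos = 0
    while True:
        pos = data.find(b"\x00\x00\x01", pos)
        if pos == -1 or pos + 3 >= len(data):
            return
        # nal_unit_type is bits 1..6 of the first NAL header byte.
        yield (data[pos + 3] >> 1) & 0x3F
        pos += 3


def irap_kinds(data: bytes):
    """Return the names of all IRAP NAL units, in stream order."""
    return [IRAP_NAMES[t] for t in iter_nal_types(data) if t in IRAP_NAMES]
```

A picture split into several slices produces several NAL units of the same type, so the result would still need to be grouped per picture before being compared against a qpfile.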

quietvoid commented 2 years ago

If you're brave enough you can also reencode with the benefits of the FEL. Using this: https://github.com/quietvoid/vs-nlq

Travillion commented 2 years ago

> @Travillion ffmpeg also marks both IDR and CRA as type I keyframes, so it won't work as you expected. There is some very complicated way to get the correct frame types of all frames. However, the problem is that the BL stream was very likely encoded by some commercial encoder, and you are not able to create a compliant EL stream using X265 because of the encoder limitation. Thus, I finally gave up on this and now I am happy with profile 8.1 playback.

Thank you for the clarification on ffmpeg and I frames. Can you point me in the direction of the method to correctly identify frame types? I haven't been able to find any methods besides ffprobe or ffmpeg, which are unreliable.

Travillion commented 2 years ago

> If you're brave enough you can also reencode with the benefits of the FEL. Using this: https://github.com/quietvoid/vs-nlq

Is there any documentation for this tool? I'm not sure if I'm brave enough, but I would like to look into it.

quietvoid commented 2 years ago

> Is there any documentation for this tool? I'm not sure if I'm brave enough, but I would like to look into it.

No, it's meant to be used as a VapourSynth plugin. I don't provide support for it.

Travillion commented 2 years ago

I'll start with VapourSynth then. Thanks!


quietvoid commented 2 years ago

> It seemed as though all the polynomial and MMR data needed to be applied to get the BL into a state that was ready for EL application.

That is correct. The polynomial and MMR steps are done prior to NLQ, through libplacebo, in my case using vs-placebo. That isn't included in the vs-nlq example. NLQ processing is also preferably done in full range (0-255).

As for reencoding, it does indeed lose most of the extra detail but the brightness/chroma benefits from the FEL are still there.

Currently the vs-nlq code seems to result in a slightly different color temperature, which I will investigate later. A couple of people are implementing the ETSI document, discussions are mostly in libplacebo's IRC channel.

quietvoid commented 2 years ago

Not sure, I haven't looked into it yet. There are many steps I'm skipping currently: the documented EL resampling, chroma siting. Here's my last attempt comparison: https://slow.pics/c/Phs22yhT

All of these were encoded losslessly. The noticeably different part is the white dress, where the NLQ filtered image is different.

Also, it is likely that the final metadata should not include polynomial and MMR mapping, since it was already done by filtering.

ghost commented 2 years ago

I'm intrigued by what might be possible with the hardware HEVC Main12 4:2:2 and 4:4:4 support that is supposed to be on Intel gen 12 processors. 4:2:2 is obviously great because the proper bit depth combined BL+FEL stream of a 4K title could be encoded and played back, something that everything I've tried just chokes to death on in software. Why the companies making $2000-6000 MSRP workstation GPUs (or even the $1000 MSRP desktop ones) haven't tacked this on to their feature set (and why Intel added a feature geared towards people who need a workstation processor that won't have an integrated GPU to its consumer line) is beyond me. 4:4:4 would be attractive for collapsing a 2K upscale UHD BL+FEL disc back down to the 2K resolution that it really is (while removing chroma subsampling altogether, aside from the extra 2 bits from the FEL being 4:2:0).

My reasoning behind this is that both of my 4k TVs do a better job of upscaling 1080p HEVC than the studio did on a huge percentage of the 2k upscaled UHD releases I've seen, and that's from 4:2:0 subsampled sources. Some companies (cough Disney/Marvel cough) have some releases that look like somebody saw how long a high quality upscaling algo was going to take to transcode and switched to bicubic or bilinear then stacked 4 unsharp masks on it because their temporal-aware filter was also slow when they'd have been better off using the mythical 1:1 BL:FEL variant allowed on a UHD (but that I've never actually spotted) and letting user hardware handle the upscale...

Meanwhile, $1000 MSRP GPUs still won't do anything except possibly Main10 4:4:4 in some models. I used to just hate chroma subsampling on general principle, since it should have been abandoned years ago outside of broadcast situations that need it (Main12 4:4:4 isn't really a massive size increase over regular HDR10 4:2:0), but seeing the HEVC mono profile called 4:0:0 subsampling really solidified that hate for me. Specifying that chroma channels don't exist should be as simple as calling it Mono, since there's an HEVC profile made for it, but the ridiculous notation had to go and make it even dumber sounding... :P

saindriches commented 2 years ago

> using the mythical 1:1 BL:FEL variant allowed on a UHD

Did you mean FHD? 1:1 BL:EL is only allowed in FHD. For HEVC, only the Main10 profile is used in current DV profiles. Have you tried making a BL+FEL with 4:2:2 or 4:4:4 to play back on a TV? It may cause problems.

HoffmannTom commented 2 years ago

Hello, I also have a BL+RPU file and want to convert it into a Blu-ray disc. How can I generate an EL / MEL? So far I have extracted the RPU and the BL. With the BL I can create a BD, but without the HDR information. Unfortunately, I don't know the MPEG and DV specifications by heart. Thanks in advance!

quietvoid commented 2 years ago

There's no way to generate a MEL as of today.