xiph / rav1e

The fastest and safest AV1 encoder.
BSD 2-Clause "Simplified" License

Full 4:2:2/4:4:4 support #970

Open rzumer opened 5 years ago

rzumer commented 5 years ago

4:2:2 and 4:4:4 chroma sampling are supported only in a degraded configuration. The following features should be fixed and re-enabled once bugs are resolved:

ycho commented 5 years ago

I think these are 'TODO' tasks, not bugs? Also, the term 'restoration' in video or image compression and processing generally means restoring a lost signal, or even creating new information that does not exist (in the super-resolution realm). I believe you meant 'reconstruction' rather than 'restoration'.

rzumer commented 5 years ago

@ycho: Features are disabled because of bugs when enabling 4:2:2/4:4:4, hence the label. If unimplemented portions for 4:2:2/4:4:4 are known, then they should be documented and panic with unimplemented!(), but until then there are only guesses and hints that I gathered, and it is not obvious what is left to do and where in order to complete the support, so I consider it a group of bugs.

I added the "enhancement" tag because it does not affect usage with 4:2:0 and they (4:2:2/4:4:4) are still functional or supposed to be functional in the degraded state, so it depends on your perspective.

As you can see in the encoder/sequence parameters, restoration is disabled when not using 4:2:0.
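The gating described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SequenceTools` and `sequence_tools` are hypothetical names invented for this example, not rav1e's actual sequence parameter code.

```rust
/// Chroma sampling formats discussed in this issue (illustrative enum).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ChromaSampling {
    Cs420,
    Cs422,
    Cs444,
}

/// Hypothetical subset of sequence-level tool flags.
#[derive(Debug, PartialEq)]
struct SequenceTools {
    enable_restoration: bool,
}

/// Assumption for illustration: restoration is only switched on for 4:2:0,
/// mirroring the degraded configuration described in this thread.
fn sequence_tools(cs: ChromaSampling) -> SequenceTools {
    SequenceTools {
        enable_restoration: cs == ChromaSampling::Cs420,
    }
}
```

With this shape, `sequence_tools(ChromaSampling::Cs444).enable_restoration` is `false`, so the 4:4:4 and 4:2:2 paths never reach the restoration code.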

ycho commented 5 years ago

Well, yes, it depends on perspective. I see that "// FIXME: inter unsupported with 4:2:2 and 4:4:4 chroma sampling" is from you, so it is the bug that you want fixed (I mean, by someone who is able). :) I thought 4:2:2 and 4:4:4 should not work, since I never agreed to enabling them, but there is already a bunch of code, and even testers, in place! Regarding 'restoration', yes, I think I misread the second use of that term in the to-do list (i.e. "restoration (enabling causes decoding failure)"). But in "inter frames (enabling causes faulty restoration output...", I am not sure what you meant.

rzumer commented 5 years ago

> I see that "// FIXME: inter unsupported with 4:2:2 and 4:4:4 chroma sampling" is from you, so it is the bug that you want fixed (I mean, by someone who is able). :)

I don't think it is a high priority so I am not expecting someone else to look at it; the comment is a note for anyone who may look at the code. There are likely multiple things missing to enable inter frames with non-4:2:0 sampling. In my commit history you can see I fixed a few issues related to 4:2:2/4:4:4, but I have not had time to look at it since then to fix the remaining ones, including inter frames (even intra-only does not work), restoration and disabled prediction modes.

> I thought 4:2:2 and 4:4:4 should not work, since I never agreed to enabling them, but there is already a bunch of code, and even testers, in place!

Do you mean that you would prefer to disable 4:2:2/4:4:4 completely until they are fully supported? They were enabled in this degraded configuration back in February.

> Regarding 'restoration', yes, I think I misread the second use of that term in the to-do list (i.e. "restoration (enabling causes decoding failure)"). But in "inter frames (enabling causes faulty restoration output...", I am not sure what you meant.

Oops, you are right, that one was a typo.

ycho commented 5 years ago

> Do you mean that you would prefer to disable 4:2:2/4:4:4 completely until they are fully supported? They were enabled in this degraded configuration back in February.

No, I mean I assumed that neither non-4:2:0 chroma format path would work correctly. Whether the necessary but unfinished parts are marked as to-dos or as bugs, I am fine with leaving the current working path in place. I believe they will eventually work. :)

rzumer commented 5 years ago

Regarding the inter frame issue I have tried disabling prediction modes and partition types, but the first desync consistently occurs for nyan.y4m converted to 4:4:4 on the right border of the 3rd superblock row (272x160 or 272x128 block offset).

Encoded/decoded block sizes vary based on what I disable but here is a sample diff at this location:

< 32 32
---
> 8 32
> 8 32
> 8 32
> 8 32

The decoding output detects rectangle partitions even when there are/should be none.

It further desyncs a few blocks later and the diff is quite large.

Decoding error occurs about halfway through (in terms of number of blocks).

rzumer commented 5 years ago

As mentioned before by @tdaede, there was a missing decimation factor scaling operation in write_tx_tree(), which is used in inter coding. After adding that operation, motion_compensate() was causing the encoder to panic due to negative offsets of transform blocks that seem to be hardcoded for 4:2:0. I do not have much background on that, so I will read the specification to try to fix it correctly. Naively removing the offsets (for 4:4:4) solves the panics, but the decoding failure remains, so there may be other areas that need to be corrected.
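For readers unfamiliar with the term, "decimation factor scaling" refers to converting luma-plane coordinates to chroma-plane coordinates by shifting with the per-axis subsampling amount. The sketch below shows the idea only; the enum and the `decimation` and `luma_to_chroma` helpers are written for this example and are not copied from write_tx_tree() or motion_compensate().

```rust
/// Chroma sampling formats discussed in this issue (illustrative enum).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ChromaSampling {
    Cs420,
    Cs422,
    Cs444,
}

impl ChromaSampling {
    /// Returns (xdec, ydec): the log2 horizontal and vertical chroma decimation.
    fn decimation(self) -> (usize, usize) {
        match self {
            ChromaSampling::Cs420 => (1, 1), // halved in both directions
            ChromaSampling::Cs422 => (1, 0), // halved horizontally only
            ChromaSampling::Cs444 => (0, 0), // full resolution
        }
    }
}

/// Scales a luma-plane pixel position to the matching chroma-plane position.
/// Code that hardcodes `>> 1` on both axes is only right for 4:2:0.
fn luma_to_chroma(x: usize, y: usize, cs: ChromaSampling) -> (usize, usize) {
    let (xdec, ydec) = cs.decimation();
    (x >> xdec, y >> ydec)
}
```

For the luma position (272, 160), this gives (136, 80) for 4:2:0, (136, 160) for 4:2:2 and (272, 160) for 4:4:4, which is why an offset computed with 4:2:0 assumptions can land in the wrong place, or go negative after subtraction, for the other formats.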

ycho commented 5 years ago

I believe this is happening with speed 0, right? As I mentioned, a quick and temporary workaround to support 4:2:2 and 4:4:4 would be to avoid sub-8x8 partition sizes, so that the "if p > 0 && bsize < BlockSize::BLOCK_8X8 {" block in motion_compensate() is never entered.
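One way to see why that condition is tied to 4:2:0: whether a chroma block needs the special sub-8x8 treatment depends on the subsampling of each axis, not only on the luma block size. The sketch below is an assumption based on my reading of how AV1 decoders commonly express this check; `chroma_needs_sub8x8_handling` is a hypothetical helper, not a function in rav1e.

```rust
/// Returns true when a chroma block would fall below the 4-sample minimum
/// on at least one axis, so chroma has to be predicted jointly with
/// neighbouring luma blocks.
///
/// `luma_w`/`luma_h` are the luma block dimensions in pixels;
/// `xdec`/`ydec` are the log2 chroma decimation factors (0 or 1).
fn chroma_needs_sub8x8_handling(
    luma_w: usize,
    luma_h: usize,
    xdec: usize,
    ydec: usize,
) -> bool {
    (luma_w == 4 && xdec == 1) || (luma_h == 4 && ydec == 1)
}
```

Under this check, a 4x4 block needs special handling in 4:2:0 but never in 4:4:4, and in 4:2:2 a 4x8 block does while an 8x4 block does not. A test on the block size alone cannot distinguish these cases.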

rzumer commented 5 years ago

@ycho All my recent tests use -s 10 and disable every possible coding tool, as far as I am aware, except the most basic. I also tried breaking down blocks to 4x4 only but rolled back due to other issues (like you mentioned, sub-8x8 partitions have many issues for inter mode non-4:2:0) and went the other way (i.e. 64x64 and 32x32 on edges).

rzumer commented 5 years ago

Tested encoding nyan.y4m sampled in 4:4:4 and processed with ffmpeg using -vf "scale=NxN" where N is 64, 128, 192 or 256.

At 64, 128 and 256, the output is decoded correctly and there is no desync with a single inter frame (and almost all coding tools disabled). At 192, the second frame is not decodable, as usual.

dav1d can manage to decode the file (albeit with desyncs). Attached is the encoded file and decoded output. nyan192.tar.gz

barrbrain commented 4 years ago

Now only sub-8x8 inter blocks are still excluded. This, along with the tiling accounting issue in #2212, means that non-4:2:0 subsamplings are not yet on par with 4:2:0.