fraunhoferhhi / vvenc

VVenC, the Fraunhofer Versatile Video Encoder
https://www.hhi.fraunhofer.de/en/departments/vca/technologies-and-solutions/h266-vvc.html
BSD 3-Clause Clear License

Rate control has problems with still picture encoding #93

Closed rafael2k closed 3 years ago

rafael2k commented 3 years ago

When encoding still images at a low bitrate (80 kbit per picture, SD resolution), everything works if the image has low entropy: the encoded picture matches the size specified by "-b" (with -r 1). If the image has high entropy, the encoder chooses a very bad QP (e.g. 49), and instead of 10 kbytes the encoded picture is only 3 or 4 kbytes (and obviously has awful quality).

Also, I don't really know the correct syntax for "--profile"; I always get an error when I try to use it: Error parsing option "profile" with argument "main10", or Error parsing option "profile" with argument "main10_stillpic".

adamjw24 commented 3 years ago

The rate control algorithms were not designed for single image coding. The QP adaptation is done on a per-image basis, not per-CTU, which would be better suited for this use case. Did you try the 2-pass rate control? It has a somewhat more stable QP selection model.

Either way, we'll have a look into this.

About the profile issue. Please try with an underscore between main and 10, i.e. "main_10".

rafael2k commented 3 years ago

Thanks Adam. Yes, I tried the 2-pass rc - no luck. Is there a way to change this behavior of the qp adaptation?

adamjw24 commented 3 years ago

No, we removed the old code. It seems there is no quick fix right now. We will put still picture RC on the agenda, but I wouldn't expect results too soon.

For now, if you feel like doing a workaround, maybe try this at the application level: do a few encodings with a faster preset to find the QP, then encode with that final QP in your desired preset (this should mostly improve quality and only reduce the rate slightly). It should not add too much runtime overhead if you use medium or slow(er) for the final encode.
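A rough sketch of what such an application-level QP search could look like (my own sketch, not an official recipe; file names and limits are example values, and -q/--qp is assumed to be the usual vvencapp option for fixing the QP):

```bash
#!/bin/bash
# Probe QPs with the "faster" preset until the output exceeds the byte budget,
# then reuse the last fitting QP for the final encode in the desired preset.
IN=image-840x628.yuv        # example input (single 4:2:0 frame)
SIZE=840x628
BUDGET=10000                # target size in bytes
QP=50

while [ "$QP" -gt 20 ]; do
    vvencapp -i "$IN" -s "$SIZE" -r 1 -c yuv420 --preset faster -q "$QP" -o probe.vvc
    BYTES=$(stat -c %s probe.vvc)   # GNU stat; use "stat -f %z" on BSD/macOS
    if [ "$BYTES" -gt "$BUDGET" ]; then
        QP=$((QP + 1))              # step back to the last QP that fit
        break
    fi
    QP=$((QP - 1))
done

# Final encode with the selected QP in the target preset.
vvencapp -i "$IN" -s "$SIZE" -r 1 -c yuv420 --preset medium -q "$QP" -o out.vvc
```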

rafael2k commented 3 years ago

For now I'll use this workaround and run the encoder in an iterative way to find the QP which gives me the final size I need. Thanks! Btw, does the reference implementation have RC logic that works better for still pictures?

adamjw24 commented 3 years ago

About the reference SW, I think so, but I'm not sure. Either way, the VTM rate control is poor and should not be used, and I think you're still faster overall with rate screening using the faster preset and the final encoding in your chosen setting.

The only reason to use VTM right now would be for the full 444 support.

rafael2k commented 3 years ago

Thanks Adam. We'll use your VVC implementation. Still images in 4:2:0 look fine at very low bitrates, no problem. Today I'll commit the iterative bash encoding script implementation here: https://github.com/DigitalHERMES/uucomp/blob/main/uucomp/compress_image.sh

crhelmrich commented 3 years ago

Hi Rafael,

I'm one of the developers of the 2-pass rate control algorithm and currently testing an improvement for still image coding. You wrote that (emphasis mine) "If the image is of high entropy, the encoder chooses a very bad qp (49 for eg), and instead of 10 kbytes, the encoded picture is 3 or 4 kbytes (and obviously has an awful quality)." Would it be possible for you to share that input image so I can evaluate the behavior of my modifications in that particular scenario?

Thanks,

Christian

rafael2k commented 3 years ago

Dear Christian, sure. A 4:2:0 YUV: http://www.abradig.org.br/vvc/image-840x628.yuv

vvencapp command line used:

vvencapp -i image-840x628.yuv --profile main_10_still_picture --qpa 1 -t 2 -r 1 -b 80000 -s 840x628 --preset medium -c yuv420 -o out.vvc

Encoded output size: 3195 bytes (QP 49)

crhelmrich commented 3 years ago

Thanks very much! This image is truly an extreme case and very hard to encode with only 80 kbit (10 kbytes). What supports this observation is that, as an example, encoding this image in AVIF format in Gimp 2.10 at the same bit-rate requires setting the "Quality" slider to 4 out of 100, i.e. almost at the low end of the quality scale.

Anyway, I can confirm that the current rate control algorithm greatly underestimates the bit-rate requirements on some low-rate encoded still images. For the performance of the current VVenC development version towards release 1.2, see the attached picture, top right (QP 50, 2.5 kbytes). The bottom row shows you an estimate (one of two possible variants) of how VVenC 1.2 will perform (QP 45, roughly 9.5 kbytes) with the following two changes:

  1. a modification of the rate control algorithm for still image coding, which we will publish here on GitHub as soon as possible,
  2. running the encoder twice. With the modifications from 1., the resulting bit-rate increases from 2.5 to roughly 7.1-7.4 kbytes, which is still a bit off due to the fact that the rate control is primarily intended for video coding. However, you can run VVenC a second time in 2-pass rate control mode, with the -b 80000 changed to 80000 * 80000 / (8 * byte-size of the first encoding result), i.e., -b 108108 for a first output size of 7.4 kbytes. That seems to result in quite accurate rate matching with the modified rate control algorithm in my own experiments (it does not, however, seem to work as well with the existing VVenC 1.1 behavior).
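As a quick check of that formula (my arithmetic, not part of the original comment): the first output of 7.4 kbytes corresponds to 7400 * 8 = 59200 bits, and 80000 * 80000 / 59200 ≈ 108108 bits, which matches the -b 108108 above.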

[attached comparison image: image_840x628_issue93]

I'll notify you here once the relevant update to the VVenC source code has been made available.

Christian

rafael2k commented 3 years ago

The improvements you're describing sound wonderful, Chris. I'll re-run my tests as soon as you commit them. I'm still not sure how to run the encoder the second time, but as soon as you commit the changes I'll adapt my script to match them. My plan is to use constant bitrate as a "first pass" encoding, and if the QP is too high, allow a higher file size with a smaller QP, starting from the constant-bitrate QP. Cheers!

rafael2k commented 3 years ago

Wonderful! One question: how do I use the two-pass mode with the statistics file? Is it just a matter of setting passes 1 and 2 and pointing both runs at the same statistics file path?

adamjw24 commented 3 years ago

Exactly. Of course you have to set passes to 2.

adamjw24 commented 3 years ago

Or rather: you have to set passes to 2, and for each run you set pass first to 1 and then to 2. Both calls need the RC stats file path set.
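Putting that description into concrete commands might look roughly like this (a sketch only; the exact option spellings --passes, --pass and --rcstatsfile are assumptions based on the comments above and should be checked against the vvencapp help output):

```bash
# First pass: collect rate control statistics.
vvencapp -i image-840x628.yuv -s 840x628 -r 1 -c yuv420 -b 80000 \
         --passes 2 --pass 1 --rcstatsfile rcstats.json -o out.vvc

# Second pass: same call, same stats file, pass set to 2.
vvencapp -i image-840x628.yuv -s 840x628 -r 1 -c yuv420 -b 80000 \
         --passes 2 --pass 2 --rcstatsfile rcstats.json -o out.vvc
```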

rafael2k commented 3 years ago

Hi Christian, the new rate control works much better. Setting the same bitrate for the first and second pass with very low bitrate settings, I usually get the bitrate on target, or up to 70% above it, but usually not below, which is way better than before. If I use a slower preset, can the accuracy of the rate control logic improve?

adamjw24 commented 3 years ago

Reopening, since some issues still exist

rafael2k commented 3 years ago

Thanks Adam. I updated my still image encoding script to use the new rate control logic: https://github.com/DigitalHERMES/uucomp/blob/main/scripts/compress_image.sh I'll publish my full corpus and tests with it soon.

adamjw24 commented 3 years ago

Great, looking forward to the results.

As for now, we'd appreciate any additional info you could give us (problematic source images, exact commands etc.).

crhelmrich commented 3 years ago

> New rate control works much better. Setting the same bitrate for the first and second pass with very low bitrate settings, I usually get the bitrate on target, or up to 70% above it, but usually not below, which is way better than before. If I use a slower preset, can the accuracy of the rate control logic improve?

In my own experiments (using 4:2:0 YUV versions of the Kodak test set from http://r0k.us/graphics/kodak/ and step 2 in my post above, https://github.com/fraunhoferhhi/vvenc/issues/93#issuecomment-919320342), choosing different speed presets had very little effect on the overall rate matching performance. I got good results with the default medium preset.

Christian

rafael2k commented 3 years ago

Christian, I tried applying the method you suggested, but the first pass outputs no file; I get "no frames encoded". I got it working by running a first pass without "--pass", and then a second pass with "--pass 2", and indeed, the final file size is pretty accurate!

rafael2k commented 3 years ago

With some inputs I still get a 30%+ error relative to the set rate. I put my small target material here: http://164.41.155.66/rafael/hermes_picture_tests/images/

crhelmrich commented 3 years ago

Well, a discrepancy of around 30% for some still images coded with VVenC is still possible and, in my view, acceptable. After all, the ideal approach is "roughly constant quality", which is exactly what you get by using single-pass encoding with a certain QP of your choice and activated perceptual optimization (QPA).
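Such a "roughly constant quality" encode is simply a single-pass call with a fixed QP and QPA enabled; something along these lines (the QP value is only an example of mine, --qpa 1 is the option already used earlier in this thread, and -q/--qp is assumed to be the usual option for fixing the QP):

```bash
# Single-pass, fixed-QP still image encode with perceptual QP adaptation.
vvencapp -i kodim01_768x512_BT709_P420.yuv -s 768x512 -r 1 -c yuv420 \
         -q 37 --qpa 1 --preset medium -o kodim01.266
```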

For clarification: above I proposed the following approach, where you - if you get a significant rate mismatch - call the simple encoder app twice on the input image (here, kodim01_768x512_BT709_P420.yuv), with the same output file name (here, kodim01.266):

  1. Call vvencapp.exe -s 768x512 -b 123456 -f 1 -r 1 -i d:\Kodak\kodim01_768x512_BT709_P420.yuv -o kodim01.266
  2. Get the byte-size of the resulting output file kodim01.266, multiply it by 8, and save the result e.g. as FS.
  3. If FS is at least 10% larger or 10% smaller than your target rate (here, 123456 bits), then execute step 4:
  4. Call vvencapp.exe -s 768x512 -b TARGET -f 1 -r 1 -i d:\Kodak\kodim01_768x512_BT709_P420.yuv -o kodim01.266

where TARGET equals the square of the initial target rate (here, 123456) divided by FS.

For example, if in step 2 above you got a value for FS which is 25% larger than your target rate (here, 154320), then TARGET should equal 98765.
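For convenience, the same four-step recipe as a small shell sketch (a sketch under the assumptions above: vvencapp in the PATH, GNU stat for the file size, and the example file names and target rate from the commands above):

```bash
#!/bin/bash
IN=kodim01_768x512_BT709_P420.yuv   # example input from above
OUT=kodim01.266
TARGET=123456                       # target rate in bits

# Step 1: first encode at the nominal target rate.
vvencapp -s 768x512 -b "$TARGET" -f 1 -r 1 -i "$IN" -o "$OUT"

# Step 2: size of the result in bits (FS).
FS=$(( $(stat -c %s "$OUT") * 8 ))  # GNU stat; use "stat -f %z" on BSD/macOS

# Steps 3 and 4: if FS misses the target by more than 10%, re-encode
# with TARGET' = TARGET^2 / FS.
if [ "$FS" -lt $(( TARGET * 90 / 100 )) ] || [ "$FS" -gt $(( TARGET * 110 / 100 )) ]; then
    NEW_TARGET=$(( TARGET * TARGET / FS ))
    vvencapp -s 768x512 -b "$NEW_TARGET" -f 1 -r 1 -i "$IN" -o "$OUT"
fi
```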

Please let me know if this works for you, so we can close this issue.

Christian

rafael2k commented 3 years ago

Dear Christian, indeed, 30% more or less than the target is good enough, I agree. I implemented your heuristic, but at lower bitrates the difference of a single QP step is large, so it also does not work very well for the purpose of getting better image quality at a fixed bitrate. I decided to add a one-pass adaptive resolution adjustment logic in order to achieve a minimum QP (in my case I set 40) for a given constant bitrate, downscaling the image to get a higher QP (line 123 onwards at: https://github.com/DigitalHERMES/uucomp/blob/main/scripts/compress_image.sh ). Results with reconstructed material are here: http://164.41.155.66/rafael/hermes_picture_tests/reconstructed-vvc-10kqp/
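The core of that resolution-adaptation idea could be sketched roughly as follows (this is not the linked script; the way the chosen QP is pulled out of the encoder log is a placeholder that has to be adapted to the actual vvencapp output, and ffmpeg is just one way to rescale the raw 4:2:0 frame):

```bash
#!/bin/bash
SRC=image-840x628.yuv
SRC_W=840; SRC_H=628
BITRATE=80000
MAX_QP=40            # acceptable QP threshold, as in the description above

IN=$SRC; W=$SRC_W; H=$SRC_H
while [ "$W" -ge 64 ]; do
    vvencapp -i "$IN" -s "${W}x${H}" -r 1 -c yuv420 -b "$BITRATE" \
             --preset medium -o out.vvc > encode.log 2>&1
    # Placeholder: extract the QP chosen by rate control from the log;
    # the pattern must be adapted to the real vvencapp output format.
    QP=$(grep -i -oE 'QP *[0-9]+' encode.log | head -n1 | grep -oE '[0-9]+')
    if [ -n "$QP" ] && [ "$QP" -le "$MAX_QP" ]; then
        break
    fi
    # Halve the resolution (kept even for 4:2:0) and rescale the raw frame.
    W=$(( W / 4 * 2 )); H=$(( H / 4 * 2 ))
    ffmpeg -y -f rawvideo -pix_fmt yuv420p -video_size "${SRC_W}x${SRC_H}" -i "$SRC" \
           -vf "scale=${W}:${H}" -f rawvideo -pix_fmt yuv420p scaled.yuv
    IN=scaled.yuv
done
```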

I think this issue (I wouldn't call it a problem) will only be fully solved with adaptive QP per CTU. Please go ahead and close this issue, and thanks a lot for all the insights into your VVC implementation!

crhelmrich commented 3 years ago

> at lower bitrates the difference of a single QP step is large, so it also does not work very well for the purpose of getting better image quality at a fixed bitrate. I decided to add a one-pass adaptive resolution adjustment logic in order to achieve a minimum QP (in my case I set 40) for a given constant bitrate, downscaling the image to get a higher QP

That sounds like a very good idea to me (but I assume you meant "to get a lower rate" in your statement above) and, as you say, it is probably the best image quality-vs.-rate tradeoff. Since you seem to have something working now, I'll close this issue.

Thanks for reporting, and especially for the http://164.41.155.66/rafael/hermes_picture_tests/ results!

Christian

rafael2k commented 3 years ago

Thank you Christian!