xiph / rav1e

The fastest and safest AV1 encoder.
BSD 2-Clause "Simplified" License

Encoder statistics recording for bottom-up partition search is not feasible #2342

Open ycho opened 4 years ago

ycho commented 4 years ago

First of all, I thank the author for the encoder statistics recording feature, which is turned on with "-v". I found that, without additional save/restore code for mode decisions, it is impossible to gather statistics on the final coding modes for bottom-up partition search. The mode decision is made post-encoding: current rav1e only rewinds the entropy coder and neighboring contexts after the best mode is chosen, and the coding modes are not restored, since they have already been written to the bitstream. So, with the "-v" option at speed 0 or 1 (i.e. bottom-up partition search), the recorded statistics include the encoding statistics of the RDO trials as well.

Unlike rav1e, libaom re-encodes from the root of the superblock, so collecting the statistics of the final coding modes is easily feasible.
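
To make the problem concrete, here is a minimal sketch of what "save/restore code for mode decisions" could look like: the statistics are snapshotted and rolled back together with the coding state, so discarded RDO trials leave no trace in the counters. Every name here (`EncoderStats`, `CodingState`, `Encoder`, `search_8x8`) is hypothetical and chosen for illustration; none of it is rav1e's actual API, and the "cost" is a toy bit count rather than a real rate-distortion cost.

```rust
/// Hypothetical counters, not rav1e's real statistics type.
/// Index 0 = 4x4 blocks, 1 = 8x8 blocks, 2 = 16x16 blocks.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct EncoderStats {
    pub block_size_counts: [u64; 3],
}

/// Stand-in for the entropy coder and neighboring contexts.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct CodingState {
    pub bits_written: u64,
}

/// Snapshot of everything a trial encode may modify, stats included.
pub struct Checkpoint {
    state: CodingState,
    stats: EncoderStats,
}

#[derive(Debug, Default)]
pub struct Encoder {
    pub state: CodingState,
    pub stats: EncoderStats,
}

impl Encoder {
    pub fn checkpoint(&self) -> Checkpoint {
        Checkpoint { state: self.state.clone(), stats: self.stats.clone() }
    }

    /// Restore the coding state AND the statistics together.
    pub fn rollback(&mut self, cp: &Checkpoint) {
        self.state = cp.state.clone();
        self.stats = cp.stats.clone();
    }

    /// "Encode" one block: spend some bits and record its size.
    pub fn encode_block(&mut self, size_idx: usize, bits: u64) {
        self.state.bits_written += bits;
        self.stats.block_size_counts[size_idx] += 1;
    }
}

/// Toy bottom-up search for one 8x8 area: try four 4x4 blocks, then one
/// 8x8 block, and keep whichever trial spent fewer bits.
pub fn search_8x8(enc: &mut Encoder, bits_per_4x4: u64, bits_8x8: u64) {
    let start = enc.checkpoint();

    // Trial 1: four 4x4 blocks.
    for _ in 0..4 {
        enc.encode_block(0, bits_per_4x4);
    }
    let split_cost = enc.state.bits_written - start.state.bits_written;
    let after_split = enc.checkpoint();

    // Rewind, then trial 2: a single 8x8 block.
    enc.rollback(&start);
    enc.encode_block(1, bits_8x8);
    let whole_cost = enc.state.bits_written - start.state.bits_written;

    // Keep the cheaper trial; the loser's statistics vanish with it.
    if split_cost < whole_cost {
        enc.rollback(&after_split);
    }
}
```

If `rollback` restored only `state` and left `stats` alone, the counters would keep the four 4x4 entries from the discarded trial even when the 8x8 block wins, which is the kind of mismatch described above.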

ycho commented 4 years ago

FYI, I caught this case by coincidence: when a buggy patch of mine ran at speed 0, the -v output reported lots of 4x4s and 8x8s, while the bitstream rarely had them, as checked with a bitstream analyzer.

shssoichiro commented 4 years ago

Yeah, I noticed this was a possibility when implementing this. I didn't have proof that it was wrong, but speculated based on the code. However, I left it in because I still assumed it to be feasible and fixable later.

Question is what's the best thing to do with it? Disable stats recording if we are doing bottom-up encoding?

ycho commented 4 years ago

> Yeah, I noticed this was a possibility when implementing this. I didn't have proof that it was wrong, but speculated based on the code. However, I left it in because I still assumed it to be feasible and fixable later.
>
> Question is what's the best thing to do with it? Disable stats recording if we are doing bottom-up encoding?

Hey, thanks for replying so quickly. Yes, I think we should simply not support the feature when the partition search method is bottom-up. Unless rav1e has libaom's huge tree-style storage to remember every partition decision in an SB, I think there is no way to pick out the final decisions, i.e. the partition decisions. That information (the partition decisions) is only available from the decoder!
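
As a rough illustration of the tree-style storage idea, the sketch below keeps the final partition decisions of one superblock in a small tree and counts block sizes only by walking that tree after the search has finished. The `PartitionNode` type and `count_final_blocks` function are hypothetical names made up for this example; they do not describe libaom's or rav1e's actual data structures.

```rust
use std::collections::BTreeMap;

/// Hypothetical record of the final partition decisions in one superblock.
#[derive(Clone, Debug, PartialEq)]
pub enum PartitionNode {
    /// A block that is coded as-is; `size` is its width in pixels.
    Leaf { size: u32 },
    /// A block that is split further into child blocks.
    Split(Vec<PartitionNode>),
}

/// Walk the final tree and count how many blocks of each size were kept.
/// Trial encodes never enter the tree, so they are never counted.
pub fn count_final_blocks(node: &PartitionNode, counts: &mut BTreeMap<u32, u64>) {
    match node {
        PartitionNode::Leaf { size } => {
            *counts.entry(*size).or_insert(0) += 1;
        }
        PartitionNode::Split(children) => {
            for child in children {
                count_final_blocks(child, counts);
            }
        }
    }
}
```

For example, a 64x64 superblock split into three 32x32 leaves plus one 32x32 quadrant that is split again into four 16x16 leaves would yield a count of 3 for size 32 and 4 for size 16, regardless of how many other candidates the search tried along the way.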

shssoichiro commented 4 years ago

> That information (partition decisions) is only available from decoder!

This makes me think that it might even make sense to remove the statistics entirely from rav1e, and instead add a feature to aomanalyzer (or something) to display encode-wide statistics on decode. Maybe aomanalyzer isn't the right tool since currently it only decodes a few frames at a time. But something at decode-time.

ycho commented 4 years ago

Hi, I didn't really want to discourage or drop all these helpful, supporting features, because statistics collection on the encoder side is in fact a very crucial tool for improving video encoders. Before talking about the now-popular machine intelligence schemes, please recall how those hundreds of probabilities (CDFs, cumulative distribution functions) were acquired for AV1 and for all past standards like H.264 (I remember there were more than 400 CDFs). They were all collected from encoder-side statistics! So I really don't want to discourage an idea which has just begun, and I strongly believe we can devise some working method!
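
To illustrate how encoder-side symbol counts can be turned into a probability table, here is a minimal sketch that converts raw counts into a cumulative distribution scaled to a fixed total. The function name `counts_to_cdf` and the uniform fallback for empty counts are assumptions for this example; real codec training pipelines typically also enforce a minimum probability per symbol, which is omitted here.

```rust
/// Convert per-symbol counts into a cumulative distribution whose last
/// entry equals `total_scale` (for example 32768 for 15-bit precision).
/// If no symbols were observed, fall back to a uniform distribution.
pub fn counts_to_cdf(counts: &[u32], total_scale: u32) -> Vec<u32> {
    let total: u64 = counts.iter().map(|&c| u64::from(c)).sum();
    let n = counts.len() as u64;
    let scale = u64::from(total_scale);

    let mut cdf = Vec::with_capacity(counts.len());
    let mut running: u64 = 0;
    for (i, &c) in counts.iter().enumerate() {
        running += u64::from(c);
        let scaled = if total == 0 {
            // Uniform fallback: symbol i ends at (i + 1) / n of the range.
            (i as u64 + 1) * scale / n
        } else {
            running * scale / total
        };
        cdf.push(scaled as u32);
    }
    cdf
}
```

With counts `[1, 1, 2]` and a scale of 32768, the running sums 1, 2 and 4 out of 4 map to `[8192, 16384, 32768]`. Statistics polluted by discarded RDO trials would skew exactly these counts, which is why collecting them only from final decisions matters.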

ycho commented 4 years ago

If you can give me some time (like a few days) and hold off, I can carefully peruse the encoder stat collection code in top-down partition search and see whether fixing it looks like a big task or seems infeasible, as in bottom-up.