fraunhoferhhi / vvdec

VVdeC, the Fraunhofer Versatile Video Decoder
https://www.hhi.fraunhofer.de/en/departments/vca/technologies-and-solutions/h266-vvc.html
BSD 3-Clause Clear License
454 stars 91 forks

Xrayleigh2000 develop #78

Closed xrayleigh2000 closed 2 years ago

xrayleigh2000 commented 2 years ago

Changes:

  1. Optimize the calculation of the deblocking filter parameters.
  2. Change the ALF parameters from a struct of arrays to an array of structs to improve data access.
  3. Perform CC-ALF for the two chroma components simultaneously (where possible), so that the luma calculation results can be reused.
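
The layout change in the ALF parameters can be illustrated with a minimal sketch. The type and field names below (AlfParamsSoA, CtuAlfParamsAoS, toAoS, countEnabledComponents) are hypothetical and are not the actual vvdec definitions; the sketch only shows why an array of structs keeps all values of one CTU next to each other in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical struct-of-arrays layout: every per-CTU field lives in its own
// array, so reading all fields of one CTU touches four separate memory regions.
struct AlfParamsSoA
{
  std::vector<uint8_t> lumaEnabled;
  std::vector<uint8_t> cbEnabled;
  std::vector<uint8_t> crEnabled;
  std::vector<int16_t> lumaFilterIdx;
};

// Hypothetical array-of-structs element: all fields of one CTU are contiguous,
// so a per-CTU filter stage usually finds them in the same cache line.
struct CtuAlfParamsAoS
{
  uint8_t lumaEnabled;
  uint8_t cbEnabled;
  uint8_t crEnabled;
  int16_t lumaFilterIdx;
};

// Convert the SoA representation into one struct per CTU.
inline std::vector<CtuAlfParamsAoS> toAoS( const AlfParamsSoA& soa )
{
  const std::size_t numCtus = soa.lumaEnabled.size();
  assert( soa.cbEnabled.size() == numCtus );
  assert( soa.crEnabled.size() == numCtus );
  assert( soa.lumaFilterIdx.size() == numCtus );

  std::vector<CtuAlfParamsAoS> aos( numCtus );
  for( std::size_t i = 0; i < numCtus; ++i )
  {
    aos[i] = { soa.lumaEnabled[i], soa.cbEnabled[i], soa.crEnabled[i], soa.lumaFilterIdx[i] };
  }
  return aos;
}

// A per-CTU query that needs several fields at once reads a single struct.
inline int countEnabledComponents( const CtuAlfParamsAoS& p )
{
  return p.lumaEnabled + p.cbEnabled + p.crEnabled;
}
```

Whether this layout helps in practice depends on the access pattern: it pays off when a stage reads several fields of the same CTU together, as a per-CTU filter typically does.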

Tests:

  1. The conformance test was run with the command "make test-all", and all tests pass.
  2. Decoding time was measured for single-threaded and multi-threaded runs. For the 0/2/10/20 thread configurations, the average decoding time is 97.86%/97.37%/98.85%/99.23% of the previous decoding time.

adamjw24 commented 2 years ago

Nice work. Looks good to me for the most part.

A few comments:

  1. Could you rebase onto current master? We just pushed our current internal state so that the merge can go smoothly.
  2. Could you put the CtuAlfData into CtuData and remove Picture::m_ctuAlfData? The structure is memset to '0' at the beginning of slice parsing (CodingStructure::initStructData), so no need to have a constructor. All of the resize logic can be removed too.
  3. Rename readAlfCtuFilterIndex2 back to readAlfCtuFilterIndex. Or is there a reason behind the name change?
  4. Could you elaborate a bit on what you did in the LoopFilter? Just 1-2 sentences on which cases you are processing differently and how.
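
To illustrate point 2, here is a minimal sketch of per-CTU ALF data embedded in a per-CTU struct that is zeroed in bulk. All names (CtuAlfSketch, CtuSketch, resetCtuData, alfOf) are hypothetical stand-ins, not the vvdec types, and resetCtuData only imitates the kind of memset described for CodingStructure::initStructData. The point is that a trivially copyable struct needs no constructor if it is always zeroed before use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical per-CTU ALF data: plain members only, no constructor.
struct CtuAlfSketch
{
  uint8_t alfEnabled[3];
  uint8_t ccAlfFilterIdx[2];
  int16_t lumaFilterIdx;
};

// Hypothetical per-CTU container that embeds the ALF data directly,
// so no separate picture-level array has to be allocated or resized.
struct CtuSketch
{
  int          ctuAddr;
  CtuAlfSketch alf;
};

// memset is only well defined for trivially copyable types.
static_assert( std::is_trivially_copyable<CtuSketch>::value,
               "CtuSketch must stay trivially copyable to be zeroed with memset" );

// Stand-in for a per-slice reset: zero every CTU entry in one call.
inline void resetCtuData( std::vector<CtuSketch>& ctus )
{
  if( !ctus.empty() )
  {
    std::memset( ctus.data(), 0, ctus.size() * sizeof( CtuSketch ) );
  }
}

// The ALF data is reached through the CTU entry itself.
inline CtuAlfSketch& alfOf( std::vector<CtuSketch>& ctus, std::size_t ctuAddr )
{
  assert( ctuAddr < ctus.size() );
  return ctus[ctuAddr].alf;
}
```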

xrayleigh2000 commented 2 years ago

Hi adamjw24, I merged your pull request into my branch, which made the commit history a bit messy. Do I need to clean up and squash the commits?

In addition, the optimization of loopfilter is as follows:

  1. Move the conditional statements outside the loop to reduce computation inside the loop (ct == end && deriveBdStrngt; start != end)
  2. Evaluate the “ch == CHANNEL_TYPE_LUMA” check earlier to skip some chroma calculations
  3. Reuse the bs value of the TU boundary to avoid recomputing bs
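
The first two points follow a general pattern: hoisting loop-invariant conditions out of the loop. The sketch below uses hypothetical names (ChannelSketch, fillStrengthsNaive, fillStrengthsHoisted) and stand-in arithmetic. It is not the vvdec LoopFilter code and only shows that both versions produce the same result while the second evaluates each condition once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical channel type, standing in for a luma/chroma distinction.
enum ChannelSketch { kLumaSketch = 0, kChromaSketch = 1 };

// Before: both conditions are loop-invariant but are re-evaluated per edge.
inline void fillStrengthsNaive( const std::vector<int>& edgeCost, std::vector<int>& bs,
                                ChannelSketch ch, bool derive )
{
  assert( edgeCost.size() == bs.size() );
  for( std::size_t i = 0; i < bs.size(); ++i )
  {
    if( derive )
    {
      // Stand-in computation: luma edges get extra work, chroma edges do not.
      bs[i] = ( ch == kLumaSketch ) ? edgeCost[i] * 2 : edgeCost[i];
    }
  }
}

// After: the invariant checks run once, and the loop body is branch-free.
inline void fillStrengthsHoisted( const std::vector<int>& edgeCost, std::vector<int>& bs,
                                  ChannelSketch ch, bool derive )
{
  assert( edgeCost.size() == bs.size() );
  if( !derive )
  {
    return; // nothing to derive, bs is left untouched
  }
  const int scale = ( ch == kLumaSketch ) ? 2 : 1;
  for( std::size_t i = 0; i < bs.size(); ++i )
  {
    bs[i] = edgeCost[i] * scale;
  }
}
```

Compilers can sometimes perform this transformation themselves (loop unswitching), but doing it by hand makes the intent explicit and does not depend on optimizer heuristics.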

adamjw24 commented 2 years ago

It would be great if you could clean up the commits a bit.

xrayleigh2000 commented 2 years ago

I'll clean up the commits and submit them later.

adamjw24 commented 2 years ago

Good stuff. The LoopFilter really has an impact. Thanks. Merging.