lu-zero closed 7 years ago
All the functions used by the current codebase are implemented. The code could be made faster by unrolling the 8x8 and larger block cases so that all the registers are used.
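To illustrate the unrolling idea, here is a minimal portable sketch, not the code under review. The averaging operation, the `row8` type (a stand-in for a vector register) and all function names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a vector register holding one 8-pixel row. */
typedef struct { uint8_t px[8]; } row8;

static row8 load8(const uint8_t *p) { row8 r; memcpy(r.px, p, 8); return r; }
static void store8(uint8_t *p, row8 r) { memcpy(p, r.px, 8); }

/* Rounded average of two rows, lane by lane. */
static row8 avg8(row8 a, row8 b) {
    for (int i = 0; i < 8; i++)
        a.px[i] = (uint8_t)((a.px[i] + b.px[i] + 1) >> 1);
    return a;
}

/* Rolled version: one row per iteration, only two "registers" live at a time. */
static void avg_8xh_rolled(const uint8_t *src, ptrdiff_t ss,
                           uint8_t *dst, ptrdiff_t ds, int h) {
    assert(h > 0);
    for (int y = 0; y < h; y++)
        store8(dst + y * ds, avg8(load8(dst + y * ds), load8(src + y * ss)));
}

/* Unrolled 8x8 version: every load is issued up front and each row gets
 * its own variable, so the compiler can keep all sixteen rows in registers. */
static void avg_8x8_unrolled(const uint8_t *src, ptrdiff_t ss,
                             uint8_t *dst, ptrdiff_t ds) {
    row8 s0 = load8(src),          s1 = load8(src + ss);
    row8 s2 = load8(src + 2 * ss), s3 = load8(src + 3 * ss);
    row8 s4 = load8(src + 4 * ss), s5 = load8(src + 5 * ss);
    row8 s6 = load8(src + 6 * ss), s7 = load8(src + 7 * ss);
    row8 d0 = load8(dst),          d1 = load8(dst + ds);
    row8 d2 = load8(dst + 2 * ds), d3 = load8(dst + 3 * ds);
    row8 d4 = load8(dst + 4 * ds), d5 = load8(dst + 5 * ds);
    row8 d6 = load8(dst + 6 * ds), d7 = load8(dst + 7 * ds);

    store8(dst,          avg8(d0, s0));
    store8(dst + ds,     avg8(d1, s1));
    store8(dst + 2 * ds, avg8(d2, s2));
    store8(dst + 3 * ds, avg8(d3, s3));
    store8(dst + 4 * ds, avg8(d4, s4));
    store8(dst + 5 * ds, avg8(d5, s5));
    store8(dst + 6 * ds, avg8(d6, s6));
    store8(dst + 7 * ds, avg8(d7, s7));
}
```

Both versions should produce identical output; whether the unrolled one is actually faster depends on the target and needs to be measured.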
The code is already on gerrit pending review.
Code accepted.
The filter bank can be used as is and loaded into vectors on demand. It should be possible to convert it to an array of vectors to avoid a round trip later.
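A rough sketch of the two options, in portable C rather than real intrinsics. The `vec_s16` type stands in for a 128-bit vector register, and the tap count, bank size, `demo_bank` values and function names are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TAPS   8  /* assumed number of taps per filter */
#define PHASES 4  /* assumed number of filters in the bank */

/* Hypothetical stand-in for a 128-bit register holding 8 x int16. */
typedef struct { _Alignas(16) int16_t lane[TAPS]; } vec_s16;

/* Illustrative scalar filter bank; the values are not from any real codec. */
static const int16_t demo_bank[PHASES][TAPS] = {
    { 0, 0, 0, 128,  0, 0, 0, 0 },
    { 0, 0, 0,  64, 64, 0, 0, 0 },
    { 0, 0, 0,  32, 96, 0, 0, 0 },
    { 0, 0, 0,   0, 128, 0, 0, 0 },
};

/* Option 1: keep the scalar bank and load one filter into a vector on demand. */
static vec_s16 load_filter(const int16_t bank[][TAPS], int phase) {
    vec_s16 v;
    memcpy(v.lane, bank[phase], sizeof v.lane);
    return v;
}

/* Option 2: convert the whole bank once into an array of aligned vectors,
 * so later lookups are plain indexed reads with no per-call conversion. */
static void bank_to_vectors(const int16_t bank[][TAPS], int n, vec_s16 *out) {
    for (int i = 0; i < n; i++)
        out[i] = load_filter(bank, i);
}

/* Apply one filter to TAPS consecutive samples (dot product). */
static int32_t apply_filter(const vec_s16 *f, const int16_t *src) {
    int32_t acc = 0;
    for (int k = 0; k < TAPS; k++)
        acc += f->lane[k] * src[k];
    return acc;
}
```

With option 2 the caller would index the converted array directly, for example `apply_filter(&vbank[phase], src)`, after a single `bank_to_vectors` call at init time.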
Some functions of the family can have specialized implementations that add constraints on block size and filter offsets. Some can be implemented efficiently only for certain block sizes.
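One way such constraints could be expressed is a small dispatcher that picks the specialized path only when its preconditions hold. This is only a sketch: the 2-tap blend, the "offset zero and width multiple of 8" constraint and all names are assumptions, not the actual functions discussed here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Generic path: any width, any fractional offset `frac` in sixteenths.
 * Reads src[0..w], so the source row needs w + 1 valid pixels. */
static void filter_row_generic(const uint8_t *src, uint8_t *dst, int w, int frac) {
    for (int x = 0; x < w; x++)
        dst[x] = (uint8_t)((src[x] * (16 - frac) + src[x + 1] * frac + 8) >> 4);
}

/* Specialized path: only valid for frac == 0 and w a multiple of 8,
 * where the filter degenerates to a copy done in 8-pixel chunks. */
static void filter_row_copy8(const uint8_t *src, uint8_t *dst, int w) {
    assert(w % 8 == 0);
    for (int x = 0; x < w; x += 8)
        memcpy(dst + x, src + x, 8);
}

/* Dispatcher: use the constrained fast path when possible, else fall back. */
static void filter_row(const uint8_t *src, uint8_t *dst, int w, int frac) {
    assert(frac >= 0 && frac < 16);
    if (frac == 0 && w % 8 == 0)
        filter_row_copy8(src, dst, w);
    else
        filter_row_generic(src, dst, w, frac);
}
```

The fallback keeps the function correct for every input, so the specialized version can stay small and make strong assumptions.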