cd-athena / VCA

Video complexity analyzer
GNU General Public License v3.0

Add SIMD assembler functions. #8

Closed: ChristianFeldmann closed this 2 years ago

ChristianFeldmann commented 2 years ago

I got the asm files to compile using:

nasm -f win64 -DARCH_X86_64=1 -DHIGH_BIT_DEPTH=0 -DBIT_DEPTH=8 dct8.asm

Now we have to make this work with CMake.

ChristianFeldmann commented 2 years ago

I think everything is added. We will have to see about compilation on other systems, but at least on Windows using MSBuild it works for me so far. I ran some numbers and holy moly, this thing is fast: (image)
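For the open question of other systems, here is a minimal sketch of how the NASM invocation from the first comment might be generalized. It is not taken from the VCA build files: the `nasm_command` helper and the `NASM_FORMATS` table are hypothetical names, and the only thing that changes per platform is NASM's `-f` output format (`win64`, `elf64`, `macho64`). The `-D` defines are copied unchanged from the command above.

```python
import platform

# Assumption: map platform.system() names to 64-bit NASM output formats.
# Only "win64" was confirmed to work in this thread; the other two are the
# usual NASM formats for Linux and macOS and are untested here.
NASM_FORMATS = {
    "Windows": "win64",
    "Linux": "elf64",
    "Darwin": "macho64",
}


def nasm_command(asm_file, system=None):
    """Build the NASM argument list for a 64-bit x86 target (hypothetical helper)."""
    system = system or platform.system()
    if system not in NASM_FORMATS:
        raise ValueError(f"no NASM output format known for {system!r}")
    return [
        "nasm",
        "-f", NASM_FORMATS[system],
        # Defines copied from the command in the first comment.
        "-DARCH_X86_64=1",
        "-DHIGH_BIT_DEPTH=0",
        "-DBIT_DEPTH=8",
        asm_file,
    ]


if __name__ == "__main__":
    # Print the command for the current machine instead of running it.
    print(" ".join(nasm_command("dct8.asm")))
```

macOS builds may also need a symbol-prefix define for the assembly to link; that is not handled in this sketch.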

ChristianFeldmann commented 2 years ago

So this seems to work. However, I don't think it would directly work with 16-bit transforms too. But maybe we never want to do that anyway and should just always do an 8-bit DCT, because the additional bits don't give us more precision.
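To illustrate the "always do an 8-bit DCT" idea, here is a rough sketch in plain Python. It is a slow reference DCT, not the SIMD code, and not VCA's actual implementation. The helpers `dct_2d`, `to_8bit` and `ac_energy` are hypothetical names, and the AC sum is only a stand-in for whatever complexity measure is computed from the coefficients. The sketch assumes high-bit-depth samples are reduced to 8 bits by dropping the low-order bits before the transform.

```python
import math

N = 8  # block size, matching the 8x8 DCT discussed above


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (slow reference version)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    def cos_term(i, k):
        return math.cos((2 * i + 1) * k * math.pi / (2 * N))

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += block[y][x] * cos_term(y, u) * cos_term(x, v)
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def to_8bit(block, bit_depth):
    """Drop the extra low-order bits so every sample fits in 0..255."""
    shift = bit_depth - 8
    return [[sample >> shift for sample in row] for row in block]


def ac_energy(coeffs):
    """Sum of absolute AC coefficients (all coefficients except DC)."""
    return sum(abs(c) for row in coeffs for c in row) - abs(coeffs[0][0])


if __name__ == "__main__":
    # A made-up 10-bit test block; real input would come from a video frame.
    block10 = [[(37 * x + 11 * y) % 1024 for x in range(N)] for y in range(N)]
    e10 = ac_energy(dct_2d(block10))
    e8 = ac_energy(dct_2d(to_8bit(block10, 10)))
    # Because the DCT is linear, e10 / 4 and e8 differ only by the effect
    # of the two dropped bits per sample.
    print(e10 / 4, e8)
```

Since the DCT is linear, the 10-bit coefficients are the 8-bit coefficients scaled by 4, plus a small term from the two dropped bits per sample. Whether that term is negligible for the complexity measure depends on the content and would need to be checked on real data.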