lu-zero / x264

My experiments on the x264 codebase
GNU General Public License v2.0

Make sure the altivec/vsx functions are faster than C #3

Open lu-zero opened 7 years ago

lu-zero commented 7 years ago

Some functions are noticeably slower than C according to checkasm.

sasshka commented 6 years ago

Benchmarked with checkasm --bench

malvanos commented 5 years ago

@sasshka What machine are you using?

On POWER9 with gcc 7.3.0 on Ubuntu 18.04, running in a VM, I see the AltiVec versions of these functions running slower than, or no faster than, the C versions:

quant_2x2_dc_c: 77
quant_2x2_dc_altivec: 85
satd_4x4_c: 128
satd_4x4_altivec: 136
intra_satd_x3_4x4_c: 439
intra_satd_x3_4x4_altivec: 439
zigzag_scan_8x8_frame_c: 125
zigzag_scan_8x8_frame_altivec: 129
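
To make the comparison easier to read, here is a rough sketch that computes the altivec/C ratio for each pair in the numbers above. It assumes every result is a `name_impl: value` line and that lower values are faster. The helper names `parse_bench` and `not_faster_than_c` are made up for this example and are not part of checkasm or x264.

```python
import re

# Numbers copied from the comment above (lower is assumed to be faster).
BENCH = """
quant_2x2_dc_c: 77
quant_2x2_dc_altivec: 85
satd_4x4_c: 128
satd_4x4_altivec: 136
intra_satd_x3_4x4_c: 439
intra_satd_x3_4x4_altivec: 439
zigzag_scan_8x8_frame_c: 125
zigzag_scan_8x8_frame_altivec: 129
"""


def parse_bench(text):
    """Return {function: {impl: value}} for lines shaped like 'name_impl: N'."""
    results = {}
    pattern = r"^(\w+)_(c|altivec):\s*(\d+)$"
    for m in re.finditer(pattern, text, re.MULTILINE):
        name, impl, value = m.group(1), m.group(2), int(m.group(3))
        results.setdefault(name, {})[impl] = value
    return results


def not_faster_than_c(results):
    """List (function, altivec/C ratio) where altivec is not faster than C."""
    flagged = []
    for name, impls in results.items():
        if "c" in impls and "altivec" in impls:
            ratio = impls["altivec"] / impls["c"]
            if ratio >= 1.0:
                flagged.append((name, ratio))
    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, ratio in not_faster_than_c(parse_bench(BENCH)):
        print(f"{name}: altivec/C = {ratio:.3f}")
```

A ratio above 1.0 means the altivec version takes longer than C; a ratio of exactly 1.0 (as for intra_satd_x3_4x4) means there is no gain from the vectorized code.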