FALCONN-LIB / FFHT

Fast Fast Hadamard Transform
http://falconn-lib.org

Optimize things further #11

Closed ilyaraz closed 7 years ago

ilyaraz commented 7 years ago

I wrote a new version of FFHT [1], which is quite a bit faster and does not require the array to be aligned. Here are the results for floats with AVX (doubles are not yet supported).

| log_2 n | new code (s)     | old code (s) |
|---------|------------------|--------------|
| 7       | 3.4598541260e-08 | 5.22722e-08  |
| 10      | 3.1082916260e-07 | 4.71749e-07  |
| 15      | 1.3809814453e-05 | 2.20371e-05  |
| 20      | 6.7438203125e-04 | 0.00113861   |
| 27      | 2.0852150000e-01 | 0.398605     |
| 30      | 2.0832067000e+00 | 4.0913       |

[1] https://github.com/ilyaraz/ffht_codegen
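
For readers who want a reference point for what is being timed, here is a minimal scalar sketch of an in-place, unnormalized fast Hadamard transform over an array of length n = 2^k. It is not the FFHT implementation and is not taken from the linked repository; the function name `fht` is made up for illustration, and there is no vectorization or cache blocking. It is the kind of thing one might use to check an optimized kernel against.

```python
def fht(a):
    """In-place, unnormalized fast Hadamard transform (Sylvester ordering).

    Hypothetical reference helper, not part of FFHT. `a` must be a mutable
    sequence of numbers whose length is a power of two.
    """
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Each pass combines blocks of size h into blocks of size 2 * h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a
```

Applying the transform twice should return the input scaled by n, which gives a cheap sanity check for any faster variant.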

ludwigschmidt commented 7 years ago

Very nice! Then we can get rid of the FHTHelper as well (https://github.com/FALCONN-LIB/FALCONN/blob/master/src/include/falconn/core/polytope_hash.h#L80).

To what extent is the code generated now? Could it potentially target AVX-512 for the new Core i9 CPUs? :-)

ilyaraz commented 7 years ago

Yes, you simply need to re-implement the first several functions in gen.py. The same applies to SSE and to doubles.
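
To illustrate the general idea of a generator whose instruction-set-specific part is confined to a few small pieces, here is a toy sketch. It does not reproduce gen.py: the `ISA` table, the `emit_butterfly` function, and the emitted C function name are all hypothetical. The only real names are the Intel intrinsics for unaligned float load/store and add/sub.

```python
# Hypothetical per-ISA description; NOT the actual structure of gen.py.
ISA = {
    "sse":    {"type": "__m128", "lanes": 4,  "prefix": "_mm"},
    "avx":    {"type": "__m256", "lanes": 8,  "prefix": "_mm256"},
    "avx512": {"type": "__m512", "lanes": 16, "prefix": "_mm512"},
}


def emit_butterfly(isa_name, func_name="butterfly"):
    """Return C source for one vector butterfly: (lo, hi) -> (lo + hi, lo - hi).

    Each call of the emitted function processes `lanes` floats from `lo`
    and `lanes` floats from `hi`, using unaligned loads and stores.
    """
    isa = ISA[isa_name]
    t, p, lanes = isa["type"], isa["prefix"], isa["lanes"]
    lines = [
        f"/* processes {lanes} floats per pointer */",
        f"static inline void {func_name}(float *lo, float *hi) {{",
        f"  {t} x = {p}_loadu_ps(lo);",
        f"  {t} y = {p}_loadu_ps(hi);",
        f"  {p}_storeu_ps(lo, {p}_add_ps(x, y));",
        f"  {p}_storeu_ps(hi, {p}_sub_ps(x, y));",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_butterfly("avx512"))
```

In a layout like this, supporting another instruction set means adding one table entry. A double-precision variant would additionally need a different register type and intrinsic suffix, which this sketch does not model.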

ilyaraz commented 7 years ago

Done in v1.1.