Optimize things further

FALCONN-LIB / FFHT

Fast Fast Hadamard Transform

http://falconn-lib.org

Optimize things further #11

Closed (ilyaraz closed this issue 7 years ago)

ilyaraz commented 7 years ago

I wrote a new version of FFHT [1] that is roughly 1.5x to 2x faster in the timings below and does not require the array to be aligned. The results are for floats with AVX (doubles are not yet supported).

log_2 n	new code (s)	old code (s)
7	3.4598541260e-08	5.22722e-08
10	3.1082916260e-07	4.71749e-07
15	1.3809814453e-05	2.20371e-05
20	6.7438203125e-04	0.00113861
27	2.0852150000e-01	0.398605
30	2.0832067000e+00	4.0913

[1] https://github.com/ilyaraz/ffht_codegen
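For readers unfamiliar with the transform being timed, here is a minimal reference sketch of an in-place style fast Walsh-Hadamard transform in plain Python. It is not the FFHT implementation (which is generated AVX code); the function name `fht` and the choice to leave the result unnormalized are assumptions made for illustration only.

```python
def fht(values):
    """Return the unnormalized Walsh-Hadamard transform of `values`.

    Illustrative reference only: `fht` is a hypothetical name, and this
    scalar loop is not how the FFHT library computes the transform.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    # log_2(n) stages; stage h combines elements that are h apart.
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                # Butterfly: (a, b) -> (a + b, a - b)
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out
```

Each of the log_2 n stages touches every element once, so the cost is on the order of n log_2 n additions and subtractions. Applying the unnormalized transform twice returns the input scaled by n, which makes a convenient sanity check.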

ludwigschmidt commented 7 years ago

Very nice! Then we can get rid of the FHTHelper as well (https://github.com/FALCONN-LIB/FALCONN/blob/master/src/include/falconn/core/polytope_hash.h#L80).

To what extent is the code generated now? Could it potentially target AVX-512 for the new Core i9 CPUs? :-)

ilyaraz commented 7 years ago

Yes, you simply need to re-implement the first several functions in gen.py. The same applies to SSE and to doubles.
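To illustrate the general idea of a generator whose instruction-set-specific parts are isolated in a few small pieces, here is a hypothetical sketch. None of the names below (`Isa`, `emit_butterfly`, the constants) are taken from the real gen.py, whose structure is not shown in this thread; only the Intel intrinsic names are real. Unaligned loads and stores are used because the thread says alignment is not required, but whether the actual generator emits them is an assumption.

```python
from dataclasses import dataclass


# Hypothetical sketch: these names are NOT from the real gen.py.
@dataclass(frozen=True)
class Isa:
    vec_type: str  # C vector register type
    lanes: int     # elements per register
    load: str      # unaligned load intrinsic
    store: str     # unaligned store intrinsic
    add: str       # element-wise add intrinsic
    sub: str       # element-wise subtract intrinsic


# Retargeting means supplying a different table of primitives.
SSE_FLOAT = Isa("__m128", 4, "_mm_loadu_ps", "_mm_storeu_ps",
                "_mm_add_ps", "_mm_sub_ps")
AVX_FLOAT = Isa("__m256", 8, "_mm256_loadu_ps", "_mm256_storeu_ps",
                "_mm256_add_ps", "_mm256_sub_ps")
AVX_DOUBLE = Isa("__m256d", 4, "_mm256_loadu_pd", "_mm256_storeu_pd",
                 "_mm256_add_pd", "_mm256_sub_pd")
AVX512_FLOAT = Isa("__m512", 16, "_mm512_loadu_ps", "_mm512_storeu_ps",
                   "_mm512_add_ps", "_mm512_sub_ps")


def emit_butterfly(isa, lo, hi):
    """Emit C source for one vector butterfly: (lo, hi) -> (lo + hi, lo - hi).

    `lo` and `hi` are C pointer expressions to the two halves.
    """
    return "\n".join([
        "{",
        f"  {isa.vec_type} a = {isa.load}({lo});",
        f"  {isa.vec_type} b = {isa.load}({hi});",
        f"  {isa.store}({lo}, {isa.add}(a, b));",
        f"  {isa.store}({hi}, {isa.sub}(a, b));",
        "}",
    ])


if __name__ == "__main__":
    print(emit_butterfly(AVX512_FLOAT, "buf", "buf + 16"))
```

This sketch covers only butterflies between whole registers. Stages that combine elements inside a single register need shuffle or permute instructions, which differ more between instruction sets and are left out here.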

ilyaraz commented 7 years ago

Done in v1.1.