HISKP-LQCD / sLapH-contractions

Stochastic LapH contraction program
GNU General Public License v3.0

Phase factor in momentum uses exp too often #60

martin-ueding closed this issue 6 years ago

martin-ueding commented 6 years ago

Branch: phase-factor

The momentum phase factor in VdaggerV calls the exp function for each site on the lattice. Similar to the FFT, one might be able to rewrite exp(-i p x) = exp(-i p_x)^x and then call the exp function only once per direction. To go through the lattice, one just has to multiply the running phase factor with the cached value at each step.
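As a rough illustration of the idea (not the code in the phase-factor branch), here is a minimal one-dimensional sketch. It assumes the lattice momentum is 2π p / L with integer p and extent L; the function names `phases_direct` and `phases_hopping` are made up for this example. The first function calls the exponential once per site, the second computes the hopping factor once and then only multiplies.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

using Complex = std::complex<double>;

// Reference version: one exponential (via std::polar) per site.
// Assumption: momentum in lattice units is 2*pi*p/extent.
std::vector<Complex> phases_direct(int p, int extent) {
  assert(extent > 0);
  const double pi = std::acos(-1.0);
  std::vector<Complex> result(extent);
  for (int x = 0; x < extent; ++x) {
    result[x] = std::polar(1.0, -2.0 * pi * p * x / extent);
  }
  return result;
}

// Cached version: the exponential is evaluated once for the hopping
// factor exp(-i 2*pi*p/extent); every further site costs one complex
// multiplication.
std::vector<Complex> phases_hopping(int p, int extent) {
  assert(extent > 0);
  const double pi = std::acos(-1.0);
  const Complex hop = std::polar(1.0, -2.0 * pi * p / extent);
  std::vector<Complex> result(extent);
  Complex phase(1.0, 0.0);  // exp(0) at x = 0
  for (int x = 0; x < extent; ++x) {
    result[x] = phase;
    phase *= hop;
  }
  return result;
}
```

In three dimensions the same trick would apply per direction, since exp(-i p·x) factorizes into a product of one-dimensional phases.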

The exp function seems to use around 50 cycles (source).

martin-ueding commented 6 years ago

I have now implemented this in the extra branch. The tests pass, but on a 4^4 lattice this does not mean much, as the phase factors are just 1, i, -1 and -i. This should be tested on a larger lattice, and the performance should be measured.

martin-ueding commented 6 years ago

I have implemented a unit test using a very odd lattice size (14 × 26 × 34) that compares the old and the new version with each other. The phase factors agree up to 1e-14. The remaining difference is likely due to rounding errors in the repeated multiplications with the hopping phase factor.
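A comparison of this kind could look roughly like the sketch below. It is not the actual unit test from the repository; the function name `max_phase_deviation` and the convention that the momentum is 2π p_d / L_d per direction are assumptions for illustration. It walks over the whole spatial lattice, builds the phase once with a direct exponential and once by hopping, and records the largest absolute difference.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <complex>

using Complex = std::complex<double>;

// Largest |direct - hopping| over all sites of a three-dimensional lattice.
// Assumption: momentum component d in lattice units is 2*pi*p[d]/extent[d].
double max_phase_deviation(const std::array<int, 3> &p,
                           const std::array<int, 3> &extent) {
  const double pi = std::acos(-1.0);

  // One exponential per direction, cached for the whole sweep.
  std::array<Complex, 3> hop;
  for (int d = 0; d < 3; ++d) {
    hop[d] = std::polar(1.0, -2.0 * pi * p[d] / extent[d]);
  }

  double max_dev = 0.0;
  Complex phase_x(1.0, 0.0);
  for (int x = 0; x < extent[0]; ++x) {
    Complex phase_xy = phase_x;
    for (int y = 0; y < extent[1]; ++y) {
      Complex phase_xyz = phase_xy;
      for (int z = 0; z < extent[2]; ++z) {
        // Reference value with a fresh exponential at this site.
        const double arg =
            -2.0 * pi *
            (static_cast<double>(p[0]) * x / extent[0] +
             static_cast<double>(p[1]) * y / extent[1] +
             static_cast<double>(p[2]) * z / extent[2]);
        const Complex direct = std::polar(1.0, arg);
        max_dev = std::max(max_dev, std::abs(direct - phase_xyz));
        phase_xyz *= hop[2];  // hop one site in z
      }
      phase_xy *= hop[1];  // hop one site in y
    }
    phase_x *= hop[0];  // hop one site in x
  }
  return max_dev;
}
```

Calling it with, say, momentum (1, 2, 3) and extents (14, 26, 34) returns the maximal deviation over all sites, which is the number one would compare against a tolerance in a test.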

Performance has improved by a factor of about 19:

-------------------------------------------------------------
Benchmark                      Time           CPU Iterations
-------------------------------------------------------------
BM_create_momenta_old    3204653 ns    3198890 ns        219
BM_create_momenta_new     169172 ns     168807 ns       4173

Do we want to use the new version or is rounding error from the consecutive multiplications a problem as we go to larger and larger lattices?

kostrzewa commented 6 years ago

So we go from 3*10^(-3) seconds to something which likely can't be measured :) I am indeed somewhat worried about the rounding. Is the rounding error that you quote the maximal deviation across the whole lattice?

martin-ueding commented 6 years ago

The whole thing gets called for every momentum combination, so I would think that we spend a few minutes in this part of the code. Certainly not the bottleneck.

I have tried a few different methods, but as soon as you use a complex multiplication you are off by 1e-15 in the real and imaginary parts. Therefore I will just let it be. As a side effect we now have a restructured build and a benchmark framework :smile:.

kostrzewa commented 6 years ago

Okay, I think the benefit of having the benchmark framework is a very welcome side-effect!