Closed martin-ueding closed 6 years ago
I have implemented this now in the extra branch. The tests pass, but on a 4^4 lattice this does not mean so much, as the phase factors are just `1`, `i`, `-1` and `-i`. This should be tested on a larger lattice and performance measured.
I have implemented a unit test using a very odd lattice size (14 × 26 × 34) and comparing the old and new version with each other. The phase factors agree up to `1e-14`. This likely is due to rounding errors in the multiplications with the hopping phase factor.
Performance has improved by a factor of about 19:
| Benchmark | Time | CPU | Iterations |
|---|---|---|---|
| BM_create_momenta_old | 3204653 ns | 3198890 ns | 219 |
| BM_create_momenta_new | 169172 ns | 168807 ns | 4173 |
Do we want to use the new version, or is the rounding error from the consecutive multiplications a problem as we go to larger and larger lattices?
So we go from 3*10^(-3) seconds to something which likely can't be measured :) I am indeed somewhat worried about the rounding. Is the rounding error that you quote the maximal deviation across the whole lattice?
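One way to answer the question about the maximal deviation is to scan every site and keep the largest difference between the two methods. The following is only a rough sketch, not the code from the branch: the names `axis_phases` and `max_phase_deviation` are made up for illustration, and it assumes that the fast version builds one table per axis by repeated multiplication and multiplies the three tables together at each site.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical helper: table of exp(-i p x) for x = 0, ..., extent - 1,
// built from a single std::exp call followed by repeated multiplication.
std::vector<Complex> axis_phases(double p, std::size_t extent) {
  std::vector<Complex> table(extent);
  const Complex step = std::exp(Complex(0.0, -p));
  Complex current(1.0, 0.0);
  for (std::size_t x = 0; x < extent; ++x) {
    table[x] = current;
    current *= step;
  }
  return table;
}

// Hypothetical helper: largest absolute difference over all lattice sites
// between the table-based phase and a direct std::exp call per site.
double max_phase_deviation(const std::array<double, 3> &p,
                           const std::array<std::size_t, 3> &extent) {
  const auto tx = axis_phases(p[0], extent[0]);
  const auto ty = axis_phases(p[1], extent[1]);
  const auto tz = axis_phases(p[2], extent[2]);

  double worst = 0.0;
  for (std::size_t x = 0; x < extent[0]; ++x) {
    for (std::size_t y = 0; y < extent[1]; ++y) {
      for (std::size_t z = 0; z < extent[2]; ++z) {
        const Complex fast = tx[x] * ty[y] * tz[z];
        const double arg = p[0] * static_cast<double>(x) +
                           p[1] * static_cast<double>(y) +
                           p[2] * static_cast<double>(z);
        const Complex reference = std::exp(Complex(0.0, -arg));
        worst = std::max(worst, std::abs(fast - reference));
      }
    }
  }
  return worst;
}
```

Calling `max_phase_deviation` with the extents 14, 26 and 34 and momenta of the form 2π n / L (an assumption about the momentum convention) gives a single number that can be compared against a tolerance in a unit test.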
The whole thing gets called for every momentum combination, so I would think that we spend a few minutes in this part of the code. Certainly not the bottleneck.
I have tried a few different methods, but as soon as you use a complex multiplication you are off by `1e-15` in the real and imaginary parts. Therefore I will just let it be. As a side effect we now have a restructured build and a benchmark framework :smile:.
Okay, I think the benefit of having the benchmark framework is a very welcome side-effect!
Branch: phase-factor
[x] Do some correctness test with a larger lattice.
The momentum phase factor in the VdaggerV calls the `exp` function for each site on the lattice. Similar to the FFT one might be able to rewrite `exp(-i p x) = exp(-i p_x)^x` and then only call the `exp` function once. To go through the lattice, one just has to multiply the phase factor with the cached value.

The `exp` function seems to use around 50 cycles (source).
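The idea could look roughly like the following for a single direction. This is a minimal sketch and not the VdaggerV code: `phases_by_recurrence` and `phases_direct` are hypothetical names, and the momentum `p` is simply taken as a plain `double`.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: phase factors exp(-i p x) for x = 0, ..., extent - 1,
// using one std::exp call and then one complex multiplication per site.
std::vector<std::complex<double>> phases_by_recurrence(double p,
                                                       std::size_t extent) {
  std::vector<std::complex<double>> phases(extent);
  // Cached value exp(-i p), computed once.
  const std::complex<double> step = std::exp(std::complex<double>(0.0, -p));
  std::complex<double> current(1.0, 0.0);  // exp(-i p * 0)
  for (std::size_t x = 0; x < extent; ++x) {
    phases[x] = current;
    current *= step;  // advance from exp(-i p x) to exp(-i p (x + 1))
  }
  return phases;
}

// Reference version for comparison: one std::exp call per site.
std::vector<std::complex<double>> phases_direct(double p, std::size_t extent) {
  std::vector<std::complex<double>> phases(extent);
  for (std::size_t x = 0; x < extent; ++x) {
    phases[x] =
        std::exp(std::complex<double>(0.0, -p * static_cast<double>(x)));
  }
  return phases;
}
```

The recurrence trades the per-site `exp` call for a complex multiplication, so each step can add a little rounding error on top of the previous one. Comparing the two functions element by element shows how large that gets for a given extent.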