Open Arcadia197 opened 2 years ago
With new tests I was able to link the Hermite interpolation directly to the runtime increase. When only the map update epsilon is increased, the footpoints are computed further away and may start to use other Hermite points for interpolation. My computations confirm this: I saw a quite severe increase in runtime. So the speed of the Hermite interpolation depends on both the map update epsilon and the grid size of the velocity.
I have thought about this for a while now. If the CFL number is smaller than 1, we know that in each time step we only need ((16+2) * N_p / N_c)^2 points for the advection of one block, since the new solution does not travel any further. This could be sped up considerably by loading all values of a block beforehand and reusing them, which should give a MASSIVE speed boost.
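To get a feeling for the numbers, here is a small sketch that just evaluates the formula above. The function name and defaults are made up for illustration; `block=16` and `halo=2` mirror the `16+2` (my reading: 16 points per block side plus one extra point on either side), and `n_p` and `n_c` are simply the two grid sizes from the formula.

```python
import math


def points_per_block(n_p, n_c, block=16, halo=2):
    """Evaluate ((block + halo) * N_p / N_c)^2 for one block.

    Hypothetical helper: block/halo mirror the 16+2 in the formula,
    n_p and n_c are the two grid sizes that appear in it.
    """
    side = (block + halo) * n_p / n_c
    # Round up so that partially covered cells are still loaded completely.
    return math.ceil(side) ** 2


if __name__ == "__main__":
    for n_p, n_c in [(512, 512), (1024, 512)]:
        needed = points_per_block(n_p, n_c)
        total = n_p * n_p
        print(f"N_p={n_p}, N_c={n_c}: {needed} of {total} points per block")
```

Even for the finer grid this is a tiny fraction of the full array, which is what would make preloading a block's values attractive.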
However:
My thoughts on this so far:
For now, the strided access in the Hermite interpolation is one of the main bottlenecks for code performance. Some clever ideas have to be tested to improve this.
Possible ways could be: Take another look at the organization of the files and reorganize psi so that all data of one point are next to each other. This was already tested once and led to no improvement; however, the reads were not made uniform, so it may be worth testing again with the aim of ending up with only 2 consecutive reads.
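A minimal sketch of what such a reorganization means for the indexing, assuming psi currently stores each Hermite component in its own contiguous chunk and that there are four values per point (value, x-derivative, y-derivative, mixed derivative). All names here are hypothetical and not taken from the code base.

```python
VALUES_PER_POINT = 4  # assumed: f, f_x, f_y, f_xy


def chunked_index(point, component, n_points):
    """Index when each component lives in its own chunk (strided per point)."""
    return component * n_points + point


def interleaved_index(point, component):
    """Index when all data of one point are stored next to each other."""
    return point * VALUES_PER_POINT + component


def interleave(psi_chunked, n_points):
    """Copy a chunked array into the interleaved layout."""
    psi_interleaved = [0.0] * (VALUES_PER_POINT * n_points)
    for p in range(n_points):
        for c in range(VALUES_PER_POINT):
            psi_interleaved[interleaved_index(p, c)] = psi_chunked[
                chunked_index(p, c, n_points)
            ]
    return psi_interleaved


if __name__ == "__main__":
    n = 3
    # Reads needed for point 1 in both layouts.
    print([chunked_index(1, c, n) for c in range(VALUES_PER_POINT)])
    print([interleaved_index(1, c) for c in range(VALUES_PER_POINT)])
```

In the chunked layout the four reads for one point are `n_points` apart, while in the interleaved layout they are adjacent, which is the precondition for combining them into a few wide reads.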
However, the warping structure of the diffeomorphism makes it hard to predict where to read the data at the boundaries. Separating boundary and non-boundary cases, however, would lead to divergent code.
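One possible way around the case split, assuming that the warping here means a periodic wrap-around at the domain boundaries, is to wrap every index with a modulo so that interior and boundary points take exactly the same path. The sketch below only illustrates that idea for the four corners of one cell; the function names and the row-major flat indexing are assumptions.

```python
def wrap(i, n):
    """Map any integer index into [0, n).

    Python's % is already non-negative for positive n; in C or CUDA
    one would need ((i % n) + n) % n instead.
    """
    return i % n


def cell_corner_indices(ix, iy, nx, ny):
    """Flat row-major indices of the four corners of cell (ix, iy).

    The same code runs for interior and boundary cells, so there is
    no separate boundary branch.
    """
    x0, x1 = wrap(ix, nx), wrap(ix + 1, nx)
    y0, y1 = wrap(iy, ny), wrap(iy + 1, ny)
    return [y0 * nx + x0, y0 * nx + x1, y1 * nx + x0, y1 * nx + x1]


if __name__ == "__main__":
    print(cell_corner_indices(1, 1, 4, 4))  # interior cell
    print(cell_corner_indices(3, 3, 4, 4))  # cell touching the upper boundary
```

This removes the divergence, but the wrapped reads at the boundary are still far apart in memory, so it does not by itself make the access pattern predictable.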