Closed MariusSchiffer closed 3 years ago
I just noticed one major problem with this implementation: the cached pairs potentially overlap, i.e. there will be multiple pairs with the same `ispin` and/or `jspin`.
This means that when data is assigned to the energy or energy gradient of a spin in a loop parallelized over pairs, several threads may write to the same entry. A race condition appears and the data may become incorrect. I am quite surprised that the unit tests did not catch this...
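To make the overlap concrete, here is a minimal sketch of a check that reports whether any spin index occurs in more than one cached pair. The `Pair` struct and the function name `has_overlapping_spins` are hypothetical and only mirror the `ispin`/`jspin` naming used above; they are not taken from the actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pair record; only the index names follow the discussion above.
struct Pair
{
    int ispin;
    int jspin;
};

// Returns true if some spin index is referenced more than once across the
// pair list, i.e. a loop parallelized over pairs could write to the same
// per-spin entry from two different iterations.
bool has_overlapping_spins( const std::vector<Pair> & pairs, std::size_t n_spins )
{
    std::vector<int> count( n_spins, 0 );
    for( const auto & p : pairs )
    {
        ++count[p.ispin];
        ++count[p.jspin];
    }
    for( int c : count )
    {
        if( c > 1 )
            return true;
    }
    return false;
}
```

For any lattice with more than one neighbour per spin this check would return true, which is why a plain parallel loop over pairs is unsafe without atomics or a different data layout.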
The optimal fix for this would be to have one list per spin, so that one could parallelize over all spins, where each can access its own list of neighbours.
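A rough sketch of the per-spin layout could look like the following. All names (`ExchangePair`, `Neighbour`, `build_neighbour_lists`, `exchange_energy_per_spin`) are hypothetical, and the energy convention (a factor of 1/2 per spin with a negative sign for the Heisenberg exchange) is an assumption made for illustration, not the convention of the actual code.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vector3 = std::array<double, 3>;

// Hypothetical cached pair with its exchange constant.
struct ExchangePair
{
    int ispin;
    int jspin;
    double Jij;
};

// Hypothetical per-spin cache entry: one neighbour and its coupling.
struct Neighbour
{
    int jspin;
    double Jij;
};

// Convert the pair list into one neighbour list per spin.
// Each pair is stored twice, once for each of its two spins.
std::vector<std::vector<Neighbour>>
build_neighbour_lists( const std::vector<ExchangePair> & pairs, std::size_t n_spins )
{
    std::vector<std::vector<Neighbour>> neighbours( n_spins );
    for( const auto & p : pairs )
    {
        neighbours[p.ispin].push_back( { p.jspin, p.Jij } );
        neighbours[p.jspin].push_back( { p.ispin, p.Jij } );
    }
    return neighbours;
}

// Parallelize over spins: iteration i writes only energy[i], so no two
// threads write to the same entry. Without OpenMP the pragma is ignored.
void exchange_energy_per_spin(
    const std::vector<Vector3> & spins, const std::vector<std::vector<Neighbour>> & neighbours,
    std::vector<double> & energy )
{
    const int n = static_cast<int>( spins.size() );
#pragma omp parallel for
    for( int i = 0; i < n; ++i )
    {
        const Vector3 & si = spins[i];
        double e           = 0.0;
        for( const auto & nb : neighbours[i] )
        {
            const Vector3 & sj = spins[nb.jspin];
            // Assumed convention: E_i = -1/2 * sum_j J_ij * (s_i . s_j)
            e -= 0.5 * nb.Jij * ( si[0] * sj[0] + si[1] * sj[1] + si[2] * sj[2] );
        }
        energy[i] = e;
    }
}
```

Note that this layout stores every pair twice, so it trades additional memory for race-free writes.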
Another problem is the potential memory consumption, which would require there to be an off-switch for the case of large system sizes (> 100^3).
I have to give this some thought, as I am not sure how to combine these different aspects properly and without duplication of code.
As the bugs were never addressed, I will close this PR now.
Description
This PR creates a cache for DMI/Exchange indices, resulting in a speedup of roughly 3x. Tests and sanity checks imply that it is correct.
Still missing: rebuilding the caches when the pairfields are edited, a CUDA implementation, and further verification of correctness, especially for non-OpenMP builds.