carstenbauer / MonteCarlo.jl

Classical and quantum Monte Carlo simulations in Julia
https://carstenbauer.github.io/MonteCarlo.jl/dev/
Other
185 stars 18 forks source link

Crosscheck with Paper, complex matrix optimizations #137

Closed ffreyer closed 2 years ago

ffreyer commented 2 years ago

This pr will add a crosscheck/example comparing to https://arxiv.org/pdf/1912.08848.pdf.

TODO

General Changes:

Complex Linear Algebra

The model from the reference paper uses complex hoppings. To be able to run it efficiently this pr implements all the neglected complex linear algebra. It also cleans up BlockDiagonal so that it works with ComplexF64 and can be combined with complex StructArrays. Closes #78

Lattice Iterator revamp/changes

The reference paper calculates the current current correlations with unsynchronized directions. However it's not just every bond up to the K-th farthest. It's every bond that has a hopping associated to it without reversed bonds (these are explicitly included in the formula pre Wicks) and without bonds with dir[bond][1] == 0. This gets rid of quite a few bonds. With out current iterators and NN and NNN bonds, for example, we'd include 9 directions (on-site, 4x NN, 4x NNN) but only really need 3 (+x NN, 2 +x NNN). And because it'S usnyhcronized the number of directions goes in squared, so we do 9²/3² = 9 times the work.

To fix this I've revamped the lattice iterator system here. The main goal was to allow passing directional indices directly, e.g. EachLocalQuadByDistance([2, 5, 9]). I also wanted to stop relying on only one of each lattice iterator type being created at the start of the simulation to memory usage down. So I made the following changes:

There is now a LatticeIteratorCache attached to DQMC. Perhaps later I will put this together with the lattice. The cache contains a dict that gets filled with maps such as dir_idx -> (src, trg) which are accessible via cache[Dir2SrcTrg()] and can be created via push!(cache, Dir2SrcTrg(), lattice). Every lattice iterator will create the maps it needs (no duplicated) and holds references to it for iteration. So now the difference between 1 and 1000 EachLocalQuadByDistance iterators is just 3000 integers and 3000 references.

The second change I made was that every iterator now has a template version and a runtime version. The template is light, used as a field in DQMCMeasurements and can be saved without extra care. The runtime version (same type name with _ in front) follows from the template and will construct the maps it needs during creation.

ffreyer commented 2 years ago

https://carstenbauer.github.io/MonteCarlo.jl/dev/examples/triangular_Hubbard/ clearly does not \Delta^\dagger \Delta + \Delta |delta^\dagger for its pairing correlation. It fits way worse than \Delta \Delta^\dagger. Needs to be adjusted if I go through with switching the default from pc_kernel to pc_combined_kernel. The reference for this PR clearly does use the combined kernel though...

ffreyer commented 2 years ago

I benchmarked the new model with StructArrays - thermalization is twice as fast (roughly 0.5s -> 0.25s in my test), but measurements are just as slow (2s -> 2s, occupation and pairing susceptibility).

I see two potential reasons for this:

  1. The real and imaginary parts of the Greens function are not indexed in separate loops, which is not cache friendly.
  2. Lattice iterators themselves are probably not cache friendly. They're essentially semi-random lists of indices. But given that they are static per simulation there is probably some optimization potential here as well...
ffreyer commented 2 years ago

Making use of StructArrays in measurements is tough atm. There is no left vs right multiplier like in mul!, matrix elements could technically be multiplied by i or conjugates could be taken. So instead I'm just converting complex StructArrays back to plain complex matrices. This isn't optimal, but still a lot better... (~2s -> ~1.33s)