Alphaharrius / Zipper.jl

Implementation of Zipper Entanglement Renormalization on Julia platform.
GNU General Public License v3.0

Slow computation due to large `unitcellfock` in the `CrystalFock` when constructing `fourier` #20

Closed (samwongapple closed 8 months ago)

samwongapple commented 8 months ago

Situation

When doing GMERA, enlarging the local region increases the size of `unitcellfock`. Constructing the Fourier transform then takes unexpectedly long.

Cause

I suspect it is due to the flattening in `SparseFock`.

Alphaharrius commented 8 months ago

The problem comes from the line `Iterators.product(crystalhomefock|>enumerate, regionfock|>enumerate)`. `enumerate` has to iterate through the underlying iterator, and `iterate` for `SparseFock` iterates over the `Subset{Mode}` generated by `orderedmodes(::SparseFock)`, which requires flattening the `Subset{Subset{Mode}}` representation of the `SparseFock`. This flattening proved to be very slow for a large `FockSpace`, and the cost adds up inside `product`.

Proof

I isolated the flattening by calling `orderedmodes` before passing the result to `product` as an argument. The process now runs much faster, to the point that the preparation timer doesn't even show up.
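A minimal sketch of why hoisting the flattening out of `product` helps. `NestedModes`, `flatten` and `FLATTEN_CALLS` are hypothetical stand-ins made up for illustration; they are not the Zipper.jl `SparseFock` or `orderedmodes`, and the real `iterate` implementation may differ. The only assumption is that iteration goes through a flattening step each time it is invoked.

```julia
# Hypothetical stand-in for a nested container; NOT the Zipper.jl SparseFock type.
struct NestedModes
    parts::Vector{Vector{Int}}
end

# Counter used only to observe how often flattening happens.
const FLATTEN_CALLS = Ref(0)

# Flattening rebuilds the whole flat collection on every call.
function flatten(n::NestedModes)
    FLATTEN_CALLS[] += 1
    return reduce(vcat, n.parts)
end

# Assumed iteration scheme: every iteration step goes through a fresh flatten.
Base.iterate(n::NestedModes, state...) = iterate(flatten(n), state...)
Base.length(n::NestedModes) = sum(length, n.parts)
Base.eltype(::Type{NestedModes}) = Int

nested = NestedModes([[1, 2], [3, 4], [5, 6]])
other = 1:4

# Naive: the nested container is iterated directly inside the product.
FLATTEN_CALLS[] = 0
naive = collect(Iterators.product(enumerate(nested), enumerate(other)))
naive_calls = FLATTEN_CALLS[]

# Hoisted: flatten once, then hand the flat vector to the product.
FLATTEN_CALLS[] = 0
flat = flatten(nested)
hoisted = collect(Iterators.product(enumerate(flat), enumerate(other)))
hoisted_calls = FLATTEN_CALLS[]

println("same result: ", naive == hoisted)
println("flatten calls, naive vs hoisted: ", naive_calls, " vs ", hoisted_calls)
```

Both versions produce the same pairs, but the hoisted one flattens exactly once, while the naive one flattens again every time `product` steps through or restarts the nested iterator.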

Proposed fix

The root cause is the structure of `SparseFock`, which uses `Subset{Subset{Mode}}` and requires flattening every time we need to access its `Mode` collection. It would be better to replace `SparseFock` with a new type that uses only `Subset{Mode}`.

Alphaharrius commented 8 months ago

In addition, it would be better to change the underlying implementation of `Subset`: `OrderedSet` is not very performant for slicing, union and intersect, and it is not very memory efficient since it is intrinsically an `OrderedDict`. We can simplify it by introducing a much simpler structure consisting only of a `Tuple` and a `Dict`.
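A rough sketch of what a `Tuple` plus `Dict` ordered set could look like. `TupleSet` and `positionof` are hypothetical names for illustration, not the actual `Subset` in Zipper.jl: the tuple keeps insertion order and gives cheap slicing, while the dict maps each element to its position for membership tests and lookups.

```julia
# Hypothetical ordered set backed by a Tuple and a Dict; NOT the Zipper.jl Subset.
struct TupleSet{T}
    elements::Tuple{Vararg{T}}   # elements in insertion order
    index::Dict{T, Int}          # element => position in `elements`

    function TupleSet{T}(items) where {T}
        index = Dict{T, Int}()
        kept = T[]
        for item in items
            # Keep only the first occurrence of each element.
            if !haskey(index, item)
                push!(kept, item)
                index[item] = length(kept)
            end
        end
        return new{T}(Tuple(kept), index)
    end
end

TupleSet(items) = TupleSet{eltype(items)}(items)

Base.length(s::TupleSet) = length(s.elements)
Base.eltype(::Type{TupleSet{T}}) where {T} = T
Base.iterate(s::TupleSet, state...) = iterate(s.elements, state...)
Base.in(item, s::TupleSet) = haskey(s.index, item)
Base.getindex(s::TupleSet, i::Integer) = s.elements[i]

# Slicing is a plain tuple slice, then the index is rebuilt for the new set.
Base.getindex(s::TupleSet{T}, r::AbstractUnitRange) where {T} = TupleSet{T}(s.elements[r])

# Union keeps the order of `a`, followed by the new elements of `b`.
Base.union(a::TupleSet{T}, b::TupleSet{T}) where {T} =
    TupleSet{T}((a.elements..., b.elements...))

# Intersection keeps the order of `a` and uses the Dict of `b` for membership.
Base.intersect(a::TupleSet{T}, b::TupleSet{T}) where {T} =
    TupleSet{T}(filter(in(b), a.elements))

# Hypothetical helper: position of an element within the ordered set.
positionof(s::TupleSet, item) = s.index[item]

a = TupleSet([:x, :y, :z, :x])
b = TupleSet([:z, :w, :y])
u = union(a, b)
i = intersect(a, b)
println(collect(u), " ", collect(i), " ", collect(a[2:3]))
```

One caveat with this sketch: very long tuples can be costly for the Julia compiler, so whether a `Tuple` or a `Vector` is the better ordered store likely depends on the typical `Subset` size.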

Alphaharrius commented 8 months ago

This fix includes a more computationally efficient `Subset` and a new `FockSpace` implementation named `NormalFock`, which uses the new `Subset` as its representation. Testing shows that the slowness described by @samwongapple is solved: the step went from the original 45s to <1s, since the counting timer does not even appear. The fix also improved the performance of the overall codebase: the test case `cherninsulator.jl` with a system size of 96x96 now executes the 1st renormalization in 22s, compared to the previous ~60s, on a MacBook Pro M3 Max (12-core) test machine.

Alphaharrius commented 8 months ago

This issue is fixed; waiting for @samwongapple to verify the pull request.