flatironinstitute / cppdlr

Discrete Lehmann representation of imaginary time Green's functions
https://flatironinstitute.github.io/cppdlr/
Apache License 2.0

DLR creation memory usage is linear in $\beta$ (should be $\log\beta$) #3

Open HugoStrand opened 1 year ago

HugoStrand commented 1 year ago

Description

This is a reposting of the triqs issue TRIQS/triqs#917, showing that the memory usage when building the DLR basis is linear in the inverse temperature $\beta$, preventing the use of the DLR at low temperatures. This is probably caused by the current approach to selecting the DLR Matsubara frequencies, since a dense Matsubara grid is used with an upper cutoff proportional to $\beta$.

See the memory scaling test here: TRIQS/triqs#917

Expected behavior: The peak memory usage should be $\log \beta$ for the DLR basis to be applicable in the low temperature regime.

Actual behavior: Peak memory usage is linear in $\beta$
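To illustrate the scaling argument, here is a minimal sketch (hypothetical function and values; it only assumes, as described above, that the Matsubara cutoff is proportional to $\beta$) of why a dense fine grid makes peak memory linear in $\beta$:

```python
import numpy as np

# Assumption for illustration: the fine Matsubara grid is dense up to a
# cutoff nmax proportional to beta, so its size (and hence the memory
# footprint of anything built on it) grows linearly in beta.
omega_max = 1.0  # assumed real-frequency cutoff

def dense_grid_size(beta):
    nmax = int(np.ceil(beta * omega_max))  # cutoff proportional to beta
    return 2 * nmax + 1                    # indices -nmax .. nmax

sizes = [dense_grid_size(b) for b in (100, 200, 400)]
# Doubling beta doubles the grid size: linear, not logarithmic, scaling.
```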

jasonkaye commented 1 year ago

Please see my response in the TRIQS issue report. I'm going to label this as a feature request here, not a bug report.

jasonkaye commented 1 year ago

Since this is really a cppdlr issue, let's discuss here. I am quoting your response in the corresponding TRIQS issue below:

Dear Jason,

I agree that there are many ways to handle this.

I think that since we have an upper bound on how many imaginary-frequency points the DLR construction will select (given by the epsilon-rank $r$ of the real-frequency rank-revealing QR), it is possible to construct a dyadic grid that terminates at the order $\Lambda$, which is the current imaginary-frequency cutoff. I have experimented with this in pydlr and it is working out well.

The idea is to lay out panels on the positive Matsubara frequency indices: the first panel contains the first r Matsubara indices, the second panel takes every second index (still keeping r points), the third panel takes every fourth index (again r points), etc. The number of panels is controlled by the upper cutoff Nmax=ceil(Λ), just as we are currently doing it. (Finally, the index panels are mirrored to the negative-frequency indices.)

Here is a code example that constructs the range of indices, with the spacing doubled in each successive panel, when the flag dense_imfreq=False:

        if dense_imfreq:
            n = np.arange(-nmax, nmax+1)
        else:
            # -- Sparse starting grid in imaginary frequency
            r = self.rank
            n_panels = int(np.ceil(np.log2(nmax/r))) + 1 if nmax > r else 2
            n = np.zeros(r * n_panels, dtype=int)

            idx = 1
            for p in range(n_panels):
                d_idx = r * 2**p
                # Panel p: r indices with spacing 2**p
                nn = np.arange(idx, idx + d_idx, 2**p)
                n[r*p:r*(p+1)] = nn
                idx += d_idx

            # Mirror to the negative-frequency indices (n -> 1 - n)
            n = np.concatenate((1-n[::-1], n))

and here is the commit that implements this in pydlr jasonkaye/libdlr@ff40d13
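As a sanity check, the panel construction above can be rewritten as a standalone function (a sketch only; the function name and the values r=20, nmax=10**6 are illustrative) to verify that the grid size scales like r·log2(nmax/r) rather than linearly in nmax:

```python
import numpy as np

def sparse_imfreq_grid(r, nmax):
    """Dyadic sparse Matsubara index grid: r indices per panel, with the
    spacing doubled in each successive panel, mirrored to the negative
    frequencies via n -> 1 - n."""
    n_panels = int(np.ceil(np.log2(nmax / r))) + 1 if nmax > r else 2
    n = np.zeros(r * n_panels, dtype=int)
    idx = 1
    for p in range(n_panels):
        d_idx = r * 2**p
        n[r*p:r*(p+1)] = np.arange(idx, idx + d_idx, 2**p)
        idx += d_idx
    return np.concatenate((1 - n[::-1], n))

# With r = 20 and nmax = 10**6, a dense grid would hold 2*10**6 + 1
# indices; the sparse grid holds only 2 * 20 * 17 = 680.
grid = sparse_imfreq_grid(r=20, nmax=10**6)
```

Since each panel keeps a fixed number r of points while its coverage doubles, the total point count is 2·r·n_panels, i.e. logarithmic in nmax and hence in $\beta$.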

Can I go on and implement this in cppdlr?

Best, Hugo

This is basically my suggestion (3) in my previous reply, and if it appears to be working, I am all in favor of implementing it, with some caveats. Given that this is a heuristic, and to avoid a proliferation of flags, for the time being I would modify the current code so that the user can pass in either an nmax (yielding the same behavior as now) or their own fine grid. Concretely, this would require an alternative constructor for imfreq_ops, an alternative version of build_k_if, and a slight modification of the imfreq_ops constructor (extracting imaginary frequencies from pivots). This way, we maintain the current API, but also give the user the option to pass in a fine imaginary frequency grid of their choosing. Furthermore, you can provide a function which generates your particular fine grid. Then we can decide later whether we want to make this more of the "default" in one way or another.
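In Python terms (cppdlr itself is C++, so all names here are hypothetical), the proposed dual entry point could look like:

```python
import numpy as np

def make_dense_fine_grid(nmax):
    """Current behavior: all Matsubara indices -nmax .. nmax."""
    return np.arange(-nmax, nmax + 1)

class ImfreqOps:
    """Sketch of an imfreq_ops-like class accepting either an integer
    cutoff nmax (current API) or a user-supplied fine index grid."""
    def __init__(self, lamb, fine_grid_or_nmax):
        self.lamb = lamb  # DLR cutoff Lambda (unused in this sketch)
        if np.isscalar(fine_grid_or_nmax):
            self.fine_grid = make_dense_fine_grid(int(fine_grid_or_nmax))
        else:
            self.fine_grid = np.asarray(fine_grid_or_nmax)
        # ... the pivoted-QR frequency selection would then operate
        # on self.fine_grid regardless of how it was supplied.

ops_dense = ImfreqOps(lamb=100.0, fine_grid_or_nmax=100)
ops_sparse = ImfreqOps(lamb=100.0, fine_grid_or_nmax=np.array([-3, -1, 0, 1, 3]))
```

The point of this shape is that the downstream selection code is unchanged; only the construction of the starting grid differs between the two entry points.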

Also, it would be great if you could include a reasonably convincing test that this works as well as the full fine imaginary frequency grid (e.g., giving just as small an interpolation error for a few examples).

If that sounds reasonable to you and you want to go ahead and implement it, perhaps you can do it as a pull request and then we can look it over.

Thanks,

Jason

mwallerb commented 9 months ago

Hey guys, just stumbled over this issue, hope it is okay to comment "across the aisle":

I agree with @jasonkaye and @HugoStrand, I think strategy (3) should work nicely. For the IR basis, we found that you have considerable leeway in how you exactly choose the sampling frequencies without affecting the conditioning of the problem too much, as long as you make sure that the lowest couple of Matsubara frequencies are in your set. This is why we are fine with choosing the discrete sign changes of the highest-order basis function ...

In general, we find that the conditioning of the fitting problem in tau-space is $0.5 \sqrt\Lambda$, while in Matsubara you get $2 \sqrt\Lambda$ if you do it optimally, and you should still be within $4 \sqrt\Lambda$ or so if you don't exactly nail your frequency points.

As for strategies (1) and (2), what one finds empirically for many bases, for example the Fourier-transformed Chebyshev polynomials, is that for the lowest sampling frequencies, the "true" sign changes on the imaginary axis are extremely close to the Matsubara frequencies. So that is something you could think about. This might also explain why choosing Matsubara frequencies works reasonably well.

HugoStrand commented 9 months ago

Hey @mwallerb!

Thank you for the input! I'll have a closer look at the prefactors on the condition numbers of the transforms.

Regarding the sign changes in the Matsubara frequency representation of the Chebyshev polynomials, we saw the same trend when extending the Matsubara sparse sampling to Legendre polynomials, see Appendix B in https://doi.org/10.1103/PhysRevB.106.125153.

Cheers, Hugo

mwallerb commented 9 months ago

Thanks a lot, @HugoStrand - I missed this appendix in your paper!