Open ye-luo opened 2 years ago
Propose a weighted autocorrelation algorithm and we're in business.
Why is auto-correlation a prerequisite? I should be able to access the weighted average and standard deviation of the statistics and use the auto-correlation option at will.
I would like to move to unbiased weighted statistics in a consistent way throughout qmca.
Isn't it even worse now, given that everything coming out of qmca is unweighted (biased)?
Not really. The practical use cases (VMC, production DMC) have no or very little weight fluctuation and very little bias (like 1/100 of the statistical errorbar) when analyzed in practice.
There is a large bias in the low-walker, high-weight-fluctuation limit you are looking at, but this is not the practical case.
Therefore, I think we can afford to wait until we have a consistent solution for handling weights across all statistics we compute.
I agree this needs to be done, but there is some work we need to do first to find or derive unbiased equations for all quantities with weighted time series.
Another point: technically, we will have reweighted VMC, so it is not fully immune.
We don't have a working one now, but a DMC algorithm with fixed population/stochastic reconfiguration will be more likely to need the weighting. We'll also have to get used to plotting weight fluctuations rather than population fluctuations.
Yes, we definitely need to account for weights. We just don't have the formulae to account for them yet, therefore this is the first task.
FYI, scalar.data only contains BlockWeight, without population. In the fluctuating population scheme, the population is reflected as weight. qmca just ignores that column.
Formulae for weighted mean and equal time covariance:
https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covariance
See "Weighted sample covariance" and "Frequency weights".
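For concreteness, here is a minimal sketch of those two formulas in plain Python, assuming the weights can be treated as frequency weights (so the normalization is `sum(w) - 1`). The function names `weighted_mean` and `weighted_cov` are hypothetical and are not part of qmca.

```python
def weighted_mean(x, w):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def weighted_cov(x, y, w):
    """Equal-time weighted sample covariance of x and y.

    Assumes frequency weights, i.e. the unbiased normalization is
    (sum(w) - 1), the weighted analogue of Bessel's correction.
    """
    mx = weighted_mean(x, w)
    my = weighted_mean(y, w)
    s = sum(wi * (xi - mx) * (yi - my) for xi, yi, wi in zip(x, y, w))
    return s / (sum(w) - 1.0)


# With integer weights this matches repeating each sample w_i times:
# x = [1, 2, 3] with w = [2, 1, 1] behaves like the sample [1, 1, 2, 3].
x = [1.0, 2.0, 3.0]
w = [2, 1, 1]
mean_x = weighted_mean(x, w)
var_x = weighted_cov(x, x, w)
```

With all weights equal to 1 this reduces to the usual unweighted mean and sample variance, which is a useful sanity check before applying it to real data.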
What is missing is autocorrelation time estimation with weighted time series. Probably a biased formula for this can be derived in the discrete (weighted delta function) limit from formulae for autocorrelation time from continuous distributions.
Deriving the analogous Bessel's correction (https://en.wikipedia.org/wiki/Bessel%27s_correction) for the autocorrelation time bias will take more thought.
I tried to understand how the auto-correlation is handled. In our manual, I only found how to use qmca; I could not find how numbers are handled inside qmca: how the auto-correlation affects the reported error bar, and how the equilibration is determined. The manual may not be the best place to record such information (the qmca source or help text may be better places), but we do need to document the options that qmca sets to auto by default.
See "Computing autocorrelation times" here: https://dfm.io/posts/autocorr/
This is basically what we do in qmca and elsewhere.
See also Slide 8 in https://github.com/QMCPACK/qmc_workshop_2021/blob/master/week3_stats_and_nexus/week3_stats_nexus_vfinal.pdf
The truncation (expressed in the variable M) is done when the sample autocorrelation function drops below zero.
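A rough sketch of that estimator for an unweighted series, to make the truncation rule concrete. This is not the qmca source: the function names are hypothetical, and the 1/N normalization of the sample autocovariance and the error bar formula sqrt(var * tau / N) are assumptions taken from the standard treatment in the references above.

```python
import math


def autocorrelation_time(x):
    """Integrated autocorrelation time, tau = 1 + 2 * sum_{t=1}^{M} rho(t).

    The sum is truncated at the first lag t where the sample
    autocorrelation function rho(t) drops below zero.
    """
    n = len(x)
    mean = sum(x) / n
    dx = [xi - mean for xi in x]
    c0 = sum(d * d for d in dx) / n  # lag-0 autocovariance
    if c0 == 0.0:
        return 1.0  # constant series: nothing to correlate
    tau = 1.0
    for t in range(1, n):
        # Sample autocovariance at lag t (assumed 1/N normalization).
        ct = sum(dx[i] * dx[i + t] for i in range(n - t)) / n
        rho = ct / c0
        if rho < 0.0:
            break  # truncation point M reached
        tau += 2.0 * rho
    return tau


def error_bar(x):
    """Error of the mean, inflated by the autocorrelation time."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    return math.sqrt(var * autocorrelation_time(x) / n)
```

An anticorrelated series such as `[1.0, -1.0] * 10` truncates immediately and gives tau = 1, while a slowly varying series gives tau > 1 and a correspondingly larger error bar.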
qmca only does
(-2.1497437216-2.1533011570)/2 = -2.1515224393
but to me a weighted average is more appropriate. I expect
(-2.1497437216 * 32 -2.1533011570 * 12)/(32+12) = -2.1507139312
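The two expressions above can be checked with a few lines of Python. The values and the weights 32 and 12 are copied from the expressions; the variable names are only illustrative.

```python
# Values and weights copied from the two expressions above.
values = [-2.1497437216, -2.1533011570]
weights = [32, 12]

# What qmca currently does: plain arithmetic mean, weights ignored.
simple_average = sum(values) / len(values)

# What is proposed instead: each value weighted by its weight.
weighted_average = sum(v * w for v, w in zip(values, weights)) / sum(weights)

# The two estimates differ in the fourth decimal place.
difference = weighted_average - simple_average
```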