If we are spending compute to run multiple sampling chains anyway, perhaps we can run each chain at a different inverse temperature $\beta$ and linearly regress $\mathbb{E}_w^\beta[n L_n(w)]$ against $1 / \beta$ to obtain a better learning coefficient estimate.
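A minimal sketch of the idea on synthetic data, assuming the WBIC-style expansion $\mathbb{E}_w^\beta[n L_n(w)] \approx n L_n(w^*) + \lambda / \beta$ holds for each chain (the values of $\lambda$, $n L_n(w^*)$, the temperature grid, and the noise scale below are all made up for illustration):

```python
import numpy as np

# Hypothetical setup: each chain, run at inverse temperature beta_i, yields an
# estimate of E_w^beta[n L_n(w)].  Under the assumed asymptotic
#   E_w^beta[n L_n(w)] ~ n L_n(w*) + lambda / beta,
# regressing the per-chain estimates against 1/beta recovers lambda as the slope.

rng = np.random.default_rng(0)

true_lambda = 2.5     # learning coefficient we hope to recover (assumed)
nLn_star = 100.0      # n L_n(w*), the regression intercept (assumed)

betas = np.array([0.5, 1.0, 1.5, 2.0, 3.0])      # one chain per temperature
noise = rng.normal(scale=0.05, size=betas.size)  # per-chain sampling noise
y = nLn_star + true_lambda / betas + noise       # simulated chain averages

# Regress y on 1/beta: slope -> lambda, intercept -> n L_n(w*)
X = 1.0 / betas
slope, intercept = np.polyfit(X, y, 1)

# R^2 doubles as a diagnostic of how well the chains agree with the linear model
resid = y - (intercept + slope * X)
r2 = 1.0 - resid.var() / y.var()
print(slope, intercept, r2)
```

In a real experiment the `y` values would come from the chain averages of $n L_n(w)$ rather than this simulated model.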
Disadvantage:
- We lose the ability to use across-chain variance to check the stability of sampling.
- We run the risk of each chain exploring a different region / direction, even if all chains start from a common center $w^*$, which would invalidate the regression. This needs to be tested in toy potentials and in more realistic settings.
Advantage:
- The $R^2$ value of the linear regression is itself a diagnostic tool.
- We don't need to rely on $w^*$ itself being a local minimum of the averaged potential $L(w)$, only on it being close enough to one.
- Better cancellation of lower-order terms.
There is another way of doing this without actually sampling at multiple temperatures: use a reweighting formula to convert samples drawn at one temperature into samples at another. There are some numerical issues to be worked out (a log-sum-exp or similar trick), but this should get rid of the issue of different chains exploring different directions.
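One way this reweighting could look, as a sketch under the assumption that the tempered densities differ only by the Boltzmann factor $e^{-\beta \, n L_n(w)}$, so samples at $\beta_1$ can be reweighted to $\beta_2$ by self-normalized importance sampling (the function name and toy potential are illustrative, not from the source):

```python
import numpy as np

def reweighted_mean(nLn_samples, beta1, beta2):
    """Estimate E_w^{beta2}[n L_n(w)] from samples drawn at beta1.

    Uses self-normalized importance weights proportional to
    exp(-(beta2 - beta1) * n L_n(w)); subtracting the max log-weight
    is the log-sum-exp-style stabilization mentioned in the text.
    """
    log_w = -(beta2 - beta1) * nLn_samples
    log_w -= log_w.max()                  # numerical stabilization
    w = np.exp(log_w)
    return np.sum(w * nLn_samples) / np.sum(w)

# Toy check: for a quadratic potential f(w) = w^2 / 2, the tempered density at
# inverse temperature beta is N(0, 1/beta), so E^beta[f] = 1 / (2 * beta).
rng = np.random.default_rng(1)
beta1, beta2 = 1.0, 2.0
samples = rng.normal(scale=1.0 / np.sqrt(beta1), size=200_000)
print(reweighted_mean(samples**2 / 2, beta1, beta2))  # close to 0.25
```

Note that reweighting from a low $\beta_1$ (broad distribution) to a higher $\beta_2$ (narrower) is the well-conditioned direction; going the other way can leave most of the weight on a few samples.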
More context: