Closed b-remy closed 2 years ago
Thanks Benjamin! If I look at the current version of the paper, can I see the changes in there?
Yes, the overleaf is updated with the changes described above.
Thank you @b-remy! Modulo the extra comments I left on Slack, it looks good to me. Let me know when you are ready for me to do a final review!
Thanks @EiffL , I've made the changes according to your last comments.
Hi all, I'm still not sure about the new sentence:
even though the 2pt statistics beyond $10^4$ contains statistical information on the field, the signal-to-noise ratio is very low and therefore even a perfect method cannot be able to reconstruct localized features.
This feels a bit like weasel words. I think the fact that the mean of the samples for the Gaussian case does not match the theoretical expectation or the MAP estimate means there is a real problem here. A perfect method would indeed be able to have the correct statistics for the ensemble of Monte Carlo samples.
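To illustrate the point above with a toy sketch (not the paper's pipeline): for a Gaussian prior and Gaussian noise, the posterior mean is exactly the Wiener filter, so the ensemble mean of exact posterior samples has to converge to it. The 1-D diagonal setup and all variances below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D Gaussian inverse problem, d = x + n, with diagonal prior
# covariance S and noise covariance N (all values are illustrative
# assumptions, not the paper's setup).
npix = 64
S = np.full(npix, 2.0)      # prior variance per pixel
Nvar = np.full(npix, 0.5)   # noise variance per pixel

x_true = rng.normal(0.0, np.sqrt(S))
d = x_true + rng.normal(0.0, np.sqrt(Nvar))

# Wiener filter = posterior mean for a Gaussian prior and likelihood.
wiener = S / (S + Nvar) * d

# The posterior is Gaussian with mean `wiener` and variance S*N/(S+N),
# so the ensemble mean of exact posterior samples must converge to the
# Wiener solution up to Monte Carlo noise.
post_var = S * Nvar / (S + Nvar)
samples = wiener + rng.normal(0.0, np.sqrt(post_var), size=(20000, npix))
mc_mean = samples.mean(axis=0)
print(np.max(np.abs(mc_mean - wiener)))  # small, shrinking like 1/sqrt(20000)
```

So "a perfect method" here just means one whose sample ensemble reproduces these exact-posterior statistics.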
I'm happy to chat about this if you want?
Sure :-) do you have time this afternoon? I agree with you; it's just a matter of clearly stating what our method can and cannot do at this stage. The fact that we don't match the mean posterior down to small \ell is actually the most exciting thing ML-wise in this paper, because it highlights an issue that all score-based methods for inverse problems are struggling with at the moment.
I'm free all afternoon - send me an invite for when you're free :)
Dear @EiffL and @NiallJeffrey ,
I've updated Fig. 5 with many more samples (1500) to compute the posterior mean, and the relative power-spectrum error with respect to the Wiener filter is now zero over the whole range!
The sampling procedure used to compute the Gaussian samples is exactly the same as for the full posterior, i.e. annealed HMC reaching \sigma = 10^-3, then projected onto the posterior with the ODE sampler.
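For readers of the thread, a minimal sketch of what such an annealed schedule looks like. This is annealed Langevin on a toy 1-D Gaussian where the noise-convolved score is analytic; the schedule, step sizes, and target are illustrative assumptions, and the actual pipeline uses annealed HMC plus the ODE projection:

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal annealed-Langevin sketch: a geometric noise schedule from
# sigma = 1 down to sigma = 1e-3, with an analytic noise-convolved
# score for a toy 1-D Gaussian target.  Schedule, step sizes and the
# target variance are illustrative assumptions.
sigma_max, sigma_min, levels = 1.0, 1e-3, 20
sigmas = np.geomspace(sigma_max, sigma_min, levels)
target_var = 0.25

def score(x, sigma):
    # Score of the target convolved with N(0, sigma^2).
    return -x / (target_var + sigma**2)

x = rng.normal(0.0, sigma_max, size=5000)   # a batch of chains
for sigma in sigmas:
    eps = 0.1 * sigma**2                    # step size shrinks with the noise level
    for _ in range(100):
        x = x + eps * score(x, sigma) + np.sqrt(2.0 * eps) * rng.normal(size=x.size)

print(x.var())  # roughly the toy target variance
```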
All the plots were also updated to match the A&A TeX font and size.
I'm ready for another round of review :-)
Thanks a lot @b-remy this looks really great! I think it's pretty much ready to ship now.
Hi @b-remy - this looks great! Just to confirm, this result is achieved by going down to \sigma = 10^-3 for the Gaussian prior sampler? Is this also the value used for the hybrid-prior results with both simulations and COSMOS data in the rest of the paper?
Also, do you have a single version of the bottom plot with different sigma values? It is important in the paper to explain that one has a trade off between sigma (i.e. run-time) and small-scale accuracy, that can be chosen for a given science case (for example, here the science goal is accuracy below \ell~1e4).
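One hedged way to quantify the run-time side of that trade-off, assuming a geometric annealing schedule with a fixed ratio between consecutive noise levels and a fixed number of steps per level (illustrative assumptions, not necessarily the paper's settings): the number of levels, and hence the cost, grows only logarithmically with the target \sigma.

```python
import numpy as np

# Run-time side of the sigma trade-off: with a geometric annealing
# schedule (fixed ratio between consecutive noise levels, fixed steps
# per level), the number of levels grows only logarithmically with the
# target sigma.  The ratio below is an illustrative assumption.
ratio = 0.7       # sigma_{k+1} / sigma_k, assumed
sigma_max = 1.0

def n_levels(sigma_min):
    # Smallest k such that sigma_max * ratio**k <= sigma_min.
    return int(np.ceil(np.log(sigma_min / sigma_max) / np.log(ratio)))

for s in (1e-1, 1e-2, 1e-3):
    print(s, n_levels(s))
```

So going an order of magnitude deeper in \sigma costs a fixed number of extra levels, which is why the stopping point can be chosen per science case.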
> Just to confirm, this result is achieved by going down to \sigma = 10^-3 for the Gaussian prior sampler? Is this also the value used for the hybrid-prior results with both simulations and COSMOS data in the rest of the paper?
Yes! Just as for the hybrid prior sampler, the Gaussian prior sampler plotted above goes down to \sigma = 10^-3. Exactly the same procedure and annealing are used for both priors.
> Also, do you have a single version of the bottom plot with different sigma values? It is important in the paper to explain that one has a trade off between sigma (i.e. run-time) and small-scale accuracy, that can be chosen for a given science case (for example, here the science goal is accuracy below \ell~1e4).
I could do this, but in the end I did not need to go below the previous \sigma (already 10^-3), which had given the non-zero relative error; I just needed more samples to cancel the remaining noise in the average. Last time I thought I would need to play with the trade-off, but it turns out the chains were already long enough :-)
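The "more samples rather than deeper annealing" point is just the usual Monte Carlo scaling: the residual noise in a sample average falls like 1/sqrt(N). A toy check (the per-sample scatter is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo scaling check: the residual noise in a sample average
# falls like 1/sqrt(N).  The per-sample scatter is a toy assumption.
scatter = 1.0

def mc_error(n, trials=2000):
    # Empirical standard deviation of the mean of n samples.
    means = rng.normal(0.0, scatter, size=(trials, n)).mean(axis=1)
    return means.std()

e100, e1500 = mc_error(100), mc_error(1500)
print(e100 / e1500)  # close to sqrt(1500 / 100) ~ 3.9
```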
Do you have any other thoughts @NiallJeffrey ?
Ok, all good to merge for me
Hi @NiallJeffrey, @EiffL, thank you again very much for your feedback!
This pull request is to document the few modifications I made after our last meeting:
I've updated Fig. 6:
Update to Fig. 7: corrected the claim for \ell > 10^4, now saying:
Update of Fig. 8: I changed the number of bins and the x-range; this way the histograms are more visible, and perhaps even the bi-modality of the posterior for m = 3e14.
Update of Fig. 9: added p(cluster | \gamma). I couldn't figure out how to write this probability conditioned also on the detection method, so I stated it in the caption:
"probability of cluster detection given the input data and the detection method".
I couldn't understand why the power spectrum variance is bigger when using the full prior rather than the Gaussian prior. I noticed that sometimes a few maps are weird. It is possible that at the beginning of the annealing the step size is sometimes too big, or that the evaluation of the prior at too high a temperature is not well defined. Given this observation, I removed a problematic batch from the computation of the Figure 6 power spectrum, reducing the variance a bit.
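On screening out the "weird" maps, here is a hypothetical sketch of one way such a batch cut could be done, using a median-absolute-deviation cut on a per-map summary statistic; the numbers and the 5-sigma threshold are illustrative assumptions, not what was actually used for Figure 6:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical screening of "weird" maps before averaging: flag maps
# whose summary statistic (here, a toy total power) sits far from the
# ensemble median, using a median-absolute-deviation cut.  All values
# and the 5-sigma threshold are illustrative assumptions.
power = rng.normal(100.0, 5.0, size=200)   # toy per-map total power
power[:3] = 500.0                          # a few "weird" maps

med = np.median(power)
mad = np.median(np.abs(power - med))
keep = np.abs(power - med) < 5.0 * 1.4826 * mad   # 1.4826 converts MAD to sigma

print(keep.sum())  # the injected outliers are flagged out
```

A robust cut like this leaves the well-behaved maps untouched while rejecting the extreme ones, which is the effect described above for the Figure 6 power spectrum.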
I remain open to any comment or change suggestion :-)