Open miguelfmorales opened 5 years ago
What becomes subtle is how to interpret and communicate the results, particularly in the (B) regions. Let me detail three correct ways of interpreting the results, then I'll come back and comment on what is currently in the paper.
1. Present the measured values and uncertainties in region (A). This is where the EoR PS was actually constrained, so presenting these values is fairly straightforward. (There are a few subtleties, explored below, but in (A) they are of mostly academic concern.)
2. Constrain the 21 cm fast parameters. This is most easily done with Monte Carlo, but there may be analytic solutions too. Qualitatively this takes the constraints, dominated by region (A), and asks which 21 cm fast models are consistent with these results.
3. Use 21 cm fast MCMC to find the extrapolations into (B) that are allowed by the data, and then report the limits at k values within region (B). This explicitly uses the prior of a 21 cm fast-like EoR PS, and will get the correct uncertainties given the priors in region (B). It is important to note that the errors from this will be quite a bit larger than currently indicated. Because you are effectively constraining in (A) and then extrapolating into (B) with sophisticated model priors, the uncertainties in region (B) are naturally larger than in region (A), due to both the distance from the primary area of constraint and the various shapes that are allowed by your PS priors.
Any of these approaches are fine.
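To make the point about the third approach concrete, here is a toy sketch of why extrapolated uncertainties grow with distance from region (A). It is not the 21 cm fast MCMC: as a stand-in I assume a straight line in log10(k) for the model family, and the k ranges, noise level and "true" coefficients are all made up for illustration.

```python
import numpy as np

def extrapolated_scatter(k_a, sigma_a, k_eval, n_draws=2000, seed=0):
    """Scatter of a straight-line fit (in log10 k) evaluated at k_eval.

    The fit only ever sees noisy data in region (A); k_eval may lie
    outside it. The line is a hypothetical stand-in for the model prior.
    """
    rng = np.random.default_rng(seed)
    x_a = np.log10(k_a)
    x_eval = np.log10(k_eval)
    truth = 1.0 + 2.0 * x_a  # made-up "true" log-power in region (A)
    draws = np.empty((n_draws, x_eval.size))
    for i in range(n_draws):
        noisy = truth + rng.normal(0.0, sigma_a, size=x_a.size)
        slope, intercept = np.polyfit(x_a, noisy, 1)  # refit each realization
        draws[i] = intercept + slope * x_eval          # evaluate, incl. in (B)
    return draws.std(axis=0)

k_a = np.linspace(0.5, 1.0, 6)              # assumed region (A): high k
k_eval = np.array([0.05, 0.1, 0.2, 0.75])   # first three in (B), last in (A)
scatter = extrapolated_scatter(k_a, sigma_a=0.1, k_eval=k_eval)
print(dict(zip(k_eval.tolist(), scatter.round(3).tolist())))
```

In this toy the scatter inside (A) is smaller than the per-bin noise, while the scatter at the lowest k is several times larger. A richer family of allowed shapes, as with real 21 cm fast priors, would be expected to widen it further.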
In the current version of the paper, the errors are calculated as if all the bins are independent. While this is probably fine in region (A), it is incorrect for the EoR PS line in region (B).
Almost all of the constraint information for the EoR PS values in region (B) comes from measurements in region (A). Said another way, if I were to pick an EoR PS value in region (B) and ask what the weights were from all of the other bins used in determining that value (the window function), the contribution would be strongly dominated by measurements in region (A). This means that the noise on these measurements is not local (thermal at the location), but is instead the extrapolated noise from region (A). This effect is particularly strong when the results are quoted in \Delta^2, as the noise in region (A) is much larger in those units.
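The window-function statement can be written down explicitly for a linear toy case. The sketch below assumes, purely for illustration, that the region (B) value comes from a least-squares straight line in log10(k) fitted to region (A) data with equal, independent noise. None of the numbers come from the paper.

```python
import numpy as np

def window_weights(k_a, k_b):
    """Weights w such that the fitted value at k_b equals w @ d_a,
    where d_a are the measurements in region (A)."""
    X = np.column_stack([np.ones_like(k_a), np.log10(k_a)])  # design matrix
    x_b = np.array([1.0, np.log10(k_b)])                     # model row at k_b
    return x_b @ np.linalg.solve(X.T @ X, X.T)

k_a = np.linspace(0.5, 1.0, 6)   # assumed region (A) bins
sigma_a = 0.1                    # assumed per-bin noise in region (A)
w = window_weights(k_a, k_b=0.1) # k_b = 0.1 is an assumed region (B) bin

# Every weight multiplies a region (A) measurement, so the noise on the
# (B) value is the (A) noise propagated through w, not a local term.
sigma_b = sigma_a * np.sqrt(np.sum(w**2))
print(w.round(2), sigma_b)
```

The weights sum to one but have large magnitudes and mixed signs, which is what inflates the propagated noise relative to the per-bin noise in (A).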
This neatly circles us back to Solution 3 above. When done properly, injected MC signals will give you the same answer as directly propagated errors. We tend to fall back on MC error propagation when the analysis becomes too complicated for direct propagation.
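As a sanity check of the claim that MC and direct propagation agree, here is a minimal sketch for the same kind of linear toy estimator (a straight line in log10(k) fitted to region (A) and evaluated in region (B)). The model, bins, noise level and coefficients are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
k_a = np.linspace(0.5, 1.0, 6)                    # assumed region (A) bins
X = np.column_stack([np.ones_like(k_a), np.log10(k_a)])
x_b = np.array([1.0, np.log10(0.1)])              # assumed region (B) bin
sigma_a = 0.1                                     # assumed noise in (A)

# Direct propagation: parameter covariance pushed through to the (B) value.
cov_params = sigma_a**2 * np.linalg.inv(X.T @ X)
sigma_direct = np.sqrt(x_b @ cov_params @ x_b)

# Monte Carlo: many noisy realizations through the same (non-local) fit.
truth = X @ np.array([1.0, 2.0])                  # made-up true coefficients
noisy = truth + rng.normal(0.0, sigma_a, size=(20000, k_a.size))
params = np.linalg.lstsq(X, noisy.T, rcond=None)[0]  # shape (2, n_draws)
sigma_mc = (x_b @ params).std()

print(sigma_direct, sigma_mc)
```

The two numbers agree to within the MC sampling error here because the estimator is linear. The value of the MC route is that it still works when the pipeline is too complicated to propagate by hand, provided the injected signals are run through the full analysis.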
I think what I and others have been asking for is some type of MC injection calculation to verify that the errors are correct.
You have been extremely helpful and patient, Abhik, and I now understand your process much better. But your observation that injecting signals at single EoR modes in the (B) regions won't work in your framework really leads us to the solutions outlined above.
Further, I'm confident that the local errors currently calculated are not correct, because the values measured in the (B) regions are dominated by non-local constraints. To quote limits in the (B) regions, I think the only way forward is Solution 3 above.
Thanks giannibernardi and abhik. To answer your question, Gianni, it depends on what you want to say in the paper.
I want to carefully separate the work, which in general I think is excellent, from the interpretation of the results which is where things become subtle. If you wanted to talk about measurements in region (A), I don’t see any significant block to publishing.
However, if you want to say this is an effective way of removing foregrounds and recovering measurements in region (B), then I don’t think the current analysis is sufficient. In particular, if I understand correctly, you have a non-local analysis (the measurement at any mode depends strongly on measurements from other modes) but are using a local error calculation. To have accurate uncertainties, you really need to calculate the errors in the same non-local way as the analysis.
And I suspect this is not of merely academic interest. I suspect that non-local error calculations will show substantively higher variations in the recovered signal in the (B) regions, and this would change your claims on the efficacy of foreground removal.
So whether this is in the scope of this paper depends on what you want to claim.
If you want to present this as foreground removal that can recover the modes in region (B), then sadly I think analysis along the lines of Solution 3 is required.
If Solution 3 is too much additional work, there might be other framings we should discuss. Just to propose one possible framing, you could talk about how data constraints in region (A) can infer what the range of results must be in region (B) given the theory priors of 21 cm fast/GPR. This is a theory-driven view, and uses some of the work in your memo, Abhik. And while this sounds like a small change, statistically, asking “what can I say about region (B) given measurements in region (A) and a theory prior?” and asking “what measurements can I make in region (B) given instrumental foregrounds and priors on the shapes of the signal and foreground?” are very different. The first uses theory to predict what will be measured in region (B), while the second actually creates new measurements in region (B).
You mentioned the wrong Abhik. I'll unsubscribe. :)
Ha! A name collision with slack
Writing out a draft of my response to Abhik https://eoranalysis.slack.com/archives/CELPT55K7/p1555359988025600
Hi Abhik, thank you for this, I think this is the key information. Let me follow with a slightly long but detailed response.
I agree, and I think this is the crux. You are using the 21-cm fast simulations of smooth EoR PS to extrapolate what the value must be in areas that are foreground dominated. I think this is a fun approach and clever, but as you point out:
The corollary is that you can't inject EoR power at one scale because that violates the prior of a smooth EoR PS.
Let me restate the same thing in pictures, because I think it will clarify where my concerns do enter. In your Figure 12, you have PS shapes that have been separated by GPR. But if we concentrate on just the EoR line, we can identify two areas.
GPR slide 1.pdf
In regions (A) the EoR signal dominates, and in regions (B) the foreground and instrumental effects dominate. Your approach is using your knowledge of how 21-cm fast PS behave to extrapolate the EoR signal from the high k (A) regions into the (B) regions.
Again this is all well and good. I have no concerns with this process.