After my talk at URSI, Joe put together the following beginning of a requirements tree.
I like this a lot, but I tend to think bottom-up, so I'd like to put together various sources of systematic errors (each as a separate issue) and eventually hook them together into a tree like this.
One of the key starting points is how we define how to calculate the systematic terms.
Traditionally you think of a table of systematic terms, each with an associated uncertainty, and those uncertainties need to add up to less than the value of the signal you expect to measure.
But implicitly each of those terms is an integral over a region, and each experiment will choose how to perform those integrals differently. That is fine, but we need to be very clear about how we are setting up the problem so that we can both create a meaningful error budget and provide a scaffold for others to create budgets with different instruments/analyses that integrate over different regions. Said another way, our goal is to delineate the key sources of systematic error (the rows in the table) and how to calculate the associated uncertainties. Choosing the regions (and thus the system requirements) is part of the instrument design process—some instruments may choose to go deep on a very few modes, while others may integrate more shallowly over a much wider range—and our goal is to show how to do that.
My proposal for the above is to talk about this in the 3D k space, using the kpar vs. kperp space for most of the diagrams (3D plots suck). Here I'll outline what I'm thinking.
The intrinsic foreground plus cosmology signal is expected to look like:
where the spectrally smooth foregrounds completely dominate (by ~10 orders of magnitude in mK^2) at the lowest kpar modes. These modes cannot, even in principle, ever be observed. In this space the signal from the EoR is expected to be spherically symmetric and to fall with increasing k (it rises if you put it in dimensionless Δ² units, since Δ²(k) = k³ P(k)/2π², but the measurement is much closer to this space where it falls with |k|). This puts a lot of pressure on the small |k| modes in the lower left of the plot: this is both where the signal is strongest and where arrays tend to be the most sensitive (see #2).
The development of the window/wedge paradigm further restricts the range in which measurements can be made. For imaging analyses, uncertainties and errors in the image reconstruction (formally, the sky estimator not being equal to the true sky) lead to power being thrown from the bright intrinsic foregrounds up into a wedge-like shape:
The 'good' news is that there is still some space to the lower left that should have low contamination. The challenge is making this so—the topic of this whole exercise.
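As a concrete point of reference, here is a minimal sketch (illustrative only, with example numbers) of the standard horizon-wedge boundary slope, kpar = kperp · sin(θ) · D_C(z) H(z) / (c (1+z)), evaluated with astropy's Planck18 cosmology:

```python
# Minimal sketch (illustrative) of the horizon-wedge boundary slope,
#   k_par = k_perp * sin(theta) * D_C(z) * H(z) / (c * (1 + z)),
# using astropy's Planck18 cosmology. The redshift and angle are example values.
import numpy as np
import astropy.units as u
from astropy.constants import c
from astropy.cosmology import Planck18 as cosmo

def wedge_slope(z, theta=np.pi / 2):
    """Return the k_par / k_perp slope of the wedge boundary at redshift z.

    theta is the maximum angle from the phase center at which foreground
    sources are assumed to contribute (pi/2 gives the horizon wedge).
    """
    slope = cosmo.comoving_distance(z) * cosmo.H(z) * np.sin(theta) / (c * (1 + z))
    return slope.to(u.dimensionless_unscaled).value

print(wedge_slope(8.0))  # horizon wedge slope at z = 8
```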
But for the current topic, which is just how we define what region to integrate over to get the systematic uncertainty terms, we need to dive a bit deeper and talk through the parts.
First, what is the integral we are doing? Each measurement of a 1D |k| mode is made by averaging the individual k measurements in a 3D spherical shell. But the radius, width, and extent of that shell are chosen by the experimenter. I've tried to show this schematically here (really done in 3D, but the idea is shown in 2D):
For each proposed measurement we therefore have the integration region, so we can integrate the associated systematic errors to determine the systematic uncertainty associated with each measured |k| mode. In other words, if we know the 3D shape of each systematic, we can perform the associated integral to determine its contribution to the systematic error.
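To make the region choice concrete, here is a minimal sketch (assuming a simple rectilinear kperp/kpar grid; all of the numbers are illustrative choices by the experimenter, not requirements) of selecting one such shell, with an optional wedge cut:

```python
# Minimal sketch of the integration region for one 1D |k| bin: a spherical
# shell in 3D k space, shown here on a cylindrical (k_perp, k_par) grid, with
# an optional cut that keeps only modes above the wedge boundary.
import numpy as np

def shell_mask(k_perp, k_par, k_center, k_width, wedge_slope=None):
    """Boolean mask of (k_perp, k_par) cells inside a |k| shell.

    k_center and k_width set the radius and thickness of the shell; if
    wedge_slope is given, also require k_par > wedge_slope * k_perp
    (i.e. stay inside the EoR window).
    """
    k_mag = np.sqrt(k_perp**2 + k_par**2)
    mask = np.abs(k_mag - k_center) < 0.5 * k_width
    if wedge_slope is not None:
        mask &= k_par > wedge_slope * k_perp
    return mask

# illustrative grid (h/Mpc) and an example shell at |k| ~ 0.2
k_perp, k_par = np.meshgrid(np.linspace(0.0, 0.3, 64), np.linspace(0.0, 2.0, 256))
region = shell_mask(k_perp, k_par, k_center=0.2, k_width=0.05, wedge_slope=3.5)
```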
Each |k| measurement becomes a table (could be one table, could be many if a whole spectrum is measured). Each row is then the integral of a systematic error contribution over the corresponding region, and we hope the sum of the rows is less than the expected signal in that |k| bin.
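Schematically, assembling one of those tables could look like the following (a hedged sketch that assumes each systematic's fingerprint is already available as contaminating power on the same kperp/kpar grid, and that uses a simple weighted average as the integral):

```python
# Hedged sketch of one |k| bin of the error budget. Each "fingerprint" is
# assumed to be an array of contaminating power (mK^2) on the same
# (k_perp, k_par) grid as the region mask; the weights stand in for whatever
# (e.g. inverse-variance) weighting the analysis applies to the cells.
import numpy as np

def budget_row(fingerprint, region, weights=None):
    """Weighted average of one systematic's power over the integration region."""
    if weights is None:
        weights = np.ones_like(fingerprint)
    w = np.where(region, weights, 0.0)
    return np.sum(w * fingerprint) / np.sum(w)

def budget_table(fingerprints, region, weights=None):
    """Map {systematic name: fingerprint array} -> {name: integrated power}."""
    return {name: budget_row(fp, region, weights) for name, fp in fingerprints.items()}

# The budget "passes" for this |k| bin if the rows sum to less than the
# expected signal:  sum(budget_table(...).values()) < expected_signal
```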
So in certain ways, our task in this project is to quantitatively determine the fingerprint of different systematic errors in the 3D k space. Figuring these out will depend on experiment and analysis details, but here we should be able to list the key effects (the rows), and how to calculate the associated fingerprints. Easy, right!?
One of the things I think we can leverage is the conceptual work in https://ui.adsabs.harvard.edu/#abs/2019MNRAS.483.2207M/abstract
One source of confusion for some time has been the huge diversity of analysis approaches. This paper groups all of the analyses into two types: measured (delay) PS and reconstructed (imaging) PS.
While it is a little unfortunate that we still have two types (they can be related using https://ui.adsabs.harvard.edu/#abs/2014PhRvD..90b3018L/abstract ), for many systematics, such as the accuracy of the bandpass calibration, the fingerprint is the same. We just have to be clear about when they are the same or different, and how.
The other thing to immediately notice is that the accessible region is quite different. Delay-style analyses will always be limited to the window. Working in the wedge will be very hard, but it is possible in principle with an imaging analysis if the image reconstruction is of sufficient quality. And this is necessary for some science cases, such as the BAO measurements of CHIME—they will pick regions of integration that extend down into the wedge by necessity, while we know a priori that delay analyses never will.
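In terms of the sketches above, the contrast is just a different choice of region mask (again purely illustrative):

```python
# Reusing shell_mask() and wedge_slope() from the sketches above; the shell
# radius and width are placeholders.
window_only = shell_mask(k_perp, k_par, k_center=0.2, k_width=0.05,
                         wedge_slope=wedge_slope(8.0))  # delay-style: stay above the wedge
full_shell = shell_mask(k_perp, k_par, k_center=0.2, k_width=0.05,
                        wedge_slope=None)               # imaging-style: may dip into the wedge
```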
Now starting a list of areas we need to fill out (some admixture of rows in the final table and items in Joe's chart), along with discussion tabs as new issues.
Sensitivity #2: picks up Joe's sensitivity box, but since baseline distribution and observing strategy are both important contributions, this would concentrate not on antenna number and diameter but on the uvf coverage.
Estimator accuracy #11: this includes a whole tree of effects associated with data calibration. The basic grouping is how to make the estimate of the visibilities (measured PS) or the sky (reconstructed PS) as close to the true quantity as possible.
Astrophysical Foregrounds #12: includes all of the various known foregrounds and their spectral characteristics.
Ionospheric Effects #17: not sure whether to put this as a top-level item, but it is often seen as a shim between the true sky and the sky as observed with the instrument.
The goal of this project is to develop a systematic error budget for 21 cm cosmology power spectrum observations.
The motivation is that all efforts to measure the PS are currently systematics limited. And while we have learned a lot about various sources of systematic errors, no one has put all of this work together into a cohesive view of the known systematic errors.
In this repo I'd like to gather our thoughts on all of the different systematic errors—putting the known/published sources of error into context and discussing the known but unpublished sources of error. I think we finally have enough results and intellectual scaffolding to do this effectively.
Eventually I'd like this to lead to a paper and a systematic error budget that can drive the design of future 21 cm cosmology instruments and their associated data analysis pipelines.