wmorning / EvilLens

Simulating images and visibilities of gravitationally lensed galaxies, behind Yashar Hezaveh's back.
GNU General Public License v2.0

simulated data needs to include realistic observational effects #41

Closed dmarrone closed 5 years ago

dmarrone commented 9 years ago

Hi Warren! Phil's here visiting today, and we are chatting about the ALMA data, and potential problems with it.

There are three effects I think we need to worry about: amplitude errors, phase gain errors, and decoherence. The first two are antenna-based effects, and the phase errors have previously been dealt with in the Cycle 0 lens fitting. Decoherence is a baseline-based corruption that suppresses the visibility amplitude on the longest baselines. This effectively reduces the high-spatial-frequency structure in the maps, which is precisely the signature of DM substructure. We should probably simulate this in the data and see what it does.

It seems likely that we can add this effect to uncorrupted simulations of ALMA data, because CASA probably doesn't include these effects.

Can you assign this issue to me, and I will get back to you with a suitable reference for the atmospheric structure function, so that you can simulate it in a realistic manner?

Dan

dmarrone commented 9 years ago

Hi Warren,

A good starting point for understanding the physical effect I'm talking about is here: http://adsabs.harvard.edu/abs/1999RaSc...34..817C See sections 4 and 5.

There are two regimes of interest for the simulations. One is the atmospheric phase scatter that will be introduced on timescales longer than single integrations. In effect, each antenna gets a randomly varying phase added to it and this phase is greater for more distant antennas. If you pick a reference spot in the center of the array, you can use the phase structure function to calculate the RMS phase over each antenna, and use that to generate the random phases. A slightly more sophisticated treatment would incorporate timescales for these variations.

The second regime is the decorrelation caused by phase variations that are faster than an integration. This affects longer baselines more, because the phase variations are larger across longer baselines, and this is the most pernicious effect. It should be simulated separately from the effect described above. I expect that we'll want to reduce the amplitude of the visibilities by factors that are derived (eq 4 in the paper) from the phase structure function (figure 6, for an example). We can test the importance of such effects on our fitting, and if it looks important, get more information from ALMA about the magnitude of the observed effects.
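A minimal sketch of the decorrelation step described above, under stated assumptions. It uses the standard result that Gaussian phase noise with RMS sigma (in radians) reduces the mean visibility amplitude by exp(-sigma^2 / 2), which is what eq 4 is taken to refer to here, and a power-law root structure function of the form phi_rms = K / lambda(mm) * b(km)^alpha degrees. The function names, the default K and alpha, and the choice to pass baseline components in meters are all hypothetical placeholders, not values from the paper or from ALMA. The sketch also ignores the fact that only fluctuations faster than one integration contribute, so it probably overestimates the loss on long baselines.

```python
import numpy as np

def phase_rms_deg(b_km, wavelength_mm, K=100.0, alpha=0.6):
    """Root phase structure function in degrees.

    K and alpha are placeholder values chosen for illustration only.
    """
    return K / wavelength_mm * np.asarray(b_km, dtype=float) ** alpha

def decorrelate(vis, bx_m, by_m, wavelength_mm, K=100.0, alpha=0.6):
    """Scale each visibility by exp(-sigma^2 / 2) for its baseline length.

    bx_m, by_m are the baseline components in meters (an assumption about
    how the baselines are stored); the visibility phases are left untouched.
    """
    b_km = np.hypot(bx_m, by_m) / 1e3
    sigma = np.deg2rad(phase_rms_deg(b_km, wavelength_mm, K, alpha))
    return np.asarray(vis) * np.exp(-0.5 * sigma**2)

# Example: three unit visibilities on baselines of 0 m, 500 m and 5 km.
vis = np.ones(3, dtype=complex)
bx = np.array([0.0, 300.0, 3000.0])
by = np.array([0.0, 400.0, 4000.0])
print(np.abs(decorrelate(vis, bx, by, wavelength_mm=1.3)))
```

The amplitude loss grows quickly with baseline length, so comparing fits with and without this factor should show whether it matters for the substructure signal.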

Dan

wmorning commented 9 years ago

Thanks Dan! I'll start reading right away.

wmorning commented 9 years ago

Ok, after reading the paper and chatting with Yashar yesterday, I think I have a handle on how to add amplitude and phase errors as well as decoherence. Since two of these effects are antenna dependent, it seems logical to add them to the data after we've used CASA to turn the lensed image we've created into a measurement set. The question then becomes how to handle the data so that everything stays in its original format (except with the visibilities now being the corrupted ones).

My goal is to engineer this such that we have a class which reads data from measurement sets, and can call functions like add_phase_errors(self) in an IPython notebook to give itself phase errors, amplitude errors, and decoherence. We should also look into how to write the outputs (to be as utilitarian as possible). @drphilmarshall, maybe if you have time early next week we could meet and discuss?

A question about adding phase errors though: From what I understand, to add a phase error to an antenna, one just multiplies all visibilities that the antenna in question was involved in measuring by a factor exp(i * theta), where theta is a random angle drawn from a Gaussian distribution centered around 0, and with some width (is it actually a Gaussian?). Thus the phase error for some particular point in (u,v) is exp(i * (theta1 + theta2)), since both antennas could have a phase error. This should be dependent on the baseline, in that if we assume an antenna located at (0,0) has no phase error, we can use the phase structure function to determine the width of the distribution from which we draw the phase for all the other antennas. So the way we should deal with this is to use the .cfg file which lists the antenna positional information. Does it matter what point we use as a reference point for determining the baseline? If not, it might be easiest to just use b = np.sqrt(x**2 + y**2) to determine the width of the distribution from which to draw the phase, where x and y are the x and y position of each antenna.
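A rough sketch of this per-antenna scheme, with several assumptions labeled. It takes the usual convention in which a visibility is the product of one antenna's gain and the complex conjugate of the other's, so the two antenna phases enter as a difference, exp(i * (theta1 - theta2)), rather than a sum. The function name, the argument layout (antenna index arrays plus positions in meters relative to the reference point), and the default K and alpha are hypothetical. Drawing each antenna independently also ignores correlations between nearby antennas, so this is only a starting point.

```python
import numpy as np

def add_antenna_phase_errors(vis, ant1, ant2, x_m, y_m, wavelength_mm,
                             K=100.0, alpha=0.6, rng=None):
    """Apply independent Gaussian phase errors to each antenna.

    The width of each antenna's distribution comes from the root structure
    function evaluated at its distance from the reference point (0, 0).
    K and alpha are placeholder values, not measured ones.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Distance of each antenna from the reference point, in km.
    b_km = np.hypot(np.asarray(x_m, float), np.asarray(y_m, float)) / 1e3
    sigma = np.deg2rad(K / wavelength_mm * b_km**alpha)
    # One phase per antenna; an antenna at the reference point gets zero.
    theta = rng.standard_normal(sigma.shape) * sigma
    # Antenna phases enter each visibility as a difference (g_i * conj(g_j)).
    corrupted = np.asarray(vis) * np.exp(1j * (theta[ant1] - theta[ant2]))
    return corrupted, theta

# Example: three antennas, three baselines (0-1, 0-2, 1-2).
x = np.array([0.0, 100.0, 2000.0])
y = np.array([0.0, 0.0, 0.0])
a1 = np.array([0, 0, 1])
a2 = np.array([1, 2, 2])
vis = np.ones(3, dtype=complex)
out, theta = add_antenna_phase_errors(vis, a1, a2, x, y, wavelength_mm=1.3,
                                      rng=np.random.default_rng(1))
print(np.angle(out, deg=True))
```

Because the errors are antenna-based, the closure phase around any triangle of antennas is unchanged, which is a handy sanity check on the implementation.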

wmorning commented 9 years ago

Progress report: As we discussed in the telecon last Friday, to get phase errors that obey the correct short and long baseline properties, we have to simulate a field representing the phase shift above a certain position, and then determine the phase error for each individual antenna by using its position in the field. I have tried to do this as follows:

For an array of baseline lengths, ranging from 0 m to some maximum distance (which must be greater than the maximum separation between antennas), draw a value from a Gaussian centered around zero, with width determined by the phase structure function (phi_rms = K / lambda(mm) * b(km)^alpha degrees). The drawn value is taken to be the amplitude of phase fluctuations at that length scale.

The simulated phase value at some point (x, y) is the sum over length scales b of A * cos(2 * pi * x'/b + phi1) * cos(2 * pi * y'/b + phi2), where A is the amplitude of the phase fluctuations at length scale b. phi1 and phi2 are uniformly distributed random numbers between 0 and 2 * pi, which we should use because the fluctuations shouldn't all constructively interfere at x = y = 0. Additionally, so that there are no preferred axes for the phase fluctuations, we use x', y' = x * cos(theta) - y * sin(theta), y * cos(theta) - x * sin(theta), where theta is another random number drawn from between 0 and 2 * pi. We divide the result by the number of sample points in b (because if one steps back far enough, this looks effectively like a discrete cosine transform). It might be that we should multiply the result by 2 (from Wikipedia, the discrete cosine transform looks to have a normalization of sqrt(2/N) in 1-D, so it should probably have a normalization of 2/N in 2-D).

This gives us a field full of "random" fluctuations, where the dominant fluctuations are over long distances, and there are many smaller ones over short distances. To get phase errors for a particular antenna, we just do a bilinear interpolation (or a nearest neighbor interpolation) to get the phase at a particular antenna, and use exp(i * (phase1 - phase2)) to phase shift a visibility between two antennas. I'll push an IPython notebook with a demo of this to GitHub if I get time to do so later today. If not, I'll do it tomorrow.

So, the question is: Will this work to get us phase errors of the magnitude that we want, and will they have the correct baseline rms power spectrum shape (which should be the phase structure function)?

Once we figure this out, we can look into time evolution of the phase, which is likely to be more complicated.

dmarrone commented 9 years ago

Hi Warren,

I am not sure that the prescription above will produce what we want. I believe we want to produce a phase screen that can cover the whole array with spatial resolution sufficient to ensure that each telescope gets its own patch of phase (could need more resolution). I believe that what you want to do is to generate a 2D grid of gaussian random numbers that has a number of cells matching the conditions above, FT that grid, multiply the FT'd grid by the square root of the power spectrum (which you can get from the structure function), and FT back. Then those are the phases that you apply to each antenna.
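The recipe above can be sketched roughly as follows. This is an illustration under assumptions, not the code that ended up in the repository. It assumes a power-law root structure function phi_rms = K / lambda(mm) * b(km)^alpha degrees, for which the corresponding 2D power spectrum scales as k^-(2 * alpha + 2). The function names, grid parameters, default K and alpha, the lag used to fix the overall normalization, and the nearest-pixel lookup are all hypothetical choices. The screen is periodic because it is built with FFTs, so the grid should be comfortably larger than the array.

```python
import numpy as np

def make_phase_screen(n, pixel_m, wavelength_mm, K=100.0, alpha=0.6, rng=None):
    """Filter white Gaussian noise into a phase screen (radians).

    Steps: draw an n x n grid of Gaussian random numbers, FT it, multiply
    by the square root of the power spectrum, and FT back.
    """
    rng = np.random.default_rng() if rng is None else rng
    white = rng.standard_normal((n, n))
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=pixel_m)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    kk = np.hypot(kx, ky)
    kk[0, 0] = np.inf                      # zero out the mean (k = 0) mode
    power = kk ** (-(2.0 * alpha + 2.0))   # power law implied by b^alpha
    screen = np.fft.ifft2(np.fft.fft2(white) * np.sqrt(power)).real
    # Fix the amplitude so the RMS phase difference at one reference
    # separation matches the structure function (placeholder K, alpha).
    lag = max(1, n // 8)
    diff = screen - np.roll(screen, lag, axis=0)
    target = np.deg2rad(K / wavelength_mm * (lag * pixel_m / 1e3) ** alpha)
    return screen * (target / diff.std())

def sample_screen(screen, x_m, y_m, pixel_m):
    """Nearest-pixel phase for antennas at (x_m, y_m), screen centered on 0."""
    n = screen.shape[0]
    ix = np.clip(np.rint(np.asarray(x_m) / pixel_m).astype(int) + n // 2, 0, n - 1)
    iy = np.clip(np.rint(np.asarray(y_m) / pixel_m).astype(int) + n // 2, 0, n - 1)
    return screen[ix, iy]

# Example: a 256 x 256 screen with 20 m pixels, sampled at three antennas.
screen = make_phase_screen(256, 20.0, 1.3, rng=np.random.default_rng(0))
phases = sample_screen(screen, [0.0, 150.0, -900.0], [0.0, 40.0, 600.0], 20.0)
print(phases)
```

The phase applied to a visibility is then the difference of the two antenna phases, so nearby antennas see nearly the same phase and distant ones decorrelate, which is the behavior the structure function describes.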

wmorning commented 9 years ago

Ok, I think I got it to work that way. It makes an image which looks kind of like the CMB once you run it. I'll post the notebook shortly.

wmorning commented 5 years ago

We have phase errors, amplitude errors, and decorrelation built in.