EoRImaging / eppsilon

eppsilon - error propagated power spectrum with interleaved observed noise
BSD 2-Clause "Simplified" License

fix calculation of zero mode and shifting to work for even & odd n_freq #89

Closed bhazelton closed 5 years ago

bhazelton commented 5 years ago

Also add a bunch of checks along the way

wenyang-li commented 5 years ago

I have tested channel ranges from 0-95 to 0-99, so I have tested both even and odd channel numbers. The following plot shows the 1d ps for all these cases, which looks very continuous, and now there is no split at coarse band modes, so I think this bug is now fixed.

ps

nicholebarry commented 5 years ago

That is gorgeous agreement, Wenyang. I'll have a look at my data too.

nicholebarry commented 5 years ago

@bhazelton I see a change for both even and odd frequencies compared to the master -- is that what you expected? I had inferred that you only expected change in the odd frequencies.

bhazelton commented 5 years ago

I guess I'm not surprised that it would affect even numbers of frequencies as well, because the folding in kparallel depends on this. There were 2 separate problems, one that could affect both even and odd and one that would only affect odd numbers.

nicholebarry commented 5 years ago

I'd still like to continue looking at this pull request -- our limit goes up by a factor of 2, so this is very important.

nicholebarry commented 5 years ago

I was playing around with this today. If I force the Fourier frequency transform to be a DFT, I get a different answer than if I force it to be a FFT.

Now, I know there's a difference between a DFT and FFT, but they should give really similar answers. I think the difference warrants another look...

bhazelton commented 5 years ago

ok, I can take a look at the DFT case. I'm pretty confident this is correct for the FFT.

bhazelton commented 5 years ago

One good way to test that the zero mode is identified correctly (and that the shifting is correct) is to make a white gaussian signal and add a constant offset and then run it through the FT and check where the high peak is -- it should be the same as the mode identified as zero. I did that for the FFT, but I'll check it for the DFT.
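The check described above can be sketched in a few lines. This is an illustrative, standard-library-only version and not eppsilon's code: `dft`, `shift` and `empirical_zero_bin` are hypothetical helpers, the shift amounts tried are placeholders, and `shift` is assumed to follow the usual circular-shift convention in which element i moves to index (i + offset) mod n.

```python
import cmath
import random


def dft(values):
    """Direct O(n^2) discrete Fourier transform (negative-exponent convention)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * j / n) for j, v in enumerate(values))
        for k in range(n)
    ]


def shift(values, offset):
    """Circular shift: the element at index i moves to (i + offset) % n."""
    n = len(values)
    offset %= n
    return values[n - offset:] + values[:n - offset]


def empirical_zero_bin(n_freq, offset, mean=5.0, width=1.0, seed=0):
    """Index of the largest |FT| bin after shifting a noise-plus-offset vector."""
    rng = random.Random(seed)
    # White Gaussian noise with a constant offset: the offset dominates the zero mode.
    signal = [rng.gauss(mean, width) for _ in range(n_freq)]
    amplitudes = [abs(x) for x in shift(dft(signal), offset)]
    return amplitudes.index(max(amplitudes))


# Compare where the peak lands with wherever the code under test claims the
# zero mode is, for both an even and an odd number of frequencies.
for n_freq in (118, 119):
    for offset in (n_freq // 2, (n_freq - 1) // 2):
        print(n_freq, offset, empirical_zero_bin(n_freq, offset))
```

Because the unshifted transform has its zero mode at index 0, the peak should land at the shift offset itself. Any disagreement between that index and the bin labelled as zero points to an off-by-one in either the shift or the labelling.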

bhazelton commented 5 years ago

Looking into the DFT vs FFT. First, I discovered that the exponent had the wrong sign (plotting the absolute value after the FFT): Screen Shot 2019-04-23 at 12 41 46 PM

I think this might come out in the wash in the PS, but I fixed it to make comparing easier: Screen Shot 2019-04-23 at 12 42 13 PM

Finally the difference ratio: Screen Shot 2019-04-23 at 12 48 57 PM

So they are different. The difference is a bit more than I expect, I think. I'm still digging, but I thought I'd share what I'd found so far.

bhazelton commented 5 years ago

I think I've figured out why the DFT is different. I'm calculating the exponentials (for the FT integrals) directly from the comoving line of sight values for each frequency. These are not regularly spaced, so the values inside the exponents do not come out to exactly 2pi*integer/n_freq. Given this difference, I think this level of agreement is reasonable.

Incidentally, this also explains why the sign on the exponent is wrong, even though I tried to get it right. The comoving line of sight values decrease as frequency increases, so the values in the exponent get flipped.
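A toy version of both effects, as a sketch only: `dft_from_positions` is a hypothetical helper that builds the FT phases from sample positions rather than from integer indices, and the positions are made-up numbers, not comoving distances. With regularly spaced, increasing positions the phases reduce to exactly 2pi*integer/n_freq. With decreasing positions the sign of the exponent flips, so a tone at mode m shows up at mode n - m.

```python
import cmath


def dft_from_positions(values, positions):
    """Direct FT whose phases come from sample positions, not sample indices."""
    n = len(values)
    # Treat the samples as covering one full period: n cells of the mean spacing.
    period = (max(positions) - min(positions)) * n / (n - 1)
    origin = positions[0]
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * (p - origin) / period)
            for v, p in zip(values, positions))
        for k in range(n)
    ]


def peak_bin(spectrum):
    """Index of the largest-amplitude mode."""
    amplitudes = [abs(x) for x in spectrum]
    return amplitudes.index(max(amplitudes))


n, mode = 16, 3
tone = [cmath.exp(2j * cmath.pi * mode * j / n) for j in range(n)]

increasing = [100.0 + 2.5 * j for j in range(n)]  # regular grid, increasing
decreasing = [100.0 - 2.5 * j for j in range(n)]  # regular grid, decreasing

print(peak_bin(dft_from_positions(tone, increasing)))
print(peak_bin(dft_from_positions(tone, decreasing)))
```

Perturbing the positions slightly away from a regular grid leaves the peak in place but leaks a little power into neighbouring modes, which is the kind of small DFT vs FFT disagreement described above.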

bhazelton commented 5 years ago

If I use a regular grid in z mpc, I get closer to the FFT: Screen Shot 2019-04-23 at 2 50 49 PM

bhazelton commented 5 years ago

Actually, this is a better version of the regular z grid (still some gremlins in the last plot): Screen Shot 2019-04-23 at 3 24 45 PM

bhazelton commented 5 years ago

I also have a cleaned-up version that uses the true comoving line of sight distance (but fixes a couple of issues). It looks a bit better than before, but it differs more from the FFT than the regularly z-spaced one does. Screen Shot 2019-04-23 at 3 29 34 PM

bhazelton commented 5 years ago

Here are the PS diff plots vs the fft for the true comoving line of sight distance: fhd_nb_Aug2017_savedbp_w_cable_w_digjump_averemove_swbh_dencorr__Aug23_longrunstyle_minus_Aug23_longrunstyle_2dkpower

and the regularly spaced one: fhd_nb_Aug2017_savedbp_w_cable_w_digjump_averemove_swbh_dencorr__Aug23_longrunstyle_minus_Aug23_longrunstyle_2dkpower

bhazelton commented 5 years ago

I just pushed up the changes I used to make these plots. There isn't currently a way to do it from the wrapper (although I could add one if there's interest), but you can force a DFT by setting even_freq=0 on line 1352 of ps_kcube. Then you can pick between these two DFTs by swapping them out in line 1408 of ps_kcube.

nicholebarry commented 5 years ago

Okay, I'll compare it with the long run and post it. But otherwise, which one do you think is more right between the various forms of the DFT and the FFT? At the moment, we don't have enough frequencies to really need to choose between one or the other for speed reasons, so we can afford to be picky.

bhazelton commented 5 years ago

I think that either the FFT or the DFT using the true comoving line of sight distances are reasonable. The DFT might be slightly more correct in some sense, but it's less tested.

miguelfmorales commented 5 years ago

It is clear we are more sensitive to these effects than we'd appreciated, so I want to make sure we close the loop.

In particular, the "change the frequency range and get similar results" test, which both Nichole and Wenyang have run at various times, is I think necessary but insufficient.

I'd like a careful analysis of what is actually happening in the FT (frequencies used, mapping to even-odd and zero, etc.) for both the fix Nichole did and the new one. Maybe a stop in the code, look at the frequency indexes and FT shift ranges, etc. And identify what is actually different at the setup to the FT stage and convince ourselves of what is correct (and what was happening before).

cathryntrott commented 5 years ago

Hi all,

I've been trying to follow this thread without having seen any non-git discussion. If you are mapping the true LOS distance and regularly spacing that for the LOS transform, then that is strictly correct, but without really fine frequency resolution to handle this elegantly, aren't you just introducing discretisation errors? Given the small depth of the observed "boxes", the actual changes in the mapping will be very small over the redshift range.

Cheers,

Cath



bhazelton commented 5 years ago

Thanks, Cath, you're probably right. I haven't thought through the DFT thing well. It was in there to handle non-uniformly spaced frequencies, and Nichole used it to compare to the FFT after my changes, so I dug into it to figure out what I could about the differences. We certainly haven't scrutinized it at the level we have for the FFT.

To address Miguel's requests I need to know what frequency ranges you were looking at @nicholebarry. edit: I found the ranges you were plotting in our slack conversation. I'll work on the comparison.

nicholebarry commented 5 years ago

Played around with this a little today. Something noticeable is the change in the noise ratio. Previously, the noise ratio looked like:

Screen Shot 2019-04-28 at 9 29 49 pm

Now, with the update, it looks like this:

Screen Shot 2019-04-28 at 9 29 41 pm

It appears as though something is not quite right with the first kz bin in the noise.

bhazelton commented 5 years ago

Here is some data to answer @miguelfmorales's questions:

Below, I'm calculating an "empirical zero mode" by making a vector of length n_freq with random Gaussian noise (mean = 5, width = 1) and FFTing it. The zero mode is then easily identified by picking the max bin. I then shift the FFT'd vector in the same way as the data to figure out where the zero bin ends up empirically. With my new update, the shift was changed, so I'm doing it both ways to compare with the calculated zero modes (labelled as "old", "nb_fix1", "nb_fix2" using the old shift and "new" using the new shift). I've run it for all the channels and for a bunch of different channel ranges that Nichole was using.

all channels
old shift empirical zero: 95
old identified zero bin: 95
nb fix1 identified zero bin: 96
nb fix2 identified zero bin: 96
new shift empirical zero: 95
new identified zero bin: 95

9 to 121
old shift empirical zero: 55
old identified zero bin: 56
nb fix1 identified zero bin: 57
nb fix2 identified zero bin: 56
new shift empirical zero: 56
new identified zero bin: 56

9 to 122
old shift empirical zero: 56
old identified zero bin: 57
nb fix1 identified zero bin: 57
nb fix2 identified zero bin: 57
new shift empirical zero: 56
new identified zero bin: 56

9 to 123
old shift empirical zero: 56
old identified zero bin: 57
nb fix1 identified zero bin: 57
nb fix2 identified zero bin: 57
new shift empirical zero: 57
new identified zero bin: 57

9 to 124
old shift empirical zero: 57
old identified zero bin: 58
nb fix1 identified zero bin: 58
nb fix2 identified zero bin: 58
new shift empirical zero: 57
new identified zero bin: 57

9 to 125
old shift empirical zero: 57
old identified zero bin: 58
nb fix1 identified zero bin: 58
nb fix2 identified zero bin: 58
new shift empirical zero: 58
new identified zero bin: 58

9 to 126
old shift empirical zero: 58
old identified zero bin: 58
nb fix1 identified zero bin: 59
nb fix2 identified zero bin: 59
new shift empirical zero: 58
new identified zero bin: 58

9 to 127
old shift empirical zero: 58
old identified zero bin: 59
nb fix1 identified zero bin: 59
nb fix2 identified zero bin: 59
new shift empirical zero: 59
new identified zero bin: 59
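To make the comparison easier to read, the numbers posted above can be tallied with a short script. This is only a bookkeeping sketch: the values are copied from this comment, and it assumes (as the description above says) that "old", "nb_fix1" and "nb_fix2" should be compared against the old-shift empirical zero and "new" against the new-shift empirical zero.

```python
# (range, old-shift empirical zero, old, nb_fix1, nb_fix2, new-shift empirical zero, new)
ROWS = [
    ("all channels", 95, 95, 96, 96, 95, 95),
    ("9 to 121", 55, 56, 57, 56, 56, 56),
    ("9 to 122", 56, 57, 57, 57, 56, 56),
    ("9 to 123", 56, 57, 57, 57, 57, 57),
    ("9 to 124", 57, 58, 58, 58, 57, 57),
    ("9 to 125", 57, 58, 58, 58, 58, 58),
    ("9 to 126", 58, 58, 59, 59, 58, 58),
    ("9 to 127", 58, 59, 59, 59, 59, 59),
]


def count_matches(rows):
    """Count, per method, how often the identified zero bin equals the empirical one."""
    counts = {"old": 0, "nb_fix1": 0, "nb_fix2": 0, "new": 0}
    for _label, old_emp, old, fix1, fix2, new_emp, new in rows:
        counts["old"] += int(old == old_emp)
        counts["nb_fix1"] += int(fix1 == old_emp)
        counts["nb_fix2"] += int(fix2 == old_emp)
        counts["new"] += int(new == new_emp)
    return counts


print(count_matches(ROWS))
```

On these numbers the new calculation agrees with its empirical zero in all eight cases, the old calculation agrees only for all channels and for 9 to 126, and neither nb fix agrees with the old-shift empirical zero in any case.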

bhazelton commented 5 years ago

@nicholebarry I tried and failed to reproduce the weird noise ratio issue with the golden set integration I'm working with (see plots for the full bandpass and a bunch of limited frequency ranges below). What frequency range are you seeing that problem with?

fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_averemove_swbh_dencorr_2dnnr fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_ch9-121_averemove_swbh_dencorr_2dnnr fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_ch9-122_averemove_swbh_dencorr_2dnnr fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_ch9-123_averemove_swbh_dencorr_2dnnr fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_ch9-124_averemove_swbh_dencorr_2dnnr fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_ch9-125_averemove_swbh_dencorr_2dnnr fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_ch9-126_averemove_swbh_dencorr_2dnnr fhd_nb_Aug2017_savedbp_w_cable_w_digjump_Aug23_longrunstyle_ch9-127_averemove_swbh_dencorr_2dnnr

nicholebarry commented 5 years ago

The frequency range I'm specifically testing is 9-126 (what I would like to use in the paper given instrumental effects).

Glancing at the names of the files you used, they appear to be from my thesis work. That might be too old to be reliable, especially since that was when we did either postage stamp or image space BH windowing. We should get you an updated set. I can transfer my ps dir + Healpix cubes to Enterprise, just let me know where.