I am currently updating the notebook to include the flag retrieval and data subset creation. I'll push a new commit once I have these sections added. @steven-murray
steven-murray commented on 2020-04-16T18:58:51Z ----------------------------------------------------------------
Let's put this function definition down below the Intro meta stuff.
steven-murray commented on 2020-04-16T18:58:52Z ----------------------------------------------------------------
Major step description should be broader, something like "Test pipeline components related to RFI".
Minor variation is more along the lines of what you have for the major description. However, I'd try to be a bit more explicit here (make it dot-point!): in short dots, outline the whole process, something like:
Let's discuss criteria on the telecon.
jburba commented on 2020-04-23T20:06:03Z ----------------------------------------------------------------
Expanded the major and minor descriptions, but left the criteria untouched.
steven-murray commented on 2020-04-16T18:58:53Z ----------------------------------------------------------------
This needs to be filled out (has to be a pass/fail note here, as well as any general comments).
steven-murray commented on 2020-04-16T18:58:54Z ----------------------------------------------------------------
Where does `vis_clean` live?
jburba commented on 2020-04-23T17:33:41Z ----------------------------------------------------------------
`vis_clean` lives inside `hera_cal.VisClean.vis_clean`, and the `hera_cal.Delay_Filter` class is a child class of `VisClean`. I can spell that out more clearly in the document if need be.
steven-murray commented on 2020-04-30T13:41:08Z ----------------------------------------------------------------
All good, was just wondering if another version had to be printed.
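For reference, a minimal sketch (not from the notebook) of how the `vis_clean` / `VisClean` relationship could be checked at runtime; the import paths below are an assumption about the hera_cal module layout and may differ between versions.

```python
# Hedged sketch: confirm where vis_clean is defined and that the delay-filter
# class inherits from VisClean. Import paths are assumed, not taken from the
# notebook, and may vary with hera_cal version.
import inspect

from hera_cal.vis_clean import VisClean
from hera_cal.delay_filter import DelayFilter

print(inspect.getsourcefile(VisClean.vis_clean))  # file that defines vis_clean
print(DelayFilter.__mro__)                        # inheritance chain, includes VisClean
assert issubclass(DelayFilter, VisClean)
```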
steven-murray commented on 2020-04-16T18:58:55Z ----------------------------------------------------------------
Might want to comment here on how visibilities were downsampled to create the following files (could also point to a script that made the following files).
Also, it seems this text was written before you did the full analysis -- should be updated to reflect that you've done the full analysis as well (if you did... I'm not there yet ;-))
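To illustrate the kind of subsetting the comment asks to document, here is a hedged sketch using pyuvdata; the file names, channel range, baseline, and time cut are placeholders rather than the actual choices used to build the test files.

```python
import numpy as np
from pyuvdata import UVData

# Hedged sketch only: read the full simulation, keep one spectral window,
# one baseline, and a short stretch of times, then write out the subset.
uvd = UVData()
uvd.read("full_simulation.uvh5")                      # placeholder input file
uvd.select(
    freq_chans=np.arange(150, 350),                   # one spw (placeholder channels)
    bls=[(0, 1)],                                     # single baseline (placeholder)
    times=np.unique(uvd.time_array)[:60],             # short time range (placeholder)
)
uvd.write_uvh5("subset_for_test.uvh5", clobber=True)  # placeholder output file
```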
steven-murray commented on 2020-04-16T18:58:55Z ----------------------------------------------------------------
Let's decide on whether these are the final locations and stick to it.
steven-murray commented on 2020-04-16T18:58:56Z ----------------------------------------------------------------
I think this is fine for now -- the parameters are "based off" but not quite equal to the IDR2 analysis. But the IDR2 parameters seem to constantly change, so there's probably not too much point in trying to perfectly replicate them in this notebook anyway. I think we'll want to do a 3.1.1 when the final IDR2 parameters are chosen, and stick exactly to those parameters.
jburba commented on 2020-04-23T20:15:49Z ----------------------------------------------------------------
I agree that doing an analysis with parameters that are still being tweaked is tricky. But, would that mean calling this test failed, at least temporarily?
steven-murray commented on 2020-04-30T13:45:20Z ----------------------------------------------------------------
Not necessarily -- we could pass this test with current parameters, and then update parameters for a 3.1.1 test and also pass that.
steven-murray commented on 2020-04-16T18:58:57Z ----------------------------------------------------------------
You forgot to mention the cyan box in the caption.
jburba commented on 2020-04-23T20:16:57Z ----------------------------------------------------------------
Added!
steven-murray commented on 2020-04-16T18:58:58Z ----------------------------------------------------------------
Perhaps put in a short text explanation of what you're about to show before showing this plot (the text before it suggests that this will be of CLEANed components).
jburba commented on 2020-04-23T20:22:07Z ----------------------------------------------------------------
I think I've addressed this one with:
To get a sense of the overall level in the power spectrum of the various simulation components, the following cell produces a plot of the delay power spectra per spw of the individual simulation components _without any flags applied_. The red lines in the lower plots mark the delay power spectrum for the summed dataset and represent the "true" power spectrum that we are trying to recover by in-painting the gaps created by flags in the data. Of particular interest in the plots below are the regions in delay space where the EoR dominates the foreground components.
steven-murray commented on 2020-04-30T13:47:00Z ----------------------------------------------------------------
Nice, thanks!
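For concreteness, a minimal sketch of the kind of per-component delay power spectrum the quoted text describes (an FFT over frequency within one spw, with a taper, averaged over times); the function and variable names are placeholders, not the notebook's.

```python
import numpy as np

def delay_power_spectrum(vis, freqs_hz):
    """Hedged sketch: delay power spectrum of one baseline within one spw.

    vis: complex array (Ntimes, Nfreqs) for a single simulation component;
    freqs_hz: the spw's frequencies in Hz.
    """
    taper = np.blackman(vis.shape[-1])                # taper to suppress leakage
    dnu = freqs_hz[1] - freqs_hz[0]
    vtilde = np.fft.fftshift(np.fft.fft(vis * taper, axis=-1), axes=-1) * dnu
    delays_ns = np.fft.fftshift(np.fft.fftfreq(vis.shape[-1], d=dnu)) * 1e9
    return delays_ns, np.mean(np.abs(vtilde) ** 2, axis=0)  # average over times
```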
steven-murray commented on 2020-04-16T18:58:58Z ----------------------------------------------------------------
Again, let's put in a line of text explaining why you're gonna do the <10% cut, before showing the plot.
steven-murray commented on 2020-04-16T18:58:59Z ----------------------------------------------------------------
Can you give some qualitative conclusions about the relative merits of the 10% cut and different baseline orientations? Is there any real difference, other than with the visibilities themselves?
jburba commented on 2020-04-24T16:21:22Z ----------------------------------------------------------------
Visually, there doesn't seem to be too much of a difference, but I think this is where fractional difference plots of the recovered power spectra would come in handy. We can plot the 10% cuts and different bl orientations on top of one another and see if there's any notable differences.
steven-murray commented on 2020-05-13T21:25:23Z ----------------------------------------------------------------
That could be useful! Could you maybe make just one of these?
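A hedged sketch of the fractional-difference overlay being proposed; the inputs are placeholders for two recovered power spectra (e.g. with and without the 10% cut, or two baseline orientations) on the same delay grid.

```python
import matplotlib.pyplot as plt

def plot_fractional_difference(delays_ns, ps_a, ps_b, label="10% cut vs no cut"):
    """Hedged sketch: plot the fractional difference of two recovered spectra."""
    frac_diff = (ps_a - ps_b) / ps_b
    fig, ax = plt.subplots()
    ax.plot(delays_ns, frac_diff, label=label)
    ax.axhline(0.0, color="k", lw=0.5)
    ax.set_xlabel("delay [ns]")
    ax.set_ylabel("fractional difference")
    ax.legend()
    return fig, ax
```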
steven-murray commented on 2020-04-16T18:59:00Z ----------------------------------------------------------------
Let's add the relevant LST range to the caption here, so we don't have to know which section we're in to understand it.
steven-murray commented on 2020-04-16T18:59:01Z ----------------------------------------------------------------
The power spectra in a couple of cases here turn over above zero. Is this expected?
jburba commented on 2020-04-27T19:59:21Z ----------------------------------------------------------------
I don't know, but I wouldn't think so. That would suggest that the monopole has been shifted? Or that the monopole is a cusp of sorts, and that shouldn't be the case either?
jburba commented on 2020-04-27T19:59:38Z ----------------------------------------------------------------
(this is Fig. 17 now that I added a figure in the data creation section)
steven-murray commented on 2020-05-07T20:29:24Z ----------------------------------------------------------------
It means (if everything is working as expected) that there is more frequency structure than we would have guessed on large scales. It may also be a smoothing thing?
It doesn't seem to be related to the test being performed here and therefore I don't think we should spend much time on it. However, I'd add it as a bullet point to the top section {Brief notes on anything else interesting that was noted during testing}
jburba commented on 2020-05-11T22:03:22Z ----------------------------------------------------------------
I'm still slightly confused about what you mean by "turnover." Are you referring to the reflection-like peaks around delays of 200-500 ns?
steven-murray commented on 2020-05-13T21:24:03Z ----------------------------------------------------------------
I'm talking about the first peak at around ~100 ns, only visible in some of the plots. Most have the power flattening towards tau = 0, but some have it turning over.
steven-murray commented on 2020-04-16T18:59:01Z ----------------------------------------------------------------
You're doing essentially exactly the same steps just with a different input file for each LST. Can you make the steps into functions at the top, and then just run them? That'll save on some vertical space.
jburba commented on 2020-04-29T22:15:43Z ----------------------------------------------------------------
Functionizing has been completed, I think.
steven-murray commented on 2020-04-30T13:50:23Z ----------------------------------------------------------------
Hmmm... I'm not seeing it?
steven-murray commented on 2020-05-07T20:29:41Z ----------------------------------------------------------------
This is so much better.
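As a rough illustration of the refactor discussed in this thread, a sketch of pulling the repeated per-LST steps into a single function and looping over input files; the function body, file names, and spw ranges are placeholders.

```python
def run_single_lst(data_file, spw_ranges, flag_threshold=0.10):
    """Hedged sketch: read one LST chunk, apply flags, in-paint, and return the
    recovered delay power spectra (body elided; same steps as the cells above)."""
    ...

lst_files = ["zen.lst_0.uvh5", "zen.lst_1.uvh5"]  # placeholder file names
results = {f: run_single_lst(f, spw_ranges=[(150, 350), (515, 695)])
           for f in lst_files}
```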
steven-murray commented on 2020-04-16T18:59:02Z ----------------------------------------------------------------
I think we want to say here (if it's true) that this section contains the actual results of this test in terms of whether it passes or not. This is where you're trying to do the same thing as the real pipeline.
jburba commented on 2020-04-29T22:23:34Z ----------------------------------------------------------------
Took a swing at this, let me know what you think.
steven-murray commented on 2020-04-30T13:52:58Z ----------------------------------------------------------------
Again, I'm not sure I'm seeing this?
jburba commented on 2020-05-11T19:28:27Z ----------------------------------------------------------------
You're not seeing the change? Or you're not seeing the point addressed in what I wrote? This is in reference to the written section under Section 3, "H1C_IDR2 and hera-pspec Analysis".
steven-murray commented on 2020-05-13T21:26:03Z ----------------------------------------------------------------
I think it's all good now. You just hadn't pushed the changes when I looked at it.
steven-murray commented on 2020-04-16T18:59:03Z ----------------------------------------------------------------
Is there any way to version those scripts?
jburba commented on 2020-04-29T22:24:13Z ----------------------------------------------------------------
Like a git hash for the repository? Or script specific versioning?
steven-murray commented on 2020-04-30T13:53:28Z ----------------------------------------------------------------
Yeah, maybe a githash for the repo.
jburba commented on 2020-05-11T19:30:48Z ----------------------------------------------------------------
Githash for the H1C_IDR2 repo added to the introduction with the other package hashes.
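For reference, a hedged sketch of one way to record a repository git hash in the notebook introduction; the repository path is a placeholder.

```python
import subprocess

# Hedged sketch: capture the current commit of the H1C_IDR2 checkout
# (path is a placeholder) and print it alongside the other package hashes.
githash = subprocess.check_output(
    ["git", "rev-parse", "HEAD"], cwd="/path/to/H1C_IDR2"
).decode().strip()
print("H1C_IDR2 git hash:", githash)
```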
steven-murray commented on 2020-04-16T18:59:04Z ----------------------------------------------------------------
There's clearly something weird going on in the in-painted versions of these, at pretty much every LST. Not so much for (515, 695), but even for (150, 350), which I wouldn't have expected. Is there some parameter that's very different in the IDR2 pipeline than what you're using in the notebook above?
jburba commented on 2020-04-29T22:25:37Z ----------------------------------------------------------------
I've redone the in-painting using the values that Nick says are currently being used (at least for now). The new ps waterfalls look much better.
jburba commented on 2020-04-29T22:41:50Z ----------------------------------------------------------------
There's still a shoulder, which I think is some sort of flagging sidelobe, in the ps waterfalls, but it only goes out to ~ the maximum delay used by the in-painting, so I'm pretty sure it's an artifact of the GSM CLEAN residuals (these are the dominant CLEAN residuals) or some other CLEAN residual.
steven-murray commented on 2020-04-30T13:56:04Z ----------------------------------------------------------------
I'm not sure I'm seeing this. I think we discussed on a telecon that it might be useful to overlay a grey transparent box over the delays you haven't even tried to clean (or just omit them from the plot), so that we don't get confused looking at it.
steven-murray commented on 2020-05-07T20:31:50Z ----------------------------------------------------------------
OK, grey box went the inverse way than I was expecting, but that also works :-).
It's much clearer now, and certainly the results are looking much better. In fact, when using the 10% threshold, the pspec results look better than the DFT ones.
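For concreteness, a hedged sketch of the overlay idea from this thread: shade the delay range covered (or not covered) by the in-painting filter so the untreated delays are visually obvious; the delay value and axis limits are placeholders.

```python
import matplotlib.pyplot as plt

max_delay_ns = 2000.0     # placeholder for the maximum delay used in the in-painting
fig, ax = plt.subplots()
# ... plot the delay power spectrum waterfall on ax here ...
ax.set_xlim(-4000, 4000)  # placeholder delay-axis limits [ns]
# Shade the delays the in-painting filter covers; invert the span to instead
# shade the delays that were never cleaned -- either convention works.
ax.axvspan(-max_delay_ns, max_delay_ns, color="grey", alpha=0.3)
```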
steven-murray commented on 2020-04-16T18:59:05Z ----------------------------------------------------------------
The in-painted pspecs here are obviously failing. Question is -- do we call it a fail and move on, or try to fix it?
jburba commented on 2020-04-29T22:42:18Z ----------------------------------------------------------------
The in-painting v2 ps are still a bit strange, but they're much better than before.
steven-murray commented on 2020-05-07T20:32:52Z ----------------------------------------------------------------
Yup, certainly much better.
steven-murray commented on 2020-04-16T18:59:05Z ----------------------------------------------------------------
I have a feeling we can't trust that the peak-normalised comparison is fair, since the range of k_para used in each case is different. Sometimes the peak for the FFT is at a k_para that pspec doesn't evaluate, which makes the comparison a bit bogus. However, even so, I don't see how these are going to line up at all. This is concerning.
jburba commented on 2020-04-29T22:43:08Z ----------------------------------------------------------------
I don't know how valuable this plot is for the test itself, but I think it's interesting. I think you're right though, that it can't necessarily be trusted.
steven-murray commented on 2020-04-30T13:58:05Z ----------------------------------------------------------------
I feel like we should dig into this, but not necessarily in this test. Perhaps one of the comments you can make in the top section for further followup is to check why these curves don't line up.
steven-murray commented on 2020-05-07T20:32:34Z ----------------------------------------------------------------
I'll re-iterate this -- it can be a point under the one about the turnover at low non-zero k.
jburba commented on 2020-05-11T21:48:05Z ----------------------------------------------------------------
Is it worth leaving this plot in this notebook? It doesn't necessarily pertain to the test here, but it's still weird. If we keep it in, I agree that it should be mentioned in the intro somewhere about weird observed behavior.
steven-murray commented on 2020-05-13T21:27:36Z ----------------------------------------------------------------
I think it can be left in. The rationale for including it is to explore why results with in-notebook analysis vs pspec are as different as they are. It doesn't need to be gone into detail though.
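One hedged way to make the peak-normalised comparison fairer, given the concern above about differing k_para ranges: interpolate both estimates onto their overlapping k range before normalising. The function below is a sketch with placeholder names, assuming the k arrays are sorted in increasing order.

```python
import numpy as np

def peak_normalized_on_common_grid(k_a, p_a, k_b, p_b, n=200):
    """Hedged sketch: put both spectra on the overlapping k range before
    peak-normalising, so the peaks being compared are sampled by both."""
    k_common = np.linspace(max(k_a.min(), k_b.min()), min(k_a.max(), k_b.max()), n)
    pa = np.interp(k_common, k_a, p_a)
    pb = np.interp(k_common, k_b, p_b)
    return k_common, pa / pa.max(), pb / pb.max()
```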
In addition to my comments on the notebook itself, we need to discuss what the final conditions for success are. As it stands, I'm inclined to say this fails for now (because of the discrepancy of pspec with FFT), but it seems silly to let such a big discrepancy go so easily. Perhaps we should dig into that a bit more.
As for conditions of success, I'd say something like the following:
"For in-painting over the full range of LSTs in IDR2.2, performed with the hera_pspec pipeline, in spectral-windows for which no fully-flagged channels exist within the central 50% of the window, and in which flag occupancy of the central 50% is not above X%, the residual of the recovered power to the true power should be within Y%"
I think that success statement is reasonable, but I think we also need to place some k-based constraint on that statement, e.g. at k = 0.1 inv Mpc or something like that, because the recovery is k dependent. Thoughts?
Oh yes, definitely need to add a statement about k in there. I'm going to throw some random numbers out there for X and Y: X=10%, Y = 20%. I guess k should be related to the min_delay parameter you set?
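To make the proposed pass/fail statement concrete, a hedged sketch of evaluating the residual part of the criterion at a single k, using the Y = 20% number floated above; the arrays and names are placeholders, not the notebook's.

```python
import numpy as np

def passes_criterion(k_para, p_true, p_recovered, k_eval=0.1, max_resid=0.20):
    """Hedged sketch: is |P_rec - P_true| / |P_true| <= max_resid at the k bin
    nearest k_eval (same inverse-Mpc units as k_para)?"""
    i = np.argmin(np.abs(k_para - k_eval))
    frac_resid = np.abs(p_recovered[i] - p_true[i]) / np.abs(p_true[i])
    return frac_resid <= max_resid
```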
I think the test 3.1 notebook is in decent shape; however, the criteria section at the top and the test results are still TBD. I'm not really sure how to put the end goal of this notebook into words. I know it wasn't to check that the EoR is recovered after in-painting; it was maybe more to show that the in-painting works at some level, but what that level is and how we demonstrate success I'm not sure about. The recovered power spectra after in-painting, with the IDR2.2 parameters at least, don't actually recover the "true" un-flagged power spectra. One major takeaway, which I think should go somewhere in this document, is that the performance of the in-painting is highly dependent upon the flags and spectral window(s). Maybe this test isn't so much of a "test," in the traditional sense of power spectrum recovery to within some percentage level, as it is a way to see and quantify the effects of in-painting inside the H1C analysis pipeline?