I am currently updating the notebook to include the flag retrieval and data subset creation. I'll push a new commit once I have these sections added. @steven-murray
steven-murray commented on 2020-04-16T18:58:51Z ----------------------------------------------------------------
Let's put this function definition down below the Intro meta stuff.
steven-murray commented on 2020-04-16T18:58:52Z ----------------------------------------------------------------
Major step description should be broader, something like "Test pipeline components related to RFI".
Minor variation is more along the lines of what you have for the major description. However, I'd try to be a bit more explicit here (make it dot-point!): in short dots, outline the whole process, something like:
Let's discuss criteria on the telecon.
jburba commented on 2020-04-23T20:06:03Z ----------------------------------------------------------------
Expanded the major and minor descriptions, but left the criteria untouched.
steven-murray commented on 2020-04-16T18:58:53Z ----------------------------------------------------------------
This needs to be filled out (has to be a pass/fail note here, as well as any general comments).
steven-murray commented on 2020-04-16T18:58:54Z ----------------------------------------------------------------
Where does `vis_clean` live?
jburba commented on 2020-04-23T17:33:41Z ----------------------------------------------------------------
`vis_clean` lives inside `hera_cal.VisClean.vis_clean`, and the `hera_cal.Delay_Filter` class is a child class of `VisClean`. I can spell that out more clearly in the document if need be.
steven-murray commented on 2020-04-30T13:41:08Z ----------------------------------------------------------------
All good, was just wondering if another version had to be printed.
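For reference, a minimal sketch (not from the notebook) of how the `vis_clean` / `VisClean` relationship could be checked at runtime; the import paths below are an assumption about the hera_cal module layout and may differ between versions.

```python
# Hedged sketch: confirm where vis_clean is defined and that the delay-filter
# class inherits from VisClean. Import paths are assumed, not taken from the
# notebook, and may vary with hera_cal version.
import inspect

from hera_cal.vis_clean import VisClean
from hera_cal.delay_filter import DelayFilter

print(inspect.getsourcefile(VisClean.vis_clean))  # file that defines vis_clean
print(DelayFilter.__mro__)                        # inheritance chain, includes VisClean
assert issubclass(DelayFilter, VisClean)
```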
steven-murray commented on 2020-04-16T18:58:55Z ----------------------------------------------------------------
Might want to comment here on how visibilities were downsampled to create the following files (could also point to a script that made the following files).
Also, it seems this text was written before you did the full analysis -- should be updated to reflect that you've done the full analysis as well (if you did... I'm not there yet ;-))
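To illustrate the kind of subsetting the comment asks to document, here is a hedged sketch using pyuvdata; the file names, channel range, baseline, and time cut are placeholders rather than the actual choices used to build the test files.

```python
import numpy as np
from pyuvdata import UVData

# Hedged sketch only: read the full simulation, keep one spectral window,
# one baseline, and a short stretch of times, then write out the subset.
uvd = UVData()
uvd.read("full_simulation.uvh5")                      # placeholder input file
uvd.select(
    freq_chans=np.arange(150, 350),                   # one spw (placeholder channels)
    bls=[(0, 1)],                                     # single baseline (placeholder)
    times=np.unique(uvd.time_array)[:60],             # short time range (placeholder)
)
uvd.write_uvh5("subset_for_test.uvh5", clobber=True)  # placeholder output file
```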
steven-murray commented on 2020-04-16T18:58:55Z ----------------------------------------------------------------
Let's decide on whether these are the final locations and stick to it.
steven-murray commented on 2020-04-16T18:58:56Z ----------------------------------------------------------------
I think this is fine for now -- the parameters are "based off" but not quite equal to the IDR2 analysis. But the IDR2 parameters seem to constantly change, so there's probably not too much point in trying to perfectly replicate them in this notebook anyway. I think we'll want to do a 3.1.1 when the final IDR2 parameters are chosen, and stick exactly to those parameters.
jburba commented on 2020-04-23T20:15:49Z ----------------------------------------------------------------
I agree that doing an analysis with parameters that are still being tweaked is tricky. But, would that mean calling this test failed, at least temporarily?
steven-murray commented on 2020-04-30T13:45:20Z ----------------------------------------------------------------
Not necessarily -- we could pass this test with current parameters, and then update parameters for a 3.1.1 test and also pass that.
steven-murray commented on 2020-04-16T18:58:57Z ----------------------------------------------------------------
You forgot to mention the cyan box in the caption.
jburba commented on 2020-04-23T20:16:57Z ----------------------------------------------------------------
Added!
steven-murray commented on 2020-04-16T18:58:58Z ----------------------------------------------------------------
Perhaps put in a short text explanation of what you're about to show before showing this plot (the text before it suggests that this will be of CLEANed components).
jburba commented on 2020-04-23T20:22:07Z ----------------------------------------------------------------
I think I've addressed this one with:
To get a sense of the overall level in the power spectrum of the various simulation components, the following cell produces a plot of the delay power spectra per spw of the individual simulation components _without any flags applied_. The red lines in the lower plots mark the delay power spectrum for the summed dataset and represent the "true" power spectrum that we are trying to recover by in-painting the gaps created by flags in the data. Of particular interest in the plots below are the regions in delay space where the EoR dominates the foreground components.
steven-murray commented on 2020-04-30T13:47:00Z ----------------------------------------------------------------
Nice, thanks!
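For concreteness, a minimal sketch of the kind of per-component delay power spectrum the quoted text describes (an FFT over frequency within one spw, with a taper, averaged over times); the function and variable names are placeholders, not the notebook's.

```python
import numpy as np

def delay_power_spectrum(vis, freqs_hz):
    """Hedged sketch: delay power spectrum of one baseline within one spw.

    vis: complex array (Ntimes, Nfreqs) for a single simulation component;
    freqs_hz: the spw's frequencies in Hz.
    """
    taper = np.blackman(vis.shape[-1])                # taper to suppress leakage
    dnu = freqs_hz[1] - freqs_hz[0]
    vtilde = np.fft.fftshift(np.fft.fft(vis * taper, axis=-1), axes=-1) * dnu
    delays_ns = np.fft.fftshift(np.fft.fftfreq(vis.shape[-1], d=dnu)) * 1e9
    return delays_ns, np.mean(np.abs(vtilde) ** 2, axis=0)  # average over times
```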
steven-murray commented on 2020-04-16T18:58:58Z ----------------------------------------------------------------
Again, let's put in a line of text explaining why you're gonna do the <10% cut, before showing the plot.
steven-murray commented on 2020-04-16T18:58:59Z ----------------------------------------------------------------
Can you give some qualitative conclusions about the relative merits of the 10% cut and different baseline orientations? Is there any real difference, other than with the visibilities themselves?
jburba commented on 2020-04-24T16:21:22Z ----------------------------------------------------------------
Visually, there doesn't seem to be too much of a difference, but I think this is where fractional difference plots of the recovered power spectra would come in handy. We can plot the 10% cuts and different bl orientations on top of one another and see if there's any notable differences.
steven-murray commented on 2020-05-13T21:25:23Z ----------------------------------------------------------------
That could be useful! Could you maybe make just one of these?
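A hedged sketch of the fractional-difference overlay being proposed; the inputs are placeholders for two recovered power spectra (e.g. with and without the 10% cut, or two baseline orientations) on the same delay grid.

```python
import matplotlib.pyplot as plt

def plot_fractional_difference(delays_ns, ps_a, ps_b, label="10% cut vs no cut"):
    """Hedged sketch: plot the fractional difference of two recovered spectra."""
    frac_diff = (ps_a - ps_b) / ps_b
    fig, ax = plt.subplots()
    ax.plot(delays_ns, frac_diff, label=label)
    ax.axhline(0.0, color="k", lw=0.5)
    ax.set_xlabel("delay [ns]")
    ax.set_ylabel("fractional difference")
    ax.legend()
    return fig, ax
```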
steven-murray commented on 2020-04-16T18:59:00Z ----------------------------------------------------------------
Let's add the relevant LST range to the caption here, so we don't have to know which section we're in to understand it.
steven-murray commented on 2020-04-16T18:59:01Z ----------------------------------------------------------------
The power spectra in a couple of cases here turn over above zero. Is this expected?
jburba commented on 2020-04-27T19:59:21Z ----------------------------------------------------------------
I don't know, but I wouldn't think so. That would suggest that the monopole has been shifted? Or that the monopole is a cusp of sorts, and that shouldn't be the case either?
jburba commented on 2020-04-27T19:59:38Z ----------------------------------------------------------------
(this is Fig. 17 now that I added a figure in the data creation section)
steven-murray commented on 2020-05-07T20:29:24Z ----------------------------------------------------------------
It means (if everything is working as expected) that there is more frequency structure than we would have guessed on large scales. It may also be a smoothing thing?
It doesn't seem to be related to the test being performed here and therefore I don't think we should spend much time on it. However, I'd add it as a bullet point to the top section {Brief notes on anything else interesting that was noted during testing}
jburba commented on 2020-05-11T22:03:22Z ----------------------------------------------------------------
I'm still slightly confused about what you mean by "turnover." Are you referring to the reflection-like peaks around delays of 200-500 ns?
steven-murray commented on 2020-05-13T21:24:03Z ----------------------------------------------------------------
I'm talking about the first peak at around ~100 ns, only visible in some of the plots. Most have the power flattening towards tau = 0, but some have it turning over.
steven-murray commented on 2020-04-16T18:59:01Z ----------------------------------------------------------------
You're doing essentially exactly the same steps just with a different input file for each LST. Can you make the steps into functions at the top, and then just run them? That'll save on some vertical space.
jburba commented on 2020-04-29T22:15:43Z ----------------------------------------------------------------
Functionizing has been completed, I think.
steven-murray commented on 2020-04-30T13:50:23Z ----------------------------------------------------------------
Hmmm... I'm not seeing it?
steven-murray commented on 2020-05-07T20:29:41Z ----------------------------------------------------------------
This is so much better.
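As a rough illustration of the refactor discussed in this thread, a sketch of pulling the repeated per-LST steps into a single function and looping over input files; the function body, file names, and spw ranges are placeholders.

```python
def run_single_lst(data_file, spw_ranges, flag_threshold=0.10):
    """Hedged sketch: read one LST chunk, apply flags, in-paint, and return the
    recovered delay power spectra (body elided; same steps as the cells above)."""
    ...

lst_files = ["zen.lst_0.uvh5", "zen.lst_1.uvh5"]  # placeholder file names
results = {f: run_single_lst(f, spw_ranges=[(150, 350), (515, 695)])
           for f in lst_files}
```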
steven-murray commented on 2020-04-16T18:59:02Z ----------------------------------------------------------------
I think we want to say here (if it's true) that this section contains the actual results of this test in terms of whether it passes or not. This is where you're trying to do the same thing as the real pipeline.
jburba commented on 2020-04-29T22:23:34Z ----------------------------------------------------------------
Took a swing at this, let me know what you think.
steven-murray commented on 2020-04-30T13:52:58Z ----------------------------------------------------------------
Again, I'm not sure I'm seeing this?
jburba commented on 2020-05-11T19:28:27Z ----------------------------------------------------------------
You're not seeing the change? Or you're not seeing the point addressed in what I wrote? This is in reference to the written section under Section 3, "H1C_IDR2 and hera-pspec Analysis".
steven-murray commented on 2020-05-13T21:26:03Z ----------------------------------------------------------------
I think it's all good now. You just hadn't pushed the changes when I looked at it.
steven-murray commented on 2020-04-16T18:59:03Z ----------------------------------------------------------------
Is there any way to version those scripts?
jburba commented on 2020-04-29T22:24:13Z ----------------------------------------------------------------
Like a git hash for the repository? Or script specific versioning?
steven-murray commented on 2020-04-30T13:53:28Z ----------------------------------------------------------------
Yeah, maybe a githash for the repo.
jburba commented on 2020-05-11T19:30:48Z ----------------------------------------------------------------
Githash for the H1C_IDR2 repo added to the introduction with the other package hashes.
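For reference, a hedged sketch of one way to record a repository git hash in the notebook introduction; the repository path is a placeholder.

```python
import subprocess

# Hedged sketch: capture the current commit of the H1C_IDR2 checkout
# (path is a placeholder) and print it alongside the other package hashes.
githash = subprocess.check_output(
    ["git", "rev-parse", "HEAD"], cwd="/path/to/H1C_IDR2"
).decode().strip()
print("H1C_IDR2 git hash:", githash)
```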
steven-murray commented on 2020-04-16T18:59:04Z ----------------------------------------------------------------
There's clearly something weird going on in the in-painted versions of these, at pretty much every LST. Not so much for (515, 695), but even for (150, 350), which I wouldn't have expected. Is there some parameter that's very different in the IDR2 pipeline than what you're using in the notebook above?
jburba commented on 2020-04-29T22:25:37Z ----------------------------------------------------------------
I've redone the in-painting using the values that Nick says are currently being used (at least for now). The new ps waterfalls look much better.
jburba commented on 2020-04-29T22:41:50Z ----------------------------------------------------------------
There's still a shoulder, which I think is some sort of flagging sidelobe, in the ps waterfalls, but it only goes out to ~ the maximum delay used by the in-painting, so I'm pretty sure it's an artifact of the GSM CLEAN residuals (these are the dominant CLEAN residuals) or some other CLEAN residual.
steven-murray commented on 2020-04-30T13:56:04Z ----------------------------------------------------------------
I'm not sure I'm seeing this. I think we discussed on a telecon that it might be useful to overlay a grey transparent box over the delays you haven't even tried to clean (or just omit them from the plot), so that we don't get confused looking at it.
steven-murray commented on 2020-05-07T20:31:50Z ----------------------------------------------------------------
OK, grey box went the inverse way than I was expecting, but that also works :-).
It's much clearer now, and certainly the results are looking much better. In fact, when using the 10% threshold, the pspec results look better than the DFT ones.
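For concreteness, a hedged sketch of the overlay idea from this thread: shade the delay range covered (or not covered) by the in-painting filter so the untreated delays are visually obvious; the delay value and axis limits are placeholders.

```python
import matplotlib.pyplot as plt

max_delay_ns = 2000.0     # placeholder for the maximum delay used in the in-painting
fig, ax = plt.subplots()
# ... plot the delay power spectrum waterfall on ax here ...
ax.set_xlim(-4000, 4000)  # placeholder delay-axis limits [ns]
# Shade the delays the in-painting filter covers; invert the span to instead
# shade the delays that were never cleaned -- either convention works.
ax.axvspan(-max_delay_ns, max_delay_ns, color="grey", alpha=0.3)
```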
steven-murray commented on 2020-04-16T18:59:05Z ----------------------------------------------------------------
The in-painted pspecs here are obviously failing. Question is -- do we call it a fail and move on, or try to fix it?
jburba commented on 2020-04-29T22:42:18Z ----------------------------------------------------------------
The in-painting v2 ps are still a bit strange, but they're much better than before.
steven-murray commented on 2020-05-07T20:32:52Z ----------------------------------------------------------------
Yup, certainly much better.
steven-murray commented on 2020-04-16T18:59:05Z ----------------------------------------------------------------
I have a feeling we can't trust that the peak-normalised comparison is fair, since the range of k_para used in each case is different. Sometimes the peak for the FFT is at a k_para that pspec doesn't evaluate, which makes the comparison a bit bogus. However, even so, I don't see how these are going to line up at all. This is concerning.
jburba commented on 2020-04-29T22:43:08Z ----------------------------------------------------------------
I don't know how valuable this plot is for the test itself, but I think it's interesting. I think you're right though, that it can't necessarily be trusted.
steven-murray commented on 2020-04-30T13:58:05Z ----------------------------------------------------------------
I feel like we should dig into this, but not necessarily in this test. Perhaps one of the comments you can make in the top section for further followup is to check why these curves don't line up.
steven-murray commented on 2020-05-07T20:32:34Z ----------------------------------------------------------------
I'll re-iterate this -- it can be a point under the one about the turnover at low non-zero k.
jburba commented on 2020-05-11T21:48:05Z ----------------------------------------------------------------
Is it worth leaving this plot in this notebook? It doesn't necessarily pertain to the test here, but it's still weird. If we keep it in, I agree that it should be mentioned in the intro somewhere about weird observed behavior.
steven-murray commented on 2020-05-13T21:27:36Z ----------------------------------------------------------------
I think it can be left in. The rationale for including it is to explore why results with in-notebook analysis vs pspec are as different as they are. It doesn't need to be gone into detail though.
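One hedged way to make the peak-normalised comparison fairer, given the concern above about differing k_para ranges: interpolate both estimates onto their overlapping k range before normalising. The function below is a sketch with placeholder names, assuming the k arrays are sorted in increasing order.

```python
import numpy as np

def peak_normalized_on_common_grid(k_a, p_a, k_b, p_b, n=200):
    """Hedged sketch: put both spectra on the overlapping k range before
    peak-normalising, so the peaks being compared are sampled by both."""
    k_common = np.linspace(max(k_a.min(), k_b.min()), min(k_a.max(), k_b.max()), n)
    pa = np.interp(k_common, k_a, p_a)
    pb = np.interp(k_common, k_b, p_b)
    return k_common, pa / pa.max(), pb / pb.max()
```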
In addition to my comments on the notebook itself, we need to discuss what the final conditions for success are. As it stands, I'm inclined to say this fails for now (because of the discrepancy of pspec with FFT), but it seems silly to let such a big discrepancy go so easily. Perhaps we should dig into that a bit more.
As for conditions of success, I'd say something like the following:
"For in-painting over the full range of LSTs in IDR2.2, performed with the hera_pspec pipeline, in spectral-windows for which no fully-flagged channels exist within the central 50% of the window, and in which flag occupancy of the central 50% is not above X%, the residual of the recovered power to the true power should be within Y%"
I think that success statement is reasonable, but I think we also need to place some k-based constraint on that statement, e.g. at k = 0.1 inv Mpc or something like that, because the recovery is k dependent. Thoughts?
Oh yes, definitely need to add a statement about k in there. I'm going to throw some random numbers out there for X and Y: X=10%, Y = 20%. I guess k should be related to the min_delay parameter you set?
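To make the proposed pass/fail statement concrete, a hedged sketch of evaluating the residual part of the criterion at a single k, using the Y = 20% number floated above; the arrays and names are placeholders, not the notebook's.

```python
import numpy as np

def passes_criterion(k_para, p_true, p_recovered, k_eval=0.1, max_resid=0.20):
    """Hedged sketch: is |P_rec - P_true| / |P_true| <= max_resid at the k bin
    nearest k_eval (same inverse-Mpc units as k_para)?"""
    i = np.argmin(np.abs(k_para - k_eval))
    frac_resid = np.abs(p_recovered[i] - p_true[i]) / np.abs(p_true[i])
    return frac_resid <= max_resid
```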
I think the test 3.1 notebook is in decent shape; however, the criteria section at the top and the test results are still TBD. I'm not really sure how to put the end goal of this notebook into words. I know it wasn't to check that the EoR is recovered after in-painting; it was maybe more to show that the in-painting works at some level, but what that level is and how we demonstrate success I'm not sure about. The recovered power spectra after in-painting, with the IDR2.2 parameters at least, don't actually recover the "true" un-flagged power spectra. One major takeaway, which I think should go somewhere in this document, is that the performance of the in-painting is highly dependent upon the flags and spectral window(s). Maybe this test isn't so much of a "test," in the traditional sense of power spectrum recovery to within some percentage level, as it is a way to see and quantify the effects of in-painting inside the H1C analysis pipeline?