HERA-Team / hera_qm

HERA Data Quality Metrics
MIT License

Make watershed smarter/more flexible #253

Open adampbeardsley opened 5 years ago

adampbeardsley commented 5 years ago

In prepping for IDR2.2, I've noticed quite a few situations like this:

[image]

There is a clear streak of contamination, but the channels right next to existing flags are below our "adjacent" threshold, while some beyond are above. We could simply have the watershed look farther out, say three pixels. Paul suggested a quick test, which seems promising:

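import itertools
import numpy as np

# r is the search radius in pixels; build axis-aligned offsets, excluding 0
radii = np.arange(-r, r + 1).tolist()
radii.remove(0)
for dx, dy in itertools.chain(itertools.product([0], radii), itertools.product(radii, [0])):
    <body of loop as is>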

Results:

[image]

It's in the right direction, but I think ultimately there needs to be significant tuning. Alternatively, I suspect Mike W's matched filter would catch this stuff pretty easily.
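For anyone who wants to try the idea outside of hera_qm, here is a minimal, self-contained sketch of a watershed that looks out to a radius r along each axis. The function name `watershed_radius`, the `nsig` threshold argument, and the toy arrays are all assumptions made for illustration; this is not the hera_qm implementation, only the offset loop from the snippet above wrapped in a simple grow-until-stable loop.

```python
import itertools
import numpy as np

def watershed_radius(metric, flags, nsig=2.0, r=1):
    """Hypothetical sketch: grow flags onto pixels that lie within r pixels
    of an existing flag along either axis and whose metric is >= nsig."""
    flags = flags.copy()
    radii = np.arange(-r, r + 1).tolist()
    radii.remove(0)
    # Axis-aligned offsets only (a plus-shaped neighborhood), as in the test above.
    offsets = list(itertools.chain(itertools.product([0], radii),
                                   itertools.product(radii, [0])))
    n0, n1 = flags.shape
    while True:
        new = np.zeros_like(flags)
        for x, y in zip(*np.where(flags)):
            for dx, dy in offsets:
                i, j = x + dx, y + dy
                inside = 0 <= i < n0 and 0 <= j < n1
                if inside and not flags[i, j] and metric[i, j] >= nsig:
                    new[i, j] = True
        if not new.any():
            return flags  # nothing grew this pass, so the flags are stable
        flags |= new

# Toy waterfall: one time sample, six channels. The channel next to the
# seed flag is below threshold, but the two after it are above.
metric = np.array([[6.0, 1.0, 3.0, 3.0, 1.0, 0.0]])
seed = np.array([[True, False, False, False, False, False]])
near = watershed_radius(metric, seed, nsig=2.0, r=1)
far = watershed_radius(metric, seed, nsig=2.0, r=3)
```

With r=1 the seed cannot grow past the quiet channel next to it, while r=3 reaches over it. Note that the skipped channel itself stays unflagged, which is part of why the radius alone probably needs tuning.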

jsdillon commented 5 years ago

What radius did you use there?

Also, perhaps we want to consider larger radii in frequency than in time. And maybe we want some kind of "filling in" where, if there are fewer than N bins between two flags in some dimension, we fill in flags.
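To make the "filling in" idea concrete, here is a rough sketch under stated assumptions: `fill_flag_gaps` and its helper are hypothetical names, `n` plays the role of N above, and flags are taken to be a 2D boolean array with time on axis 0 and frequency on axis 1. It only illustrates the rule "flag any run of fewer than N unflagged bins sitting between two flags", and nothing like it is claimed to exist in hera_qm.

```python
import numpy as np

def _fill_1d(row, n):
    """Fill unflagged runs shorter than n that are bounded by flags on both sides."""
    row = row.copy()
    idx = np.flatnonzero(row)  # positions of existing flags along this cut
    for lo, hi in zip(idx[:-1], idx[1:]):
        gap = hi - lo - 1  # number of unflagged bins between two flags
        if 0 < gap < n:
            row[lo + 1:hi] = True
    return row

def fill_flag_gaps(flags, n, axis=1):
    """Hypothetical helper: apply the gap-filling rule along one axis
    (axis=1 is frequency under the layout assumed here)."""
    return np.apply_along_axis(_fill_1d, axis, flags, n)

flags = np.array([[True, False, False, True, False, False, False, False, True]])
filled = fill_flag_gaps(flags, n=3, axis=1)
```

Here the two-bin gap is filled and the four-bin gap is left alone. Calling it once per axis with different values of n would give different behavior in time and frequency.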

adampbeardsley commented 5 years ago

This was a radius of 3 pixels.

Yeah, it would be interesting to set time/freq radii separately. And we talked a little about the filling-in idea on Slack. We agreed this would take some refactoring... and is likely a future project (hence the issue for now).

adampbeardsley commented 5 years ago

Note the example given actually isn't the best because it's trying to use the narrowband RFI to grab the more broadband feature. There are other cases (TODO: find them) where a pixel in the feature itself gets flagged, but we do not successfully flag the whole thing.

adampbeardsley commented 5 years ago

After @jsdillon's rewrite of the watershed (#296), I think this will be much more approachable.

jsdillon commented 5 years ago

Yeah, you just have to play around with changing the kernel.
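As a rough illustration of what playing with the kernel could look like, here is a sketch that assumes the watershed grows flags by checking a neighbor kernel around each pixel. That is an assumption about the general approach, not a copy of the code from #296. The names `cross_kernel`, `neighbor_flagged`, and `watershed_kernel` are hypothetical, and the waterfall is assumed to have time on axis 0 and frequency on axis 1.

```python
import numpy as np

def cross_kernel(rt, rf):
    """Plus-shaped kernel reaching rt pixels in time and rf pixels in frequency."""
    k = np.zeros((2 * rt + 1, 2 * rf + 1), dtype=int)
    k[rt, :] = 1
    k[:, rf] = 1
    k[rt, rf] = 0  # the center pixel is not its own neighbor
    return k

def neighbor_flagged(flags, kernel):
    """True where a nonzero kernel entry, centered on the pixel, covers a flag.
    This is a correlation; for symmetric kernels it equals a convolution."""
    pt, pf = kernel.shape[0] // 2, kernel.shape[1] // 2  # odd kernel sizes assumed
    padded = np.pad(flags, ((pt, pt), (pf, pf)), mode="constant",
                    constant_values=False)
    out = np.zeros_like(flags)
    for i, j in zip(*np.nonzero(kernel)):
        out |= padded[i:i + flags.shape[0], j:j + flags.shape[1]]
    return out

def watershed_kernel(metric, flags, kernel, nsig=2.0):
    """Hypothetical sketch: repeat the kernel-based growth until flags stop changing."""
    flags = flags.copy()
    while True:
        grow = neighbor_flagged(flags, kernel) & (metric >= nsig) & ~flags
        if not grow.any():
            return flags
        flags |= grow

metric = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                   [6.0, 1.0, 3.0, 3.0, 1.0, 0.0],
                   [0.0, 0.0, 3.0, 0.0, 0.0, 0.0]])
seed = np.zeros(metric.shape, dtype=bool)
seed[1, 0] = True
narrow = watershed_kernel(metric, seed, cross_kernel(1, 1))
wide = watershed_kernel(metric, seed, cross_kernel(1, 3))
```

With the nearest-neighbor kernel the seed does not grow. With a kernel that reaches three channels in frequency but only one sample in time, it picks up the rest of the feature. Separate time/freq radii then amount to choosing the kernel shape.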