astropy / astroscrappy

Speedy Cosmic Ray Annihilation Package in Python
70 stars 34 forks source link

WIP: Modify median filters to handle edges better #38

Open jehturner opened 6 years ago

jehturner commented 6 years ago

As per #12, here's a version of PyMedFilt5 that avoids edge residuals, behaving the same as IRAF's median with its default boundary="nearest" and ndimage.median_filter with the equivalent option.

It would be far cleaner to factor this into its own function that each median filter can call, but I notice that you haven't done that elsewhere, presumably for optimization reasons? It would save quite a lot of repetition here though -- what do you think? Also, would you prefer the edge filtering to be optional, since you say you have encountered artifacts associated with filtering at the edges? I'd suggest giving this method a try first, since I don't see obvious problems with it (even if the theory is a bit questionable).

I should perhaps move all the for loops into a single block and avoid re-allocating medarr...

jehturner commented 6 years ago

As a quick performance check, timeit shows that median filtering a 4176x512 pixel image 100 times takes ~12s on my 6-core workstation with both the old and new versions of PyMedFilt5 (for some reason the exact time varies by about 0.5s, but the difference is within the noise). This isn't really a surprise, since the behaviour only differs for 4 rows/columns out of 4176 or 512.

jehturner commented 6 years ago

Here's a version that reimplements the 5x5 median filter using an NxN median filter. From a quick check, it seems to be about 8% slower, presumably because the compiler can't optimize the inner loops as per your comment in the existing code -- but maybe it's useful to have an NxN filter anyway and I'll push it so you can have a look. Would you rather keep the triplicated cut-and-paste versions for performance? It's probably going to be quite fiddly to abstract them without some run-time penalty (eg. using the pre-processor or generated code?).

coveralls commented 6 years ago

Coverage Status

Coverage remained the same at 100.0% when pulling 4876fc2728574d3db7776770924396faf0d84d70 on jehturner:issue_12 into 3ed58dd537e40efed983ca049602af6e3e9f5ce7 on astropy:master.

jehturner commented 6 years ago

This last update made similar changes to clean_medmask as for median filtering, so the edges aren't ignored there either. Since cleaning is part of the iterative process, this also slightly improves the detection of CRs at the edges and gives results close to lacos_spec in IRAF (still not quite identical).

Edge pixels are also being ignored in the Laplace and dilation/growth filtering, but those don't seem to make such a difference to the end result when changed. I think the current Laplace filter is arguably incorrect at the edges of the array, because the central value of 4 should be adjusted to 3 or 2 to account for the number of pixels being subtracted from it, otherwise the gradient measurement is dominated by superimposed signal in those rows/columns. As I say, however, it's the median filtering and cleaning that seem to make most of the difference to CR residuals at edges.