Open jehturner opened 6 years ago
As a quick performance check, timeit
shows that median filtering a 4176x512 pixel image 100 times takes ~12s on my 6-core workstation with both the old and new versions of PyMedFilt5
(for some reason the exact time varies by about 0.5s, but the difference is within the noise). This isn't really a surprise, since the behaviour only differs for 4 rows/columns out of 4176 or 512.
Here's a version that reimplements the 5x5 median filter using an NxN median filter. From a quick check, it seems to be about 8% slower, presumably because the compiler can't optimize the inner loops as per your comment in the existing code -- but maybe it's useful to have an NxN filter anyway and I'll push it so you can have a look. Would you rather keep the triplicated cut-and-paste versions for performance? It's probably going to be quite fiddly to abstract them without some run-time penalty (eg. using the pre-processor or generated code?).
This last update made similar changes to clean_medmask
as for median filtering, so the edges aren't ignored there either. Since cleaning is part of the iterative process, this also slightly improves the detection of CRs at the edges and gives results close to lacos_spec
in IRAF (still not quite identical).
Edge pixels are also being ignored in the Laplace and dilation/growth filtering, but those don't seem to make such a difference to the end result when changed. I think the current Laplace filter is arguably incorrect at the edges of the array, because the central value of 4 should be adjusted to 3 or 2 to account for the number of pixels being subtracted from it, otherwise the gradient measurement is dominated by superimposed signal in those rows/columns. As I say, however, it's the median filtering and cleaning that seem to make most of the difference to CR residuals at edges.
As per #12, here's a version of
PyMedFilt5
that avoids edge residuals, behaving the same as IRAF'smedian
with its defaultboundary="nearest"
andndimage.median_filter
with the equivalent option.It would be far cleaner to factor this into its own function that each median filter can call, but I notice that you haven't done that elsewhere, presumably for optimization reasons? It would save quite a lot of repetition here though -- what do you think? Also, would you prefer the edge filtering to be optional, since you say you have encountered artifacts associated with filtering at the edges? I'd suggest giving this method a try first, since I don't see obvious problems with it (even if the theory is a bit questionable).
I should perhaps move all the for loops into a single block and avoid re-allocating
medarr
...