Closed jehturner closed 3 years ago
Regarding this:
- You'd like to get rid of
pssl
and mybgsub
and replace both of them withbkg
, which can accept either a fixed background level (likepssl
) or an array of model values to be subtracted (likebgsub
, mainly for spectroscopy).
There's actually no reason to support subtracting a constant background internally, is there? You were probably just proposing to get rid of pssl
altogether for imaging and require the user to supply indat
with its background still included, unless also passing a noise image.
Does this look like what you had in mind, @cmccully? I have not added a noise argument yet, but this removes the pssl
argument -- instead expecting indat
to have its sky background included -- and renames my bgsub
to bkg
, an optional array that gets subtracted temporarily from the data during detection (I don't see an obvious reason to subtract a constant for imaging but one could).
For completeness, I've had a go at adding a var
parameter that accepts a user-supplied variance array, though perhaps it should go its own PR. Anyway, on reflection, it may be less useful than I had been thinking, given the following considerations:
If bkg
contains all the signal that has been subtracted prior to running astroscrappy
then there isn't really any additional information about the noise in var
.
Like the main data array, I think var
needs its own median filtering and cleaning, to determine what the noise would be in the absence of CRs (which is obviously a bit more processing).
If (unlike the main data array) var
is filtered without subtracting & restoring bkg
, I think that can lead to less accurate noise structure by blurring out fine background details (such as sky lines or IFU fibres).
In the case I've tested so far, there was very little difference in the cleaned result between using var
or just indat
, with only ~0.01% of pixels having significant differences. Those cleaned pixels appear similar but a little noisier when using var
than they do with the original noise algorithm.
I'll push what I did so you can see it. We can always revert that commit if needed...
I'm not returning a noise image here, but I suppose cleanvar
could be added to the return tuple? I tend to re-clean my data after running LA Cosmic anyway, since I find it does a better job of identifying the CRs than removing them (eg. compared with local interpolation), so crmask
seems like the most useful return value.
Rebased as requested.
Not sure what is going on with the first, Python 2.7 build test? It seems to install Python 3.7 with Conda and then fail, but I don't think it's relevant to this PR.
Before answering your detailed suggestions (thanks), I'd like to point out that I was leaning towards reverting my last commit (see comment of 16 April). Input variance was something we discussed on Slack that seemed like a good idea at the time, but once I actually implemented var
, I realized that it doesn't work as well as the original LACosmic noise estimate, for 2 reasons: First, var
has no equivalent of the bkg
image for indat
, so continuum/sky structure gets left in the variance when filtering for cosmic rays, producing noise estimates that are less, rather than more, accurate. Second, bkg
should help ensure that all the signal gets accounted for anyway when producing LACosmic's usual noise estimate, because it includes any "missing" flux that a separate variance array would otherwise tell you about. So I would probably just get rid of var
to avoid overcomplicating things, but obviously if you want to keep it then we can. Admittedly, I'm mostly interested in spectroscopy, where bkg
is going to be essential, whereas for imaging bkg
is more likely to be a scalar and then you might get slightly more accurate noise structure using a var
image as well (off the top of my head). Does that make sense?
In our meeting earlier this week, @cmccully mentioned that he's keen to retain variance input as an option because he has a use case (I think for echelle data) where prior processing produces a non-trivial correspondence betweeen indat
& var
. I've therefore had another look at what's needed to address the limitations I mentioned above on 16 April 2018, but I've had quite a hard time remembering what I was doing here after 3 years...
The good news is that I think the code here already does most of the duplicate processing necessary to use input variance accurately and is "only" missing subtraction/restoration of the background structure (as for indat
).
The problem is simply how to go about subtracting the background from the variance. Unless the only processing done prior to cosmic ray removal is bias subtraction and maybe stacking (in which case you don't really need var
, because the original algorithm can derive the same thing from indat
), the levels of background structure in var
and indat
may differ, even if just by a scaling factor or similar. AstroScrappy doesn't know anything about that, so the caller would be required to provide, say, bkg
and varbkg
separately, which is already getting a bit messy. Moreover, at present the run time for spectroscopy is already dominated by background fitting rather than by detect_cosmics
itself, so if the calling function (equivalent to lacos_spec.cl
) has to fit the sky+continuum for both indat
& var
separately, that's quite a large overhead. Alternatively, the caller might know enough about previous processing to convert bkg
into varbkg
directly, but that might sacrifice generality and/or produce a convoluted API -- I think it would be good to understand your use case a bit better here. And if the caller doesn't get this exactly right, things are liable to go (at least slightly) wrong without it being obvious.
So what are your thoughts on how exactly var
would be used? I think I'm fairly clear on how detect_cosmics
would need to be modified if we know how we want it to look.
Can you rebase this into master @jehturner ?
Rebase from master
? OK.
That was suspiciously easy... Anyway, the diff is just 1 file now.
Merging #32 into master will decrease coverage by
1.93%
. The diff coverage is47.50%
.
@@ Coverage Diff @@
## master #32 +/- ##
==========================================
- Coverage 94.88% 92.94% -1.94%
==========================================
Files 7 7
Lines 1016 1049 +33
Branches 53 53
==========================================
+ Hits 964 975 +11
- Misses 52 74 +22
Impacted Files | Coverage Δ | |
---|---|---|
astroscrappy/astroscrappy.pyx | 69.29% <47.50%> (-5.83%) |
:arrow_down: |
astroscrappy/utils/medutils.c | 100.00% <0.00%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update fd4acfe...0d45f31. Read the comment docs.
This is the same PR as https://github.com/cmccully/lacosmicx/pull/1, rebased on astroscrappy, so we can close that old one. It doesn't yet implement the intended behaviour summarized in that PR, so do not merge yet.