Master background for IFU data needs to cover same wavelength range as the science data

stscijgbot-jp commented 9 months ago

Issue JP-3523 was created on JIRA by Jane Morrison:

PID 1717 showed a problem with master background subtraction. In this data set the background observation failed, which is the underlying reason the data failed to process. However, the master background step should be "smart" enough to know that the background did not contain the full wavelength range of the science data. In this particular case the master background did not contain CH 1 A or CH 1 B, while the science data did contain these bands. When the master background was subtracted for these bands, NaNs were subtracted, resulting in no valid data on the detector when cube_build later ran. A plot of the master background used on the data is attached to this ticket.

An image of the data after subtracting this master background is also attached to this ticket, with the NaN regions plotted in green.

This ticket has been opened to discuss what should be done for the master background step logic when the full wavelength range in the science data is not covered in the background images.

stscijgbot-jp commented 9 months ago

Comment by David Law on JIRA:

My feeling is that a good default solution would be for the background to be zero in such cases. I.e., master background subtraction does as good a job as it can, but neither makes the data useless (with NaN results) nor extrapolates (dangerous). This will result in step-function results, but in this case that's a clear warning in the data that something is up.
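As a rough illustration of the "zero outside the covered range" idea (a minimal sketch only, not the pipeline code; the array names and values below are made up for the example), `np.interp` can be told to return zero instead of extrapolating:

```python
import numpy as np

# Hypothetical 1-D master background covering only 10-12 microns.
tab_wavelength = np.array([10.0, 11.0, 12.0])
tab_background = np.array([2.0, 4.0, 6.0])

# Hypothetical science pixel wavelengths; the first and last fall
# outside the range covered by the background.
sci_wavelength = np.array([5.0, 10.5, 11.5, 15.0])

# left/right=0.0 makes the background zero outside the covered range,
# rather than NaN or an extrapolated (or edge-clamped) value.
bkg = np.interp(sci_wavelength, tab_wavelength, tab_background,
                left=0.0, right=0.0)
print(bkg)
```

Pixels outside the covered range then keep their original values after subtraction, which is what produces the step function in the extracted spectrum.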

stscijgbot-jp commented 7 months ago

Comment by Jane Morrison on JIRA:

David Law I am starting work on this ticket this sprint. If the master background is set to 0 and we do not have a background (for whatever reason) we will get a step function. We can add a warning that the background did not cover the full range (but those are often overlooked). So the extracted spectra will have a step function. I still think people are not going to understand why that is. Are you sure a step function is better than not subtracting a background if it does not cover the wavelength range of the data?

I just want to double check before I code something

stscijgbot-jp commented 6 months ago

Comment by Jane Morrison on JIRA:

David Law Starting work on this ticket. I assume you think the step function background is acceptable. If you have thought of new ideas post them in the ticket.

stscijgbot-jp commented 6 months ago

Comment by David Law on JIRA:

Jane Morrison Just coming back to this, triggered by a calibration program that we're in the middle of putting together which will also trigger this crash. (Cal program to monitor the background with MRS+IMA, and will likely have all three bands of 'SCI' observations but only one 'BG' observation, even though they're all strictly backgrounds...)

This should be a really rare case (APT errors if trying to do this), so I think the step function background is ok. There should already be a MAST version with no background subtraction as there are both observation-based and asn-based versions of the reductions. If anyone gets confused they can file a helpdesk ticket.

stscijgbot-jp commented 5 months ago

Comment by Jane Morrison on JIRA:

Howard Bond David Law

This is proving trickier than I thought to resolve. Program 1717 was where we first noticed this problem. It is a good data set because the science data covers the entire MRS wavelength range but the background is only for band LONG (both IFUSHORT and IFULONG). The code is designed to test the min and max of the background and science data. If the science data is outside the min, max then the DQ flag is set to DO_NOT_USE. The really tricky part is when the science data falls between the min and max of the master background but in a region where NO backgrounds were observed. I think this is a MIRI-unique problem because of the 2 channels on each detector, which cover different parts of the wavelength range. For example, in the final master background (obtained by saving the background in the master background step), the wavelength column has sets of values corresponding to the wavelengths for 1C, then 2C, 3C, 4C.

The main work of this ticket is for the case where the science data has a wavelength between the min and max of the background but falls in a band that is not covered by the master background. There the background is interpolated across the gap, which is a huge interpolation. That is currently being done in the pipeline without informing the user that there is no background for this region and that only an interpolation between band C in different channels is being done.
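To make the gap problem concrete, here is a small sketch with made-up numbers (not the actual step code; the wavelengths, fluxes, and the `DO_NOT_USE = 1` bit value are assumptions for illustration). A min/max test catches pixels outside the overall range but not a pixel that sits in an uncovered gap between two background segments:

```python
import numpy as np

DO_NOT_USE = 1  # assumed bit value, for illustration only

# Hypothetical master background with two disjoint segments,
# standing in for two "C" bands in different channels.
bkg_wave = np.array([6.6, 7.0, 7.6, 10.0, 10.8, 11.6])
bkg_flux = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])

# Hypothetical science wavelengths: below the range, inside the first
# segment, inside the uncovered gap, and inside the second segment.
sci_wave = np.array([5.0, 7.0, 8.8, 11.0])

# The min/max test only flags the pixel at 5.0 microns.
outside = (sci_wave < bkg_wave.min()) | (sci_wave > bkg_wave.max())
dq = np.where(outside, DO_NOT_USE, 0)

# Plain interpolation silently bridges the gap: the pixel at 8.8 microns
# gets a value halfway between the two segments, with no flag set.
interp_bkg = np.interp(sci_wave, bkg_wave, bkg_flux)
print(dq, interp_bkg)
```

The pixel at 8.8 microns is unflagged but receives a background that was never observed.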

To be consistent with how other steps work, we should set the pixels to DO_NOT_USE if there is no background and we are trying to background subtract. It would be nice to add another flag to the DQ flags that says why these pixels are DO_NOT_USE (like NO_BACKGROUND). If the user wants to recover the bands that do not have background, they could run the pipeline with background subtraction turned off, but I do not think we should provide a mix of background-subtracted and non-background-subtracted data. That would be different from how other steps work. It should be left to the user to not apply the background, and if they want to merge the different types of extracted data they could do that offline. I will be posting a PR soon showing my first attempt at what to do. The attached screenshot shows what the background looks like (there is only data in the 3 LONG bands). The plotting for the regions where there is no background gives an idea of what the interpolation is currently doing.

stscijgbot-jp commented 5 months ago

Comment by Jane Morrison on JIRA:

Howard Bushouse David Law

I wanted to find out when the data we are trying to background correct is not covered by the master background. I pulled out the left channel, found the wavelength min and max, and then tested whether that was in the master background. I did the same thing for the right side. OK, that failed miserably because of the overlap in wavelengths between the bands. So even though the master background only has band = LONG, there is a small overlap of channel 1 B and channel 1 C, and of channel 2 A and channel 1 C, and on and on. Sort of forgot about that.

So my plan now: when the master background is created, I should read in the bands there and carry that information through to where the data is read in (expand_to_2d.py), then read in the science data and check whether the master background covers that band.

Sound reasonable, or am I making this too complex?
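A minimal sketch of what a band-based coverage check could look like, assuming bands are tracked as (channel, band) pairs. The helper name `uncovered_bands` and this representation are hypothetical, not what the pipeline does:

```python
def uncovered_bands(science_bands, background_bands):
    """Return the science (channel, band) pairs with no matching background."""
    return sorted(set(science_bands) - set(background_bands))


# Hypothetical case like program 1717: background only for band LONG.
background_bands = {(ch, "LONG") for ch in "1234"}

# Science data covering all channels and bands.
science_bands = {(ch, band)
                 for ch in "1234"
                 for band in ("SHORT", "MEDIUM", "LONG")}

missing = uncovered_bands(science_bands, background_bands)
print(missing)
```

Comparing band labels rather than wavelength limits sidesteps the overlap between adjacent bands, at the cost of having to carry the band list along with the master background.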

stscijgbot-jp commented 5 months ago

Comment by David Law on JIRA:

Hm, it seems like there should be a much simpler way to deal with this than having to worry about all of the complex min/max overlap possibilities. If master background is passed data from only one band, the 1d background built from it will have discontinuous sections. E.g., wavelengths/fluxes for 1C/2C/3C/4C. The 1d vector created by master background (i.e., tab_wavelength and tab_background) doesn't do any interpolation, but when that vector is mapped back to the detector plane for the input science images it does interpolate to the actual wavelength values of the input pixels.

One approach that might work would be a two-pass thing around lines 270-314 of expand_to_2d.py (to keep it to just the IFUs affected, or even just MRS). First interpolate the 1d master background vector to the wavelengths of the 2d science data like we're doing now. Then look for all input science pixels whose wavelengths are more than (say) 0.1 microns away from the closest wavelength in the 1d master background vector, and set the 2d background for those pixels to zero. It'll be a weird jumpy background, but given that this is an edge case of when things go wrong I think that's ok.
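The two-pass idea could be sketched roughly as follows. This is an illustration under assumptions, not the code in expand_to_2d.py: the function name `expand_with_gap_mask` is made up, the 1-D wavelength vector is assumed to be sorted in ascending order, and the example values are invented.

```python
import numpy as np


def expand_with_gap_mask(sci_wave, tab_wavelength, tab_background, max_gap=0.1):
    """Interpolate a 1-D background to science wavelengths, then zero out
    pixels farther than max_gap (microns) from any tabulated wavelength.

    Assumes tab_wavelength is sorted in ascending order.
    """
    # Pass 1: interpolate as usual, with zero outside the overall range.
    bkg = np.interp(sci_wave, tab_wavelength, tab_background,
                    left=0.0, right=0.0)

    # Pass 2: distance from each science wavelength to the nearest
    # tabulated wavelength (neighbor below and neighbor above).
    idx = np.searchsorted(tab_wavelength, sci_wave)
    lo = np.clip(idx - 1, 0, tab_wavelength.size - 1)
    hi = np.clip(idx, 0, tab_wavelength.size - 1)
    dist = np.minimum(np.abs(sci_wave - tab_wavelength[lo]),
                      np.abs(sci_wave - tab_wavelength[hi]))

    # Pixels in an uncovered gap get zero background.
    bkg[dist > max_gap] = 0.0
    return bkg


# Hypothetical background with two densely sampled, disjoint segments.
tab_wavelength = np.array([7.00, 7.05, 7.10, 11.00, 11.05, 11.10])
tab_background = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
sci_wave = np.array([7.02, 7.5, 9.0, 11.04, 12.0])

result = expand_with_gap_mask(sci_wave, tab_wavelength, tab_background)
print(result)
```

With these numbers, the pixels at 7.5 and 9.0 microns fall in the gap and the pixel at 12.0 microns is beyond the covered range, so all three end up with zero background while the other two keep their interpolated values.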

stscijgbot-jp commented 4 months ago

Comment by Howard Bushouse on JIRA:

Fixed by #8597

stscijgbot-jp commented 4 months ago

Comment by Howard Bushouse on JIRA:

The final solution adopted in #8597 was to continue to rely on the interpolator from 1-D to 2-D space to set the 2-D background to zero for pixels outside the wavelength range of the 1-D background (which it was already doing) and no longer flag those pixels as DO_NOT_USE, so that they do not get reset to NaN in the science image later on.

Note that this does mean that the background subtraction can have discontinuities in it, and there's currently no indication of that to the user (i.e. they have no way of knowing which pixels got a zero vs. non-zero background subtraction).

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Behaving as expected, closing.

spacetelescope / jwst

Master background for IFU data needs to cover same wavelength range as the science data #8244