stscijgbot-jp commented 4 years ago

Issue JP-1563 was created on JIRA by David Law:

In testing the master background step, I noticed that the lvl3 association files are set up such that it requires the input background to be provided via the x1d spectrum constructed from individual files in spec2. This will result in significant quality problems because of how extract_1d works.

Specifically, for extended source observations (e.g., dedicated backgrounds) extract1d in spec2 sums the spectrum of the entire data cube and divides by the area to get the effective surface brightness (although see https://jira.stsci.edu/browse/JP-1562 regarding a bug in this computation). Since this is a sum, and not a sigma-clipped mean or median, it is not robust against outlying artifacts such as unfiltered cosmic rays or unexpected sources. As shown in the attached image, such artifacts can cause spikes in the 1d spectrum that would then be replicated to the master background used throughout the data cube.

Similarly, the example attached also shows a falloff at the ends of a band where fewer and fewer IFU slices contribute to a given wavelength. Although spec3 builds data cubes that do not use these wavelength ranges, spec2 does use them since it combines information from both channels into a single spectrum.

The best solution from the perspective of master background subtraction is to use sigma clipping or a median estimator to maximize the quality of the 1d spectrum prior to using it as the master background. However, this probably wouldn't be desirable for extract1d to do in general, as extract1d is more frequently used to get the spectrum of extended sources where you don't want to clip away signal from your object.

The cleanest solution may be for master_background to do its own computation of the 1d background from input 3d background cubes, or even from the _cal.fits background data (i.e., building its own cubes as necessary).

Tagging Anton Koekemoer since the solution to this should involved the Cal WG before moving to the DMS WG.

stscijgbot-jp commented 4 years ago

Comment by Anton Koekemoer on JIRA:

Setting ticket type to "Improvement" to indicate that Cal WG discussion has been requested.

David Law can you let me know please a timeframe by which you might be ready to support a Cal WG meeting with a presentation about the proposed algorithm change and changes needed to the pipeilne steps? (as well as specifying please whether the changes needed are to the baseline or enhanced version of the pipeline steps). Thanks!

stscijgbot-jp commented 4 years ago

Comment by David Law on JIRA:

Any time the week of July 13th or later would be fine; this would be a change to the baseline implementation. Essentially, the question is just whether the master background algorithm should use a straight mean or a sigma-clipped mean/median, which the original description at https://outerspace.stsci.edu/display/JWSTCC/Vanilla+Master+Background+Subtraction was vague on.

stscijgbot-jp commented 4 years ago

Comment by Anton Koekemoer on JIRA:

Thanks very much. Also tagging Karl Gordon since this relates to the baseline description, and we can proceed with scheduling a Cal WG discussion after we've heard his thoughts here first.

stscijgbot-jp commented 4 years ago

Comment by David Law on JIRA:

Adding a note that this also pertains to point source observations too; master background in this case is computed from the 'background' values from an annulus used in extract1d. It looks like this also simply sums flux, rather than providing for rejection of bad/outlying pixel values.

stscijgbot-jp commented 4 years ago

Comment by Karl Gordon on JIRA:

I agree that doing a sigma clipping for the background is the right way to go (for baseline even). How this is implemented I don't have a strong opinion. If the developers can see a way to implement this in extract1d great. If this needs more dedicated code as David Law suggests, that fine too. One solution not yet mentioned would be to add a another extracted spectrum to the extract1d output, the sigma clipped mean. Then the background subtraction step would just need to use the sigma clipped version instead of the straight average version.

stscijgbot-jp commented 4 years ago

Comment by Anton Koekemoer on JIRA:

Scheduled this for inclusion in the discussion at next JWST Cal WG mtg (Tue Jul 28, 11am)

stscijgbot-jp commented 4 years ago

Comment by David Law on JIRA:

Updating this ticket based on the outcome of the 7/28/20 Cal WG discussion. Decision was to:

1) Add a BACKGROUND_SIGCLIP (or some similarly named) column to the Extract1D data model.

2) When Extract1D runs, it should always populate BACKGROUND and BACKGROUND_SIGCLIP with information, regardless of what SRCTYPE or subtract_background setting is in use.

3) For Extended sources that sum all flux in the field of view, BACKGROUND would be the same as SURF_BRIGHT. BACKGROUND_SIGCLIP would be the sigma-clipped (3-sigma suggested as a default) surface brightness across the field (sigma clipping is done across spatial elements at a given wavelength).

4) For Point sources, BACKGROUND would be the computed surface brightness in the background annulus. BACKGROUND_SIGCLIP would be the sigma-clipped surface brightness in that annulus.

5) The Master Background pipeline step would no longer use the SURF_BRIGHT column from the x1d data products, but directly use the BACKGROUND or BACKGROUND_SIGCLIP columns. The choice between which of these two to use depends on instrument mode.

I think that about covers it?

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

Karl Gordon David Law I am trying to fully understand how the background subtraction and master background subtraction steps are done for each type of data. Also when is extract1d subtracting a background ? The extract reference file can contain subtract_background but after looking though all of them that is only set for MRS IFU and NIRSpec IFU data. For the IFU data it is only subtracted for point source data (no other background subtraction is done - I think). Reading the documentation it seems if we have nodded point source data or extended source data we have a case where the background is in a separate exposure so we need run master background subtraction to subtract the background and not the simple background subtraction. I have been looking at the table defined in the SRCTYPE set https://jwst-pipeline.readthedocs.io/en/stable/jwst/srctype/description.html and I wonder if we could expand on this type of table and fill in what type of background subtraction is done (if any) for each case (default one if no information is in the extract1d ref file and the parameter is not provided to extract1d) . We might need to add a few rows.

For example, what if we have MRS Extended source data with dedicated background. Is the background subtraction step run subtracting image to image if the exposure time of the science and background are the same and if they are not the same is master background subtraction run ?

MIRI LRS fix-slit will have a nodded position so the background is subtracted in master background (correct ?).

What about NRS_FIXEDSLIT is that considered a Nodding type ? It seems the background could be determined from the science data . Looking at the extract1d reference file there are nod values so how is the background handled in this case. Is a master background created ?

I am not trying to make this too hard but if we could fill out the other cases in the SRCTYPE table and if it possible to provide more information on what type of background is subtracted as the default for each case.

If a Telecon is a better place to discuss this just let me know. My schedule is pretty open

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Jane Morrison Now I think we're really starting to shake down this part of the system, which hasn't been tested as thoroughly as the rest yet. To my mind, there are three possible places that background subtraction can happen:

1) Early in spec2 in 'Background Subtraction'

2) Early in spec3, in 'Master Background' Subtraction'

3) In Extract1D

These all have different purposes.

(1) is for if we want to do simple image-based background subtraction, i.e. subtraction Exposure 2 from Exposure 1, or the median of some set of exposures from some other exposure. I don't think we want to do this is general, as it can be noisy, but we want the option available in case our smarter methods run into trouble.

(2) in my mind is general-purpose background subtraction. For the MRS at least this is where we want background subtraction to be taking place, because it uses information to build a model-based background that can be subtracted from the entire data cube.

(3) is really a bonus case, in that sometimes you want to subtract off an additional annular background for a point source (e.g., if your star is embedded in a nebula, but the nebula wouldn't be taken out by master-background because your dedicated background exposure is off-nebula).

Karl would be better able to speak to the other modes, as I haven't given them much thought.

For the purposes of this ticket though, there are some problems with the current implementation that need to be worked. For instance, right now annular background subtraction (3) is making the extracted spectra MUCH noisier because of how the background is being computed. I think if we can work (3) first, then (2) will be much easier to deal with.

stscijgbot-jp commented 3 years ago

Comment by Karl Gordon on JIRA:

David Law: Great summary. I fully agree with everything you said.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law Here are some use cases I could use clarification on.

MRS extended source observations with dedicated background

Is the dedicated background subtraction done in spec2 - image to image or in spec3 with the master background

MRS point source 2-point or 4-point dither with no dedicated background Is the background subtracted in spec3 by combining the background from the annulus region determined calspec2 extract 1d. Or is the background subtracted at the end of calspec3 inside extract 1d using the annulus.

When is the background in the extract1d in the annulus used as the final background to subtract ? The extract1d reference files for the IFU have subtract_background = True. So it seems to me that for point source data the background is subtracted for IFU data using the annulus inside extract1. Right ? So for point source IFU data is the master background step ever done ??

stscijgbot-jp commented 3 years ago

Comment by Karl Gordon on JIRA:

The spec2 background subtraction and spec3 master background subtraction is only done if there are dedicated background observations. If there are none, then the background is subtracted during spectral extraction for a point source observation. There is no background subtraction done during spectral extraction if it is an identified extended source.

At least this is my understanding.

stscijgbot-jp commented 3 years ago

Comment by Karl Gordon on JIRA:

Note that currently if there are dedicated background observations, then the spec2 image based background subtraction is done and the spec3 master background subtraction is never done. I agree with you that this logic should be swapped and the spec3 master background subtraction should be done by default if there are dedicated background observations. The spec2 image based background subtraction is a contingency if the spec3 master background has problems.

Again, this is my understanding. I have not checked the code/config files to confirm.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law I want to double check how the sigma clipping should be done for NON-IFU data. For IFU data a module called ifu.py is called and extracting the data in done in this step and I think I have sigma clipping understood for this case and I will get that coded up for review.

I think most of the NON-IFU data run through the extract_1d_step are data with a single or multi slit data (and the background is determined from nodded position or off source position defined in extract 1d reference file). But I think the input image contains both source and background regions. I am thinking MIRI LRS, NIRSPEC Fixed slit and MSA data (not sure about WFSS data) . For the slit data extract.py calls 'extract_one_slit' , which calls 'extract_model.extract', which in turn calls 'extract1d.py'

On line 195 of extract1d.py defines the background as:

Smooth the input image, and use the smoothed image for extracting

the background. temp_image is only needed for background data.

if nbkglim > 0 and smoothing_length > 1: temp_image = bxcar(image, smoothing_length) else: temp_image = image

Now 'image' is a 2d image. The array may have been transposed so that the dispersion direction is the second index. As I understand it the image could contains sources. So sigma clipping the image may not be useful. 'temp_image' is used to define the 'bkg_model' so I think in extract1d.py._fit_background_model, the sigma clipping should be done after y, val, wht = _extract_colpix(image, x, j, bkglim)

However, the background region is from the original input data in which is determined in '_extract_src_flux'

So again for the background I think we need to do sigma clipping again after

y, val, area = _extract_colpix(image, x, j, srclim) in '_extract_src_flux'

Does that seem correct ?

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

Doing smoothing on the dispersed 2D image for slit-like modes (MIR_LRS, NRS_FIXEDSLIT, NRS_MSASPEC, etc.) is very different than doing a sigma-clipped mean/sum of the background region in each wavelength slice of an IFU cube. The wavelength slice of an IFU cube is essentially an image, not a spectrum, so the clipping is taking place in the spatial domain.

For slit-like modes, the smoothing referred to above is smoothing in the spectral domain, specifically along the dispersion direction (i.e. smoothing across wavelengths). For these slit-like modes, a fit is performed in the cross-dispersion direction in each wavelength bin (image column, if DISPAXIS=1) using the pixels included in the defined background regions. The "fit" can be either a simple mean, median, or a polynomial fit of a chosen order. Selecting the median option would achieve some rejection of outliers, but that's only for the pixels contained within a single wavelength bin. The background in each wavelength bin (image column) is fit independently. If there happens to be a contaminating source in one of the background regions (e.g. if backgrounds are defined on either side of the target spectrum) a median or sigma-clipped median might be able to remove it. But the point is that this clipping would have to take place in the cross-dispersion direction, for the computation of the background in each wavelength bin. The smoothing is being done along the dispersion direction and hence won't do anything to remove contaminating sources. It's primarily meant to just smooth out noise and hence increase the SNR of the final extracted background spectrum.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

My plan for non-if data was to do sigma clipping at each wavelength within the extracted limits.

Howard Bushouse if image is a 2d image that has been oriented to dispersion is the second index. Then if I wanted to sigma clipping in the cross dispersion dimension I think I just have to loop over the first index pull out a row and sigma clip. I think I need to run this through some data I understand and see if I am grabbing the correct dimension that is cross-dispersion (the python y,x and documentation saying second index is messing me up which is the cross dispersion dimension (x or y) for image.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Jane Morrison good questions; now that Extract1D is generally doing what we want we need to square the originally intended MRS background subtraction approach with the as-implemented (NIRSpec) master background algorithm. I think where part of the confusion comes from is that we'd originally thought about making Master Background do something complicated with making a 3d model of the background in AB nod pairs, and subtracting the A background from B, and vice versa. This is described a little on the pipeline readthedocs page.

I've argued myself in circles on the topic, but in summary I think it's cleanest if (as Karl already said) we ONLY do master background subtraction when there is a dedicated bg to use.

The main reason for this in my mind is how small the MRS FOV is compared to the size of the point source. Even in a nodded exposure, at long wavelengths especially there will still be a couple of percent from the source in the bg region. Doing only dedicated bg subtraction in master_background means that this step will never subtract real source flux from the scene, so that all of the logic about that is self-contained in Extract1D (which could be rerun with different source types/locations as desired without affecting the required aperture correction).

Any future possible updates to the sky modeling algorithm (e.g., fitting a slowly varying function of wavelength to the background to use that instead of the actual background values to beat down noise) would still only have to be implemented in one place (since the first run of Extract1D computes the BACKGROUND_SIGCLIP used by Master Background).

Here then are some MRS example cases, using methods 1 (spec2 pixel-based), 2 (spec3 model-based), 3 (x1d annular) from the note above.

SRCTYPE=EXTENDED, no dedicated bg.

We do no background subtraction anywhere.

SRCTYPE=EXTENDED, dedicated bg.

Method (2) using the dedicated bg pointings.

SRCTYPE=POINT, no dedicated bg.

Method (3)

SRCTYPE=POINT, dedicated bg

Method (2) and (3)

I.e., the logic should be to never do (1) unless our cleverness fails us, to do (2) any time there is a dedicated background, and to do (3) any time SRCTYPE=POINT.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

David Law I wrote up a similar table today trying to define how to do MRS extended and point source data with and without dedicated background exposures. One difficulty is that if we have dedicated background exposures for MRS how do we tell calspec 2 to SKIP method (1) "unless our cleverness fails us. " Is the idea now to run that "off line" forcing the pixel-based subtraction - if it is needed ? There must be other types of spectroscopic data that would do an image to image subtraction in calspec 2 - so would we just want to disable calspec2 background subtraction for MRS data ?

Could you clarify something for me - is there a dither pattern for the MRS that is a NOD ? If so even for that type of observation we would need a dedicated background to perform master background. If there is not dedicated background, the default would be to have extract 1d subtract the background using background determined in annulus ?

What about the LRS and NIRSpec Nodding - do we leave them as doing master background subtraction (I assume yes). This is "making Master Background do something complicated with making a 3d model of the background in AB nod pairs, and subtracting the A background from B, and vice versa." As you can see from Howard's comments above it might be tricky to do sigma clipping in the background for regular 'slit' type exposures. Karl Gordon could you comment on what you recommend for NODDING type exposures and the master background subtraction for data that is not MIRI MRS.

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

Sorry, I'm coming late to this party, so my responses are going to be non-chronological relative to previous comments.

08/Feb/21 12:08PM Karl Gordon "The spec2 background subtraction and spec3 master background subtraction is only done if there are dedicated background observations. If there are none, then the background is subtracted during spectral extraction for a point source observation. There is no background subtraction done during spectral extraction if it is an identified extended source."

_For any mode (MIRI MRS, MIRI LRS fixedslit, NRS fixedslit, NRS IFU) that allows for a nodded dither pattern for point sources, the spec2 subtraction will be used (currently). So spec2 is used whenever 1) there are dedicated bkg exposures, or 2) there are nodded point-source exposures. For the latter, the image-to-image subtraction results in positive and negative spectral traces in the data, where only the positive trace is used later on during extract_1d. In both cases the user has the option to reprocess and use spec3 master bkg subtraction instead. For case 1, the 1D bkg spectra used as input to mbkg come from extract_1d "source" extraction results of the dedicated bkg exposures. For case 2, the 1D bkg spectra used as input to mbkg come from the "background" extraction results of the nodded point-source exposures. In all cases, the option to do yet another round of bkg subtraction when extract1d is run on the combined product at the end of spec3 is available.

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

08/Feb/21 12:11PM Karl Gordon "Note that currently if there are dedicated background observations, then the spec2 image based background subtraction is done and the spec3 master background subtraction is never done. ... Again, this is my understanding. I have not checked the code/config files to confirm."

You are correct, sir. Consider this your confirmation.

"I agree with you that this logic should be swapped and the spec3 master background subtraction should be done by default if there are dedicated background observations."

No changing of baseline behaviors without Cal WG approval! ;)

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

08/Feb/21 05:40PM Jane Morrison "My plan for non-if data was to do sigma clipping at each wavelength within the extracted limits."

_This operation could potentially be added into the existing lower-level functions that are already doing the bkg fitting in each column (or row, depending on dispersion direction), for example in fit_background_model, after the function extractcolpix() has been called. This is already operating on a column-by-column (or row-by-row) basis and gives you just the pixels contained in the bkg region(s) for each wavelength bin. The sigma clipping could be done on that list of pixel values, before they are then fed into the "fitting" function (I use quotes around "fitting", because sometimes the computation is just a simple mean or median, rather than an actual fit). Having the clipping done at that point would then remove bias from the bkg fit at each wavelength.

Of course the specs provided in a very early comment above indicated that the clipped bkg extraction was to be in addition to the current bkg extraction results, so I guess any code changes would have to allow for the clipping to be optional and the whole bkg fitting/extraction process would have to be called twice: once with clipping and once without. On the other hand, since these are dedicated bkg regions we're talking about, which only get used when it makes sense to do so (i.e. extracting point sources that have regions of bkg only signal somewhere in the image), is it really necessary to provide two forms of the bkg extraction? What's the purpose of the unclipped version? David Law Karl Gordon what am I missing here?

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

For the purposes of this ticket should it be limited to sigma clipping the background of IFU data ?

I think that will be the biggest benefit. I am not sure a sigma clipped background in the small region defined in other types of slit type data will be a big change. I am still playing around with some example slits so I could be wrong here. But if we only do this for IFU data then only that data will have the extra column in the spectral table BACKGROUND_SIGCLIP . Comments David Law Karl Gordon Howard Bushouse

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Thanks Howard Bushouse for the clarification. Agreed that we shouldn't mix too many things together in one thread, and focus here on implementing the sigma clipping and necessary modifications to Master Background to use that sigma-clipped value.

Jane Morrison I don't remember why the Cal WG settled on keeping BACKGROUND as it is and adding a new BACKGROUND_SIGCLIP, from my point of view I agree that it might be clearer to simply do clipping rather than have a clipped and an unclipped version. Perhaps Karl Gordon or Anton Koekemoer remembers why we wanted to keep around an unclipped BACKGROUND?

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

I agree with Jane Morrison that we should confine the work in this ticket to make necessary enhancements to the handling of IFU modes only, which seems to be relatively straightforward. Of course it would be good to also get confirmation on whether we really need to add the new clipped bkg column to the x1d products before we just barrel ahead with implementation.

And if you really do want the default processing behavior changed for IFU modes, such that the default is master bkg subtraction, instead of calspec2 image-to-image subtraction, then we'll also need some minor updates to calspec2 logic. It could also be handled by changing the level-2b ASN rules, such that the level-2b ASN files created for IFU exposures no longer list the dedicated bkg or nodded exposures as "background" members, in which case the calspec2 background step will automatically get skipped, but I think it could be useful to users to still list those background members in the ASN files, just in case they want to do custom processing off-line. Hence it would require a logic change in calspec2 to skip the background step. It could also be done via a custom calspec2 pars ref file in CRDS for NRS_IFU and MIR_MRS that sets the background step to be skipped. That's perhaps even cleaner yet.

stscijgbot-jp commented 3 years ago

Comment by Anton Koekemoer on JIRA:

David Law on the question about also having an unclipped background, this was discussed (and motivated, also decided) at the JWST Cal WG meeting 2020-07-28 (for this ticket actually) - here's a brief summary from the meeting notes, of the portion relevant to the question of using the clipped background:

Which instrument modes would use this? [clipped background]

MIRI IFU, NIRSpec IFU, possibly NIRSpec Slit (as summarized by James Muzerolle)

other spectroscopic modes would not use this and would continue with the current (non-clipped) background: WFSS (NIRISS and NIRCam) due to potential issues in the clipping for distorted data, and NIRSpec MSA due to potentially not enough background pixels to do robust clipping.

So in summary, WFSS (NIRISS and NIRCam) and NIRSpec MSA modes may have potential issues using the clipped background, hence this motivates retaining the unclipped background for those instruments/ modes.

Hope this helps.

Also, on the point raised by Karl Gordon above:

I agree with you that this logic should be swapped and the spec3 master background subtraction should be done by default if there are dedicated background observations. The spec2 image based background subtraction is a contingency if the spec3 master background has problems.

I'd be glad to schedule a Cal WG meeting to discuss this if needed and provide a venue for approving it - can you confirm please if you feel this is necessary?

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Great, thanks Anton Koekemoer ! There's our reason for having multiple kinds of BACKGROUND.

Howard Bushouse I like the suggestion of a custom param ref file, potentially just for MRS for now. That makes it trivial to turn the step on/off as necessary during commissioning until we know what actually works out for the best.

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

After reviewing the Cal WG meeting notes and thinking about it some more, I'm not sure the argument that storing unclipped or clipped data in the existing BACKGROUND column is entirely valid or sufficient to warrant modifying the table structure. For example, even with no changes at all to the current pipeline code, the user has the option to select a number of different background extraction methods (at least when operating on non-IFU data), which include mean, median, and polynomial fits to the bkg data in each wavelength channel. Simply switching from mean to median is essentially the same as switching from unclipped to clipped, and yet those data are stored in the same table column, and it's left to the user to look at the processing history to understand what kind of method was used. Even using one of the polynomial fitting options is effectively applying a certain amount of clipping, because the fit will tend to lessen the influence of outliers.

Furthermore, I've just confirmed that modifying the definition of the x1d table contents (in our datamodels schema) will in fact render all existing x1d products unreadable. I just know that all of the ERS and commissioning teams are going to love that!

So, IMHO, I think that the scheme of adding new table columns warrants a rethink. It's very difficult to completely assess the full set of pros and cons of a given scheme during the few minutes available in a Cal WG presentation. It's often the case - as here - that the full impacts aren't appreciated until the implementation phase.

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

For the case where you have dedicated background exposures (which it sounds like will almost always be the case for MIRI MRS, due to the issues outlined above by David Law ), the dedicated bkg exposures get treated as extended sources, we could modify the Cal WG decision "3) For Extended sources that sum all flux in the field of view, BACKGROUND would be the same as SURF_BRIGHT" such that the unclipped source region data are used to populate both FLUX and SURF_BRIGHT, while the clipped version of the same data are used to populate the BACKGROUND column. In all cases the master bkg step would use the data from the BACKGROUND column, which then for MIRI MRS with dedicated bkg's (and perhaps NIRSpec IFU too) means you'd be using the clipped background.

This allows for both versions (unclipped and clipped) to be stored in the x1d table without any format changes.

Thoughts?

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

(y)

Makes sense to me

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Hm, good to know that new columns would be so problematic.

Since the difference really seems to be between IFU and non-IFU modes perhaps the calculation in ifu.py could simply apply clipping when calculating BACKGROUND and the other modes do not (since this code is separate anyway)? That way we wouldn't need to be concerned about spikes in the annular region if the code is run on point sources either, and it wouldn't affect other instruments.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

In the extract_1d step the code ifu.py has an option for the user to supply an image reference file to do the extraction. This is not used in practice and is in the code for flexibility (at least I think). In the comments to this function: image_extract_ifu : ".... we assume that if the user specified a reference file in image format, the user actually wanted that extract1d reference file to be used.... The IMAGE extract1d reference file should have pixels with avalue of 1 for the source extraction region, 0 for pixels not to include in source or background, and -1 for the background region."

I am assuming if such a file was created some effort went into defining the background region so there is no need to perform sigma clipping of the background.

If you disagree - let me know.

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

That image-based reference file was originally created to help with the complicated spectral traces of some modes like NIRISS SOSS, where a simple rectangular extraction region for either source or background is not sufficient. It looks like the concept was carried over to allow for its use with IFU data as well. But just because the background region was defined in a careful way (e.g. some custom shape that mimics a circle or annulus or whatever) does not necessarily imply, IMHO, that it was defined to avoid contamination by sources, and hence I would think you would still want to perform sigma clipping on the data that are being integrated within the background region.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

Howard Bushouse ok I will add sigma clipping to that case as well.

In the master background step I believe we always now want to use the "BACKGROUND" column in the extract1d table. The code is set up to use the "SURF_BRIGHT" column. For nodded exposures the BACKGROUND is copied to SURF_BRIGHT column and then surf_bright is used. I can update the code to now only use BACKGROUND and change the array names that create the master background to "background" just so it is clear what is going on in master background. But before I do that I wanted Howard Bushouse to comment. I don't want to forget some case - but I think all the data needed for the master background is now in the BACKGROUND column of the extract 1d table.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

Ok I have edited the original comment. I got myself tripped up on the reference image and when source = 'EXTENDED". I think the main usage of the ref_image is for point source data. But if it is used for extended source data then I think we extract the entire region ignoring the ref_image mask and do sigma clipping of the entire region.

Howard Bushouse does that seem correct to you ? Or maybe we extract pixels defined as either source (1) or background(-1) and ignore pixels set to 0

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

Jane Morrison I believe you're correct about switching the master bkg step to always use data from the BACKGROUND column and hence don't bother to copy from BACKGROUND to SURF_BRIGHT anymore. It used to use SURF_BRIGHT for dedicated bkg exposures (because they didn't have a background region - it's all "source") and for ease of use in the lower-level routines, the non-dedicated bkgs had their BACKGROUND data copied to that table column. So all of that can go away now, which is nice (always get background data from the BACKGROUND column - what a concept!).

Not sure what do in the case of a ref_image that has both source and bkg regions defined, but it's being applied to an extended source. It also depends on whether we're talking about doing extractions for "normal" processing, i.e. a science target image, or are we talking about how to handle extractions for dedicated bkg exposures, which by definition are treated as EXTENDED sources? For normal science exposures (i.e. BKGTARG=False), we should just follow the specs of the ref_image, using the appropriate pixels for source and background. But then what do we do for BKGTARG=True exposures, which have SRCTYPE=EXTENDED? All we want/need from these is background extraction, but do we still only use the pixels flagged as background (-1) in the ref_image, or do we use the pixels flagged as source (+1), assuming that most or all of the image only contains bkg signal to begin with? What approach do we use when we're not using a ref_image and BKGTARG=True? Do we ignore specifications given in the extract_1d ref file and just integrated over the entire image? If so, then perhaps we should take the same approach when we've been given a ref_image, i.e. ignore the values in the ref_image and just integrated over the whole image, storing the "raw" results in the FLUX and SURF_BRIGTH columns and computing a sigma-clipped result to store in the BACKGROUND column.

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

Howard Bushouse I realized why I got a bit confused on what to do in the SRCTYPE=EXTENDED and the ref_image case. I re-read the comments at the beginning of the routine 'image_extract_ifu"

"One of the requirements for this step is that for an extended target, the entire aperture is supposed to be extracted (with no background subtraction). It doesn't make any sense to use an image extract1d reference file to extract the entire aperture; a trivially simple JSON extract1d reference file would do. Therefore, we assume that if the user specified a reference file in image format, the user actually wanted that extract1d reference file to be used, so we will ignore the requirement in this case. "

But I am not sure ignoring the requirement is the best option.After searching a bit I did find information on the "Reference Image Format'. It is listed under the Extract1d section but I missed it before because there is no link on page describing the "Reference File" (at least not one I saw) . To make it clear to the user I think we need to add to the page on the " Reference Image Format" what this file means for the EXTENDED case. Maybe people what to use it to exclude certain regions from the EXTENDED case. It seems incorrect to use the reference file that people would use for POINT source data on the EXTENDED case. But a special reference file was defined for EXTENDED source then having this option would allow people flexibility. What do you think ?

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

OK, you've convinced me. If a ref_image is specified, then it should be used as-is, regardless of SRCTYPE. Extract the source and background regions as defined by the ref_image, and apply sigma clipping to the background region data. I honestly can't imagine why anyone would ever not want to apply sigma clipping to the background. Using the ref_image in this way doesn't have any direct impact on the changes we're making for sigma clipping the background for IFU data, because at least as of right now the default methods never use the ref_image (we always use the aperture defs in the extract1d json or fits ref file).

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

Fixed by #5743

stscijgbot-jp commented 3 years ago

Comment by Jane Morrison on JIRA:

Assigning back to David Law for testing

stscijgbot-jp commented 3 years ago

Comment by Howard Bushouse on JIRA:

To summarize what's been implemented so far for inclusion in B7.7.1:

Extract_1d has been updated to continue extracting "source" region data from IFU exposures as it did before, for any SRCTYPE, using the "raw" unclipped data in the IFU cube. These results are stored in the FLUX and SURF_BRIGHT columns of the output x1d table. What's been added is that when SRCTYPE='EXTENDED' it does a 2nd extraction of the same "source" region, but this time using sigma-clipped data from that region and stores those results in the BACKGROUND column of the x1d file.

The master background step has been updated to read data from the x1d BACKGROUND column when processing IFU exposures that have SRCTYPE='EXTENDED', so that it uses the sigma-clipped data.

Nothing has changed for any other spectroscopic modes (e.g. MIRI LRS, NIRSpec MOS or FS, etc.)

A consequence of 1) is that the x1d products for IFU extended sources, will always have data in the BACKGROUND column, regardless of whether they are dedicated bkg exposures or not (i.e. they could be exposures of an extended science target). In this situation, the data in the FLUX/SURF_BRIGHT columns come from unclipped data, while the BACKGROUND data is sigma-clipped.

Karl Gordon David Law we're curious to hear if you think it will be confusing for users to see data in the BACKGROUND column of x1d products for normal extended source science targets. Because it's an extended source, there's no way to measure the actual background and hence they may be concerned where that information came from and whether or not it was subtracted from the source data. If this seems like it would be confusing, an extra hook could be added into the new extract_1d routines for IFU, such that the sigma-clipped background is only computed and populated when BKGDTARG=True. That way normal extended science targets won't have the BACKGROUND column populated.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Thanks for the summary Howard Bushouse I'll try to test the code itself as soon as I get a chance, but some abstract musing in the meantime.

It sounds like a lot of the potential confusion is a consequence of our outsourcing the work of master_background to Extract1D at an earlier stage in the pipeline. In many cases of very-extended objects BACKGROUND computed in this way will be totally bogus, but for only marginally-extended sources the sigma-clipped BACKGROUND could still be pretty reasonable. I'm inclined to keep it for now for all extended source data, at least for testing purposes during commissioning. With everything fresh in mind, can we add in a commented-out toggle such that it's easy to flip this switch (to only populating BACKGROUND if BKGDTARG=True) if necessary at a later date?

stscijgbot-jp commented 3 years ago

Comment by Karl Gordon on JIRA:

I agree with David Law. I think it would be good to not populate the background column for extended sources as you indicated would be an option Howard Bushouse.

stscijgbot-jp commented 3 years ago

Comment by David Law on JIRA:

Sigma clipping looks like it's now being applied as intended. Closing this ticket accordingly- if any future tests find problems in the usage of that background they can open new tickets as necessary.

spacetelescope / jwst

Master background uses incorrect surface brightness estimation #5118

Smooth the input image, and use the smoothed image for extracting

the background. temp_image is only needed for background data.

The master background step has been updated to read data from the x1d BACKGROUND column when processing IFU exposures that have SRCTYPE='EXTENDED', so that it uses the sigma-clipped data.

Nothing has changed for any other spectroscopic modes (e.g. MIRI LRS, NIRSpec MOS or FS, etc.)