Spec2Pipeline LRS processing: Extract1d does not produce error column (all 0) in *x1d.fits file

stscijgbot-jp commented 4 years ago

Issue JP-1737 was created on JIRA by Juergen Schreiber:

I ran Spec2Pipeline skipping the Photom step to get the 'raw' extracted spectrum.

The error column in the resulting x1d file stays at 0 although it would be easy to calculate: just take the standard deviation of the sum of each pixel row of the aperture.

tapastro commented 3 years ago

Thanks for getting back so quickly! That seems like a reasonable approach - I'll save the full variance propagation for a later date.

drlaw1558 commented 2 years ago

Might also be related to JP-2293

stscijgbot-jp commented 2 years ago

Comment by Alicia Canipe on JIRA:

Anton Koekemoer Howard Bushouse According to the documentation: "ERROR, SB_ERROR, BERROR, and DQ are not populated with useful values yet." (from https://jwst-pipeline.readthedocs.io/en/latest/jwst/extract_1d/description.html#input)

I didn't see any specifications for the error arrays in the vanilla spec extraction page, so it sounds like we just haven't implemented an error calculation yet? I don't remember what the discussion was about this.

stscijgbot-jp commented 2 years ago

Comment by Juergen Schreiber on JIRA:

Thanks for the answer, I saw it now, too: that is right, it is correctly documented that there is no error calculation.

I marked this ticket as an "improvement" because I think it is easy in this case to calculate the error simply as a standard deviation. In version 0.16.2 there is no aperture correction yet, whereas the latest version has one, so there you just have to propagate the error through it.

stscijgbot-jp commented 2 years ago

Comment by Alicia Canipe on JIRA:

Thanks, Juergen Schreiber. I'll go ahead and assign this to Anton Koekemoer for the CalWG to discuss.

stscijgbot-jp commented 2 years ago

Comment by Anton Koekemoer on JIRA:

thanks Alicia Canipe and Juergen Schreiber for the comments, I've now scheduled this for discussion at the upcoming JWST Cal WG meeting 2020-10-13

stscijgbot-jp commented 2 years ago

Comment by Anton Koekemoer on JIRA:

This was discussed in JWST Cal WG meeting 2020-10-27, with the following notes:

James Davies pointed out that this ticket is specifically for resampled data (MIRI LRS), while error calculation for unresampled data (including spectroscopic and imaging data) is covered in JP-1611, "Error arrays in source catalog" (status: OPEN; note that that ticket is currently focused on imaging data).
David Law reported that the LRS error array ticket still needs to be discussed within MIRI and that there's not yet convergence on an algorithm for it.
It was confirmed that the x1d error array issue also needs to be resolved for the spectroscopy modes of the other instruments, i.e. NIRSpec, WFSS (NIRISS and NIRCam), and also TSO and SOSS.
Consensus was that, while not critical to have in the DMS operational pipeline by launch, it would be a high priority to have available in an offline version of the pipeline both for commissioning and for science.
For SOSS, an approach could be to look at the residual maps from the model fitting (although these would also be subject to imperfections in the model)
Loic Albert will verify whether or not caldetector1 currently produces what he needs
James Davies will proceed with separating this issue into the various subcategories for the relevant modes (resampled vs unresampled, and spectroscopic vs imaging), and will include the following people in the discussion (if anyone else is interested in participating, please let him know): Nestor Espinoza Michael Regan David Law Loic Albert Howard Bushouse Kevin Volk (note that Kevin Volk may ask Swara Ravindranath to join this discussion instead).

stscijgbot-jp commented 2 years ago

Comment by Alicia Canipe on JIRA:

A comment for the ticket watchers – please let us know the priority level for this ticket (e.g., critical, high, medium, low). If it is critical, please indicate whether it is critical to have this in time for commissioning and/or science operations and why.

stscijgbot-jp commented 2 years ago

Comment by Juergen Schreiber on JIRA:

Well, I need to apply extract1d for LRS commissioning analysis to determine the PHOTOM CRDS calibration (CAP 302). It would be good to also have an uncertainty estimate as a sanity check, as a go/no-go criterion for the measurement, and to better compare with older PHOTOM files.

At least for the flux determination in science operation I would judge this ticket as critical, for commissioning medium to high.

stscijgbot-jp commented 2 years ago

Comment by Sarah Kendrew on JIRA:

I concur with Juergen Schreiber's assessment. James Davies, as per email, please include me in the conversation around this issue for LRS.

Update: please also include Greg Sloan in any related discussions, he will be taking this on for MIRI LRS.

stscijgbot-jp commented 2 years ago

Comment by Karl Gordon on JIRA:

Please include me in these discussions as I am very interested.

stscijgbot-jp commented 2 years ago

Comment by Howard Bushouse on JIRA:

JP-1944 outlines the current plans for creating error and variance arrays in resampled data, including resampled spectra. The new arrays will be the equivalent of the various error and variance arrays that already exist in unresampled data products. JP-1611 outlines the way the resampled errors will be propagated through source extraction for imaging data. So now that we know what will be available in both unresampled and resampled spectral data, the group of individuals concerned with this ticket needs to come up with a plan for propagating the error/variance information from 2D spectra to 1D spectra, during the extract_1d process. Do the interested parties favor having a meeting to discuss this or will a dialogue via ticket comments suffice?

stscijgbot-jp commented 2 years ago

Comment by Sarah Kendrew on JIRA:

yes I think that is a good idea Howard Bushouse - for MIRI please include Greg Sloan, Karl Gordon and Juergen Schreiber. We are keen to move this work forward if possible.

stscijgbot-jp commented 2 years ago

Comment by Alicia Canipe on JIRA:

Howard Bushouse should we lump JP-1921 into the discussion (how to propagate DQ values to extract_1d products)? Based on the comments, it seems like the work should be considered together, e.g., "So perhaps this is motivation for putting some kind of DQ flag in the x1d column for any wavelength bins that've had pixels excluded, because the resulting flux is essentially unusable."

If that's too much for one meeting I can just schedule a follow-up specifically for how to handle the DQ values.

stscijgbot-jp commented 2 years ago

Comment by Howard Bushouse on JIRA:

Results of meeting to discuss propagation of errors to 1D spectroscopic products are recorded at https://outerspace.stsci.edu/display/JWSTCC/JWST+Pipeline+Uncertainty+Propagation

To summarize, the extract_1d algorithms need to be updated to:

create 1D extracted variance arrays from each of the input 2D VAR_POISSON, VAR_RNOISE, and VAR_FLAT arrays, by summing over the same list of input pixels as used to compute the extracted flux, including any weighting applied to account for partial pixel extraction
compute a 1D extracted ERR array by summing the 3 extracted variance arrays and taking the square-root
compute a 2nd set of these 4 VAR+ERR arrays that is in units of surface brightness (to associate with the extracted flux that's in units of surface brightness)
compute a set of 4 VAR+ERR arrays, using the same methods, for the background extraction regions (if local background measurement and subtraction is performed)

In the end, a total of 3 sets of 4 VAR+ERR arrays will be computed and stored in the x1d data table. This will require updates to the SpecModel datamodel schema to account for the new table columns.
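The summary above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the pipeline's code: the function name `extract_with_variances` is hypothetical, axis 0 is taken to be the cross-dispersion direction, and the pixels are assumed to be independent, so that the partial-pixel weights enter the variance sums squared. Whether extract_1d applies the weights this way is not stated in this ticket.

```python
import numpy as np

def extract_with_variances(sci, var_poisson, var_rnoise, var_flat, weights):
    """Collapse 2D arrays along axis 0 (assumed cross-dispersion axis).

    `weights` holds the fractional coverage (0..1) of each pixel by the
    extraction aperture; the same pixel list is used for flux and variances.
    """
    flux = np.sum(weights * sci, axis=0)
    # Assumption: independent pixels, so a weighted sum propagates with w**2.
    w2 = weights ** 2
    vp = np.sum(w2 * var_poisson, axis=0)
    vr = np.sum(w2 * var_rnoise, axis=0)
    vf = np.sum(w2 * var_flat, axis=0)
    # Total 1D error: square root of the sum of the three extracted variances.
    err = np.sqrt(vp + vr + vf)
    return flux, err, vp, vr, vf

# Toy 4x3 cutout: two fully covered rows and two half-covered edge rows.
shape = (4, 3)
weights = np.array([[0.5], [1.0], [1.0], [0.5]]) * np.ones(shape)
flux, err, vp, vr, vf = extract_with_variances(
    sci=np.full(shape, 10.0),
    var_poisson=np.full(shape, 4.0),
    var_rnoise=np.full(shape, 1.0),
    var_flat=np.zeros(shape),
    weights=weights,
)
```

The same function could be called a second time on the background region, and its outputs rescaled to surface-brightness units, to obtain the other two sets of four arrays.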

stscijgbot-jp commented 2 years ago

Comment by Howard Bushouse on JIRA:

For the record, the current list of column names in the x1d data table is:

WAVELENGTH, FLUX, ERROR, SURF_BRIGHT, SB_ERROR, DQ, BACKGROUND, BERROR, NPIXELS

I'm going to propose the new set of column names:

WAVELENGTH,

FLUX, ERR, VAR_POISSON, VAR_RNOISE, VAR_FLAT,

SURF_BRIGHT, SB_ERR, SB_VAR_POISSON, SB_VAR_RNOISE, SB_VAR_FLAT,

DQ,

BACKGROUND, B_ERR, B_VAR_POISSON, B_VAR_RNOISE, B_VAR_FLAT,

NPIXELS
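To make the difference between the two lists explicit, here is a small sketch that builds the proposed column set from its naming pattern and compares it with the current one. The helper name `proposed_columns` is hypothetical, and the names are the proposal from this comment, not necessarily what ends up in the SpecModel schema.

```python
# Current x1d column names, as listed above.
CURRENT_COLUMNS = [
    "WAVELENGTH", "FLUX", "ERROR", "SURF_BRIGHT", "SB_ERROR",
    "DQ", "BACKGROUND", "BERROR", "NPIXELS",
]

# Prefix used for the uncertainty columns of each extracted quantity.
PREFIXES = {"FLUX": "", "SURF_BRIGHT": "SB_", "BACKGROUND": "B_"}
SUFFIXES = ["ERR", "VAR_POISSON", "VAR_RNOISE", "VAR_FLAT"]

def proposed_columns():
    """Return the proposed column names in table order."""
    cols = ["WAVELENGTH"]
    for quantity in ("FLUX", "SURF_BRIGHT"):
        cols.append(quantity)
        cols += [PREFIXES[quantity] + s for s in SUFFIXES]
    cols.append("DQ")
    cols.append("BACKGROUND")
    cols += [PREFIXES["BACKGROUND"] + s for s in SUFFIXES]
    cols.append("NPIXELS")
    return cols

new_columns = proposed_columns()
# Columns that would be renamed under the proposal.
renamed = {"ERROR": "ERR", "SB_ERROR": "SB_ERR", "BERROR": "B_ERR"}
# Columns with no counterpart in the current table (the variance columns).
added = [c for c in new_columns
         if c not in CURRENT_COLUMNS and c not in renamed.values()]
```

Under this proposal three existing columns are renamed and the remaining new entries are the per-source variance columns.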

stscijgbot-jp commented 2 years ago

Comment by Greg Sloan on JIRA:

Others should comment on this, but I believe it would be better to write "ERROR" rather than "ERR" for the different uncertainty columns because this is a standard term expected in spectral data files. On a similar note, the only reason I do NOT insist that "FLUX" be written as "FLUX_DENSITY" (which is more correct for F_nu units like Jy or F_lambda units) is that "FLUX" is what most code expects for the label, not "FLUX_DENSITY".

stscijgbot-jp commented 2 years ago

Comment by Anton Koekemoer on JIRA:

I could go either way on "ERR" vs "ERROR", others can take up the case if needed.

One consideration is the impact (if any) downstream, in DMS or related subsystems, e.g. do subsequent pipeline routines need to be revised depending on what these columns are called?

In addition, if these names are flexible with no (or minimal) impact, then for consistency it could help to also propose prepending "FLUX_" to the relevant columns, i.e.:

FLUX, FLUX_ERR, FLUX_VAR_POISSON, FLUX_VAR_RNOISE, FLUX_VAR_FLAT,

while for "BACKGROUND", could we propose prepending "BKGD" instead of just "B", so these would then be:

BACKGROUND, BKGD_ERR, BKGD_VAR_POISSON, BKGD_VAR_RNOISE, BKGD_VAR_FLAT,

Further iteration/ discussion (and final convergence) could probably best take place in the DMSWG if needed (tagging Alicia Canipe for that)

Also adding James Muzerolle (for NIRSpec) to the watchlist, since although this ticket started off in the context of MIRI LRS, the implications extend to other spectroscopic modes.

stscijgbot-jp commented 2 years ago

Comment by Tyler Pauly on JIRA:

David Law I'm nearing the end of implementing error propagation through the Extract1D step, but I've encountered a hiccup with IFU data. After a conversation with Jane Morrison, I'm trying to figure out exactly how to acquire the input model's VAR_POISSON/VAR_RNOISE/VAR_FLAT arrays, once the IFUCubeModel reaches the Extract1D step. Jane suggested I talk with you, as it sounds as though there have been past conversations on how IFU errors are treated, and VAR arrays have intentionally not made it through the cube_build process. I thought I should post here in case other CalWG folks have input on this.

stscijgbot-jp commented 2 years ago

Comment by David Law on JIRA:

Tyler Pauly That's right, there are no VAR arrays for the IFU data cubes that Extract1D derives 1d spectra from. They propagate through Spec2 until the _cal.fits stage, but cube building does not reformat them into the final 3d structures.

It would certainly be easy enough to do; both SCI and ERR cubes are constructed simply by the weighted sum of their 2d _cal.fits pixel values. These weights are based on the 3d pixel locations with respect to the output grid, and could be trivially applied to propagate VAR arrays as well.

However, this would make a lengthy and memory-intensive step even longer and more memory intensive. Cube_build is already unable to build the largest cubes that users could reasonably request, and adding this additional computation would make the problem more acute.

Having VAR information split out among different sources is a reasonable goal, but by far more important is the overall covariance (strictly I mean the correlation matrix). The IFUs are so significantly undersampled that the covariance will be a much more important part of science analyses than the different sources of the VAR (as VAR is not used for weighting in constructing the cubes).

My feeling would be that we should make progress on https://jira.stsci.edu/browse/JP-2015 (accelerating cube_build) prior to introducing anything that would slow it down further. In the meantime I'd suggest that the Extract1D code look for these extensions, and if it can't find them in a _s3d.fits file pass an array of zeros to the Extract1D algorithm instead. That way when we revisit this at a later time the plumbing should be in place so that the only updates necessary are for cube_build to propagate the information.
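A minimal sketch of the "look for the extensions, otherwise pass zeros" fallback suggested above. The helper `get_variance_arrays` and the use of `SimpleNamespace` as a stand-in for the input model are assumptions made for illustration; a real datamodel may behave differently when a missing array is accessed, so the lookup would need adapting.

```python
import numpy as np
from types import SimpleNamespace

VAR_NAMES = ("var_poisson", "var_rnoise", "var_flat")

def get_variance_arrays(model):
    """Return the three variance arrays, using zeros for any that are missing."""
    out = {}
    for name in VAR_NAMES:
        arr = getattr(model, name, None)
        if arr is None or np.size(arr) == 0:
            # No variance information available: fall back to zeros with the
            # same shape as the science data so downstream code is unchanged.
            arr = np.zeros_like(model.data)
        out[name] = np.asarray(arr)
    return out

# Stand-in for an IFU cube without variance arrays.
cube = SimpleNamespace(data=np.ones((2, 3, 3)))
cube_vars = get_variance_arrays(cube)

# Stand-in for a 2D product that carries only a Poisson variance array.
cal = SimpleNamespace(data=np.ones((3, 3)), var_poisson=np.full((3, 3), 0.5))
cal_vars = get_variance_arrays(cal)
```

With this in place, the extraction code always receives three arrays, and the extracted variances for cubes are simply zero until cube_build propagates real values.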

stscijgbot-jp commented 2 years ago

Comment by Howard Bushouse on JIRA:

Error propagation included in #6014

stscijgbot-jp commented 2 years ago

Comment by Howard Bushouse on JIRA:

Alicia Canipe even though this ticket was originally submitted by Juergen Schreiber I'm going to at least temporarily reassign it back to you now that it's ready for testing (or will be when B7.8 is released) and you can assign it to whoever you like within the INS teams for testing.

stscijgbot-jp commented 2 years ago

Comment by Sarah Kendrew on JIRA:

You can assign the testing to me. Beth Sargent and I produced a notebook for extract1d testing for LRS; I will add a few additional steps to that to look at the error array.

cc Juergen Schreiber feel free to run your own check to make sure it does what you need!

stscijgbot-jp commented 2 years ago

Comment by Juergen Schreiber on JIRA:

I just ran Detector1Pipeline on LRS using JWST pipeline version 1.3.2. The errors in the rate and cal files are now non-zero, but show unrealistically high values, at least at pixels where low signal is measured. This of course also produces unrealistic errors in the x1d files after extract_1d, a factor of 100 or so higher than the signal. I think the imager has the same problem and that this is directly linked to JP-2308.

I attach the resulting x1d file: det_image_seq1_MIRIMAGE_P750Lexp1_nod1_x1d.fits

stscijgbot-jp commented 2 years ago

Comment by Misty Cracraft on JIRA:

Sarah Kendrew Beth Sargent Was this tested in the last round of testing? Or if not, can it be done in the build 7.8 patch testing in the next few weeks?

stscijgbot-jp commented 2 years ago

Comment by Howard Bushouse on JIRA:

Testing in B7.8.x will determine whether the error column(s) are populated now in x1d products, but the issue with anomalously high error values reported recently by Juergen Schreiber won't be fixed until B7.9 (or the current master branch on github) due to the inclusion of JP-2293.

stscijgbot-jp commented 2 years ago

Comment by Beth Sargent on JIRA:

Misty, I am not sure if this was tested in the last round of testing.

stscijgbot-jp commented 2 years ago

Comment by Sarah Kendrew on JIRA:

Just to confirm that Juergen Schreiber ran the test for this and reported further issues. From what Howard Bushouse says this will not be addressed before B7.9, so it is not "ready for testing" anymore at this point.

stscijgbot-jp commented 2 years ago

Comment by Howard Bushouse on JIRA:

If the unusually high error values in the x1d products were due to the problem fixed by JP-2293, then this should be ready for testing now if you use the master branch from github. If you want to wait to test with a formal release, then you'll need to wait for B7.9.

stscijgbot-jp commented 2 years ago

Comment by Juergen Schreiber on JIRA:

I tried it again with the current master (1.3.4 developer version) and it now shows more realistic results: low errors outside the signal, and error values that depend on the signal values. The errors in the signal seem quite low to me (a few per mille of the signal values), but for my needs this ticket could be closed after testing.

spacetelescope / jwst

Spec2Pipeline LRS processing: Extract1d does not produce error column (all 0) in *x1d.fits file #5373