spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
569 stars · 167 forks

Residual fringe correction needs an algorithm #4531

Closed stscijgbot closed 3 years ago

stscijgbot commented 4 years ago

Issue JP-1273 was created by Alicia Canipe:

From [~kgordon]:

stscijgbot commented 3 years ago

Comment by Anton Koekemoer: The status of this was discussed in [JWST Cal WG Meeting 2020-10-27|https://outerspace.stsci.edu/display/JWSTCC/2020-10-27+Meeting+notes]

and is also scheduled for a full presentation and discussion in [JWST Cal WG Meeting 2020-12-08|https://outerspace.stsci.edu/display/JWSTCC/2020-12-08+Meeting+notes]

stscijgbot commented 3 years ago

Comment by Anton Koekemoer: This was presented in [JWST Cal WG Meeting 2020-12-08|https://outerspace.stsci.edu/display/JWSTCC/2020-12-08+Meeting+notes] with the following discussion:

stscijgbot commented 3 years ago

Comment by David Law: Edited the title of this ticket to reflect that it is ready for prioritization and then attention from SCSB.  The attached PDF above details the algorithm, along with example code linked therein.  I'll update the requirements page early in the new year.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~dlaw] You mentioned that you attached a PDF that details the algorithm. Is that the PDF titled JWSTCalWG_20201208_Kavanagh_MIRI_MRS_Residual_Fringe_Correction.pdf, or do you have a more updated one?

 

stscijgbot commented 3 years ago

Comment by David Law: [~morrison] Hm, looks like I never attached the PDF for some reason.  I've done so now.  The information is the same as in the presentation that you linked to, but can go into more detail.  It's still on me to update the requirements page, but I won't be able to get to this until mid-March at soonest unfortunately.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] Where can I get the residual fringe correction CDP? I think it is being delivered in CDP-8, but if I could get a preliminary version it would help me start coding this correction.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] Could you also post the MIRI-RP-005100-NLC document (referred to as AD1 in the residual fringe document) to this ticket?

 

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: Hi [~morrison]. I've attached the CDP and document to the ticket.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] Could I also get added to the permissions for miricap403 so I can get access to the residual fringe correction code?

 

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: Just some further information on the CDP. Indeed, the CDPs will be delivered in CDP8. There will be two, one for each MRS detector. For now, we only have the CDP for the SW detector. I am waiting on some input to the LW detector CDP. The format will be identical.

stscijgbot commented 3 years ago

Comment by Vincent Geers: [~morrison], you're a member of the JWST-MIRI org on GitHub, and should already have (read) access to all our repos, including miricap403. Could you try to see if you can access the code at https://github.com/JWST-MIRI/miricap403

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kgordon] [~kavanagh] I need clarification on whether the MIRI MRS sky-matching information is subtracted from the data before the residual fringe correction. Currently the mrs_imatch step (MIRI MRS sky matching) is run before cube_build. The mrs_imatch step "matches" the image intensities of several input 2D MIRI MRS images by fitting polynomials to cube intensities (cubes built from the input 2D images), in such a way as to minimize, in the least-squares sense, the inter-image mismatches in intensity.

However, the background polynomial information is stored in the header of the cal files and not subtracted until the beginning of the cube_build step. Should we move this subtraction to the beginning of the residual fringe correction step?
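Conceptually, the deferred subtraction amounts to evaluating the stored polynomial over the wavelength grid and removing it from the data. A minimal sketch, where the function name and the flat-plus-slope toy data are illustrative and not the pipeline's actual code:

```python
import numpy as np

def subtract_poly_background(data, coeffs, wave):
    """Evaluate a stored background polynomial over the wavelength grid
    and subtract it from the data.  Coefficients are lowest-order first;
    the real pipeline keeps this information in the cal-file metadata
    (meta.background.polynomial_info) and applies it later in cube_build."""
    background = np.polynomial.polynomial.polyval(wave, coeffs)
    return data - background

# toy example: constant signal of 10 on top of a sloped background
wave = np.linspace(5.0, 7.5, 100)
data = 10.0 + (2.0 + 0.5 * wave)
clean = subtract_poly_background(data, [2.0, 0.5], wave)  # leaves the flat signal
```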

stscijgbot commented 3 years ago

Comment by Karl Gordon: [~morrison]: can you give a bit more background to your question? Does the background need to be subtracted for the residual fringe correction algorithm to work?

stscijgbot commented 3 years ago

Comment by Jane Morrison: Well, that is my question. I was looking at [https://outerspace.stsci.edu/display/JWSTCC/CALWEBB_SPEC3]

and that page implies that background subtraction is done before residual fringe correction. This question is probably more for [~kavanagh], but I wanted both of your thoughts, and to know where the pipeline is currently subtracting the background determined from the mrs_imatch step.

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: [~morrison] [~kgordon] the algorithm will work whether or not the background has been subtracted. Indeed, according to the CALSPEC3 definition in the CDP document, the background is subtracted before the step. I will check this with the MRS WG and report back.

stscijgbot commented 3 years ago

Comment by Karl Gordon: I feel this is more a question for [~dlaw] or [~bushouse].  I'm not as up on the specific details in this case.

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: [~morrison] I consulted [~alvaro.labiano] on this issue and it would be preferable to subtract the background before the residual fringe correction. I copy his reply below for reference. Also, I thought a bit more about the effect of background subtraction on the RFC itself and while it won't affect the removal of fringes as such, it could affect the spectral feature identification in the algorithm if the background levels are large. I don't think these background levels should be too large since dedicated/master background subtraction will have already been performed (but feel free to correct me on that) so the overall effect on the residual fringe corrected output should be small. Nevertheless, it would be better and more rigorous to subtract the backgrounds before residual fringe correction.

 

Alvaro: [The background should be subtracted before] as there could be several corrections applied on detector plane, including optimal extraction. Also, it would be useful to have a background-clean detector image for analysis of complex data and possible residual effects.

 

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh]

I will move the background subtraction into the residual_fringe correction and remove it at the start of the test.

I downloaded the miricap403 code and I am going through residual_fringe_correction.py (in the groundTestData directory), porting the code over into a JWST pipeline step. I am trying to avoid using the specutils package because that is a third-party package still under development. I want to make sure I am understanding the code correctly. The data is read in, and the do_correction module loops over the channels (line 303), then the slices in the channel (line 341), and then the columns. My question is about line 367 [ for col in np.arange(ss_data.shape[1]): ]. I read that as looping over all the columns in the input file, but we are only interested in the columns in the slice, so no valid data is found for a column unless the column is in the slice and the column has an SNR > min_src.

I just want to make sure I have not missed anything and that we are only interested in columns falling on a slice.

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: Yes, you are right. When starting on a slice, the input arrays are multiplied by the slice mask (lines 350-353) so that only the columns on the slice have any signal. The SNR check then skips all empty columns on the detector until it reaches the slice being processed. This was my way of isolating signal from a single slice on the columns, as some columns have two slices crossing them.
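The mask-then-skip logic described above can be sketched as follows. This is a hypothetical helper, not the actual miricap403 code; the crude median/std SNR cut stands in for the DER_SNR estimator discussed later in this thread:

```python
import numpy as np

def columns_to_fit(data, slice_mask, min_snr=5.0):
    """Multiply the detector array by the slice mask so off-slice columns
    are all zero, then apply an SNR cut so empty or faint columns are
    skipped.  Returns the indices of columns worth fitting."""
    ss_data = data * slice_mask              # zero out everything off-slice
    cols = []
    for col in range(ss_data.shape[1]):      # loop over ALL detector columns
        col_data = ss_data[:, col]
        valid = col_data[col_data != 0]
        if valid.size == 0:
            continue                         # column does not cross this slice
        snr = np.median(valid) / (np.std(valid) + 1e-12)
        if snr > min_snr:
            cols.append(col)
    return cols

# demo: a slice covering columns 2-3 of a 6-column detector
demo_data = np.full((10, 6), 100.0)
demo_mask = np.zeros((10, 6))
demo_mask[:, 2:4] = 1.0
fit_cols = columns_to_fit(demo_data, demo_mask)
```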

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] Could you point me at some data that I could use to test this code as I implement it?

I realized most of the data I have for the MRS is simulated data.

 

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh]

 

If you can point me at some ground test data (LVL1 data), I can run the convert-data script to convert it to DMS format and then run the JWST pipeline on the reformatted data. Or if you have already converted some ground test data, that works too.

 

 

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: [~morrison] I'm afraid I don't have any converted ground test data. For initial testing and development, I used LVL3 files produced with DHAS from the FM MRS-OPT-02 dataset. The algorithm on the GitHub repo reads these files directly using astropy.  I don't have the LVL1 files and, now that I think about it, I don't think I can access the RAL server from home to fetch them again. Would it be possible for you to get hold of the LVL1 MRS-OPT-02 files directly? If not I will try to get hold of them from somewhere. Apologies for the inconvenience.

 

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] I just remembered I have some ground data I used to derive the linearity correction. I think I can use that for now. I have to go into my office in the next few days - I can access the RAL server then and I will get some data from the MRS-OPT-02 test then. 

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~dencheva] [~dlaw]

I am hoping one of you can answer this question. In the assign wcs step for Miri, the regions array (that defines where the slices are) uses a level of an 80% throughput (plane 7 in the reference file) to define a label mapper.  The code that I got from the MIRI EC team for the residual fringe correction uses a 50% throughput to define the location of the slices. This 50% level then is used to flag which columns to use in the data.  To find the wavelength for a particular column I am just using the wcs information ra, dec,lambda= input_model.meta.wcs(x,y). I am getting all nans for columns  that are in the slice gap for the 80% throughput plane. Does that make sense  - I think it does. I just wanted to double check with you guys.  I have to see how important it is to use the 50% throughput plane rather than the 80 % throughput plane. 

stscijgbot commented 3 years ago

Comment by David Law: [~morrison] That's right; we're using an 80% throughput level to define the WCS as we want to be fairly selective about only using the highest-throughput pixels for science.  Anything with a defined WCS can end up mapping to the cubes.  The regions array does contain other throughput masks though, as some steps benefit from using different levels (e.g., for straylight we need a level where we're robustly discarding light from real slices).

I'd have to look at the residual fringe code to see, but my feeling is that it should be able to work OK with the 80% level. The difference between 80% and 50% is quite small.
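For illustration, switching throughput levels amounts to picking a different plane of the regions array when building the slice mask. A sketch, assuming a (plane, y, x) layout with slice labels as pixel values; the plane indices and layout here are assumptions, not the actual MIRI MRS reference-file definition:

```python
import numpy as np

def slice_mask_from_regions(regions, plane, slice_id):
    """Build a boolean mask for one slice from a 3-D (plane, y, x)
    regions array whose planes correspond to throughput levels and
    whose pixel values are slice labels (0 = slice gap)."""
    return regions[plane] == slice_id

# demo: the "80%" plane marks a narrower slice than the "50%" plane
demo_regions = np.zeros((2, 4, 4), dtype=int)
demo_regions[0, :, 1] = 3       # hypothetical 80% plane: slice 3 covers one column
demo_regions[1, :, 1:3] = 3     # hypothetical 50% plane: slice 3 covers two columns
mask80 = slice_mask_from_regions(demo_regions, 0, 3)
mask50 = slice_mask_from_regions(demo_regions, 1, 3)
```

Columns outside the chosen mask are exactly where the WCS returns NaN wavelengths, which is why the 80% plane yields NaNs in the wider 50% slice gaps.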

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: [~morrison] [~dlaw] There is no substantive reason to use the 50% throughput mask in the residual fringe correction. I think I just selected this early on in development as I wasn't sure what the pipeline was using. It makes more sense to use the 80% mask.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] We are trying to avoid using third-party packages. Do you know of a routine in astropy that could be used as a replacement for specutils Spectrum1D?

I am also looking into what I can already use in astropy for the Bayesian library utils.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] Since specutils is basically used to find the SNR for a column of data, there must be a simpler test we can do, just using the data, to decide whether or not we need to fit the column.

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: [~morrison] The SNR determination in itself is fairly straightforward. I pulled out the relevant line of code from the snr_derived tool in specutils, and passing the column data directly to this works fine, so there is no need for specutils or units (see snippets below). Would this be OK to use? I chose the DER_SNR method to decide whether or not to fit a column as it seemed to be the most robust method for all types of spectra that were tested. Do you have another method in mind?

 

-------------------

Current RFC SNR:

{code:python}
from specutils.analysis import snr_derived
from specutils import Spectrum1D
import astropy.units as u
import numpy as np

test_wnum = col_wnum.copy()
test_flux = col_data.copy()
check_spectrum = Spectrum1D(flux=test_flux[::-1] * u.Jy, spectral_axis=test_wnum[::-1] / u.cm)

snr = snr_derived(check_spectrum, region=None)
print('Current RFC SNR = {}'.format(snr))
{code}

Current RFC SNR = 30.993762814131102

-------------------

Suggested RFC SNR:

{code:python}
import numpy as np

n = len(col_data)
signal = np.median(col_data)
noise = 0.6052697 * np.median(np.abs(2.0 * col_data[2:n-2] - col_data[0:n-4] - col_data[4:n]))
snr = signal / noise
print('New RFC SNR = {}'.format(snr))
{code}

New RFC SNR = 30.993762814131102
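The snippets above depend on col_wnum and col_data from the step. A self-contained check of the same DER_SNR formula on synthetic data (assumed setup: constant signal of 50 with unit Gaussian noise, so the estimate should land near SNR = 50):

```python
import numpy as np

def der_snr(flux):
    """DER_SNR estimate: median signal over a noise estimate built from
    median absolute second differences.  The 0.6052697 factor converts
    that median to a Gaussian sigma."""
    n = len(flux)
    signal = np.median(flux)
    noise = 0.6052697 * np.median(np.abs(2.0 * flux[2:n-2] - flux[0:n-4] - flux[4:n]))
    return signal / noise

# synthetic column: constant signal of 50 with unit Gaussian noise
rng = np.random.default_rng(42)
snr = der_snr(50.0 + rng.normal(0.0, 1.0, 500))
```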

 

 

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] Thanks for the alternate SNR calculator.

On the test data I have, I believe the step is working. We now need to sort out what intermediate products the team wants written out (when save_intermediate_results = True is set).

I am figuring the stat_table would be useful. In the JWST pipeline we don't have steps that plot results. For testing it might be useful to somehow keep some of this logic in the step, but I was thinking of saving the output (if save_intermediate_results = True) and then having a plotting routine that could look at the results. This plotting would not be part of the step; it would be just for the team to use for testing. The diagnostic_plot routine seems to be able to take different kinds of results and plot them, so maybe we could expand on that routine (again, just for testing and the team to use, not something for the entire community). First: does the team want additional output written out that could be plotted to test how the step is working? If so, could you advise what data you want and, since it would be written at the end of the step, what type of data you would want in this output?

 

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] I have a question on the slice edges. As you know, many of the slice edges can have little valid data. After multiplying the column data by all_slice_masks, the data not in the slice is zero, and the code for deriving the SNR gets a noise of zero because np.median(np.abs(2.0 * col_data[2:n-2] - col_data[0:n-4] - col_data[4:n])) returns zero.

So I could first pull out the valid data (non-zero and not NaN) in the column and redefine the column to be this new column of valid data, or, if the noise is zero, not fit the column. I am guessing you want the second option, but I wanted to check.
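Either option could be folded into the SNR helper itself. A sketch combining both (trim to valid samples, then skip when the noise estimate is zero); this is hypothetical code, not the delivered step:

```python
import numpy as np

def safe_der_snr(col_data):
    """DER_SNR with two guards: trim the column to valid samples
    (finite, non-zero), then return SNR = 0 when the noise estimate is
    zero so the caller skips the fit instead of dividing by zero."""
    col = col_data[np.isfinite(col_data) & (col_data != 0)]
    n = len(col)
    if n < 5:                     # too few samples for second differences
        return 0.0
    noise = 0.6052697 * np.median(np.abs(2.0 * col[2:n-2] - col[0:n-4] - col[4:n]))
    if noise == 0.0:              # e.g. a nearly empty slice-edge column
        return 0.0
    return np.median(col) / noise

empty_snr = safe_der_snr(np.zeros(20))       # fully masked-out column
flat_snr = safe_der_snr(np.full(20, 5.0))    # constant column, zero noise
real_snr = safe_der_snr(50.0 + np.sin(np.arange(100)))
```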

 

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~kavanagh] Have you had a chance to talk with other MIRI team members about what sort of intermediate output the team wants? The step would not produce plots, but we could put in hooks to write out data that could later be plotted by the MIRI team. If a meeting would be useful to hash this out, let me know and we can set up a WebEx meeting to go through this.

 

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: [~morrison] Apologies for not getting back to you sooner. Very busy this week getting the CAP tools updated for the rehearsal.

First, it's great that you have the step working!

Yes, the stat_table should be an intermediate output as it is a record of how well the step performed. I agree about the plotting. All of that code was just for testing and development. We can move those tools to a plotting routine that reads from the outputs.

I had some thoughts on what other intermediate output would be good to have. I have a modified version of the code that saves all of the fit and periodogram results for each column which I will use to update the CDP during commissioning. If it is possible to add a hook to save this data then that would be fantastic. 

I think a brief meeting would definitely be useful, just for me to get an idea of what is in the step output and to discuss the intermediate output. I'm generally free in the afternoons (your mornings) so whatever time suits you is fine with me.

 

 

stscijgbot commented 3 years ago

Comment by Jane Morrison:  

[~bushouse]

If the residual fringe correction is run, it needs to first subtract the residual background determined from the mrs_imatch step (if it exists), then apply the residual fringe correction to the data and return the updated model. This data will then be passed to cube_build, and cube_build needs to know whether the background from mrs_imatch has already been subtracted. If the residual fringe correction has been skipped, then cube_build needs to subtract the residual background.

I think we need to add a new value to the core schema to record how this residual background is handled, similar to how a subtracted background image is tracked. From the core schema we have, for normal backgrounds (cut and paste below); I think we just need to add something similar to keep track of whether the residual background determined from mrs_imatch has been subtracted.

{code}
subtracted:
    title: Has background been subtracted from data?
    type: boolean
    fits_keyword: BKGSUB
    blend_table: True
{code}

stscijgbot commented 3 years ago

Comment by Howard Bushouse: From what I can tell, the mrs_imatch, skymatch, outlier_detection, and cube_build steps already pay attention to (and set, when necessary) this meta attribute (model.meta.background.subtracted). Hence the new residual fringe correction step can check this existing keyword to see if the background has already been subtracted (it usually won't be) and, if not, subtract the background and set BKGSUB=True. When the data are then fed to cube_build (or whatever comes after), it will see that the background has already been subtracted and not subtract it again. So I think most of what you need is already in place.

stscijgbot commented 3 years ago

Comment by Jane Morrison: I am not talking about the standard background subtraction; I am talking about the residual background determined by the mrs_imatch step. The mrs_imatch step determines the residual background but does not apply it; it just stores the information to be used later in cube_build.

The core mapping from detector to sky (where this residual background is subtracted) is done in ifu_cube.py: lines 1447 to 1452, in map_detector_to_outputframe, subtract this background.

It is a little ;) hidden. It is just a residual background, but it really should be subtracted before the residual fringe correction.

stscijgbot commented 3 years ago

Comment by Howard Bushouse: Doing a little more looking in the mrs_imatch and cube_build code, it looks to me like the mrs_imatch step sets meta.background.subtracted to True if it subtracts background from the 2D MRS images, and either leaves the attribute as None or sets it to False if it does not subtract the background. The lines you pointed me to in ifu_cube.py look to see if the mrs_imatch step has populated the meta.background.polynomial_info attribute (signifying that background was computed) and, if so, subtract the background values previously computed and stored in the polynomials by mrs_imatch. So why can't the new residual fringe step also look for those polynomials, subtract them, and then set meta.background.subtracted to True, just like mrs_imatch would have done if it had subtracted the background? Then just update that little section in ifu_cube.py to check the status of meta.background.subtracted, instead of only looking for the existence of the polynomials, and if meta.background.subtracted is True, don't subtract the background a second time.
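This bookkeeping can be sketched with a stand-in model object. SimpleNamespace is used here in place of a real jwst datamodel, and evaluate_background() is a hypothetical hook for the polynomial evaluation done in ifu_cube.py:

```python
import numpy as np
from types import SimpleNamespace

def maybe_subtract_mrs_background(model):
    """Subtract the mrs_imatch background only if polynomial info exists
    and it has not already been subtracted, then record that fact so a
    later step (e.g. cube_build) skips it."""
    bkg = model.meta.background
    if bkg.polynomial_info is None or bkg.subtracted:
        return model                      # nothing stored, or already done
    model.data = model.data - model.evaluate_background()
    bkg.subtracted = True
    return model

# demo on a fake model: the subtraction happens once and only once
model = SimpleNamespace(
    data=np.full(5, 3.0),
    evaluate_background=lambda: np.ones(5),
    meta=SimpleNamespace(background=SimpleNamespace(
        polynomial_info='coeffs', subtracted=False)),
)
maybe_subtract_mrs_background(model)
maybe_subtract_mrs_background(model)      # second call is a no-op
```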

stscijgbot commented 3 years ago

Comment by Howard Bushouse: I'm not exactly sure what you mean by the "residual background" and not the main or regular background subtraction. As far as I know, the mrs_imatch step is the only step in the pipeline to compute and potentially subtract background specifically for IFU/MRS mode. The generic background step in the calwebb_spec2 pipeline can also be applied, but that step doesn't do anything with any of the attributes located within the meta.background tree. The only thing it sets to signify that it's been applied is setting the S_BKGSUB keyword to True. So they seem to be completely independent to me.

stscijgbot commented 3 years ago

Comment by Jane Morrison: I should change cube_build to look at whether meta.background.subtracted is True and, if so, not do it. I thought meta.background.subtracted was for subtracting a background image or a master background, but looking at the code I think that if those steps are done, then

meta.cal_step.back_sub = 'COMPLETE'

So meta.background.subtracted seems to only be set if the residual background from the mrs_imatch step is subtracted. If that is correct, then yes, I should be checking that value. Thanks for clearing that up.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~bushouse]  [~dencheva]  [~eisenham] After discussion with [~jdavies] and [~eslavich] it was suggested that the best location for the 3rd party package is jwst/extern.

setup.cfg is set up so Flake8 and tests are ignored in jwst/extern.

So I think I will just make a directory in jwst/extern for this code and add the code in a single commit.

[~bushouse], [~eisenham] does this seem like the logical place to put this code?

 

 

stscijgbot commented 3 years ago

Comment by Howard Bushouse: Sounds good to me. And that way - having it all self-contained in one place - will also make it easy to remove in the future, if/when we get it all replaced with something else.

stscijgbot commented 3 years ago

Comment by Howard Bushouse: Baseline implementation done in https://github.com/spacetelescope/jwst/pull/6211

stscijgbot commented 3 years ago

Comment by Patrick Kavanagh: Hi [~morrison]. I've installed the latest jwst with the new step. I can't seem to get it to run. It raises a ValidationError (see below) which I think is related to CRDS. I've tried overriding the fringefreq reference file with my local version but it doesn't seem to help. Is there something I'm missing?

Here are the lines of code and error:

{code:python}
from jwst.residual_fringe import ResidualFringeStep

result = ResidualFringeStep.call(input, override_fringefreq='MIRI_FM_MIRIFUSHORT_FRINGEFREQ_8B.00.00.fits')
{code}

{code}
ValidationError                           Traceback (most recent call last)
/var/folders/16/gpzts38j5p5dxjw1cl7tm2b00000gp/T/ipykernel_23923/733709864.py in <module>
      1 filename = 'fringe_flat_cor_photomstep.fits'
      2
----> 3 result = ResidualFringeStep.call(filename, override_fringefreq='MIRI_FM_MIRIFUSHORT_FRINGEFREQ_8B.00.00.fits')
      4
      5

~/anaconda3/anaconda3/envs/jwst_rfc/lib/python3.9/site-packages/stpipe/step.py in call(cls, *args, **kwargs)
    585
    586         name = crds_config.get('name', None)
--> 587         instance = cls.from_config_section(crds_config,
    588             name=name, config_file=config_file)
    589

~/anaconda3/anaconda3/envs/jwst_rfc/lib/python3.9/site-packages/stpipe/step.py in from_config_section(cls, config, parent, name, config_file)
    247         spec = cls.load_spec_file()
    248         config = cls.merge_config(config, config_file)
--> 249         config_parser.validate(
    250             config, spec, root_dir=dirname(config_file or ''))
    251

~/anaconda3/anaconda3/envs/jwst_rfc/lib/python3.9/site-packages/stpipe/config_parser.py in validate(config, spec, section, validator, root_dir, allow_missing)
    361
    362         if len(messages):
--> 363             raise ValidationError('\n'.join(messages))
    364     finally:
    365         config.main.configspec = orig_configspec

ValidationError: Config parameter 'self.suffix': missing
{code}

 

stscijgbot commented 3 years ago

Comment by David Law: I'd imagine CRDS is complaining about this being an unknown reference file type, as the file doesn't yet exist in CRDS. If the format as delivered in CDP-8 works with the as-implemented PR, we can look into getting that delivered.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~pkavanagh] [~eisenham]

I think there are two problems. For testing I ran using a residual_fringe.cfg file; I think that needs to be added to the CRDS location.

But there is also another problem. [~eisenham], I am still struggling with the suffix issue. I added self.suffix to the spec, but now I get an error when I run the residual fringe correction:

{code}
Traceback (most recent call last):
  File "/Users/morrison/opt/anaconda3/envs/421_dev/bin/strun", line 28, in <module>
    step = Step.from_cmdline(sys.argv[1:])
  File "/Users/morrison/opt/anaconda3/envs/421_dev/lib/python3.9/site-packages/stpipe/step.py", line 173, in from_cmdline
    return cmdline.step_from_cmdline(args)
  File "/Users/morrison/opt/anaconda3/envs/421_dev/lib/python3.9/site-packages/stpipe/cmdline.py", line 331, in step_from_cmdline
    just_the_step_from_cmdline(args, cls)
  File "/Users/morrison/opt/anaconda3/envs/421_dev/lib/python3.9/site-packages/stpipe/cmdline.py", line 287, in just_the_step_from_cmdline
    raise ValueError(str(e))
ValueError: Config parameter 'self.suffix': missing
{code}

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~eisenham] - MY BAD. I should have set 'suffix' in the step spec, not 'self.suffix'.

I will fix that now

 

stscijgbot commented 3 years ago

Comment by Howard Bushouse: [~pkavanagh] yes, until the new ref file type is at least defined in the CRDS imap/rmaps, it's likely that even the use of an override like override_fringefreq='MIRI_FM_MIRIFUSHORT_FRINGEFREQ_8B.00.00.fits' won't work, because the system doesn't recognize the "fringefreq" as a valid override type.

stscijgbot commented 3 years ago

Comment by Jane Morrison: [~pkavanagh] I forgot to add residual_fringe.cfg to the pipeline directory holding all the cfgs. I also found an error when the input is a filename rather than an association. I have a PR in to fix both errors; when that gets merged you can run the step.

This is how I have run the step on an association

strun residual_fringe.cfg l3_asn.json  --override_fringefreq='MIRI_FM_MIRIFUSHORT_FRINGEFREQ_8B.00.00.fits'

or, if you have a single file you want to apply the step to:

strun residual_fringe.cfg filename.fits   --override_fringefreq='MIRI_FM_MIRIFUSHORT_FRINGEFREQ_8B.00.00.fits'

We need to get the reference files into CRDS, but you should still be able to run the step. You will get a CRDS error:

CRDS - ERROR - Error determining best reference for 'pars-residualfringestep' = Unknown reference type 'pars-residualfringestep'

but the step continues on and uses the override reference file.