PAHFIT / pahfit

Model Decomposition for Near- to Mid-Infrared Spectroscopy of Astronomical Sources
https://pahfit.readthedocs.io/

Training on real JWST data #206

Closed: els1 closed this issue 3 months ago

els1 commented 2 years ago

Train PAHFIT on different types of sources to optimize the science packs. Provide a training sample in a separate repository.

jdtsmith commented 2 years ago

Is anyone working on this? It would be great to have a realistic JWST/MIRI/NIRSpec set of training spectral segments ready to go for testing/optimization of the model.

jdtsmith commented 2 years ago

Sounds like @drvdputt had some good ideas to collapse his MIRI-segment cubes into matched 1D spectra. Would be great if you could take a look at that, for inclusion in data/.

drvdputt commented 2 years ago

I have been looking at the MIRI cubes of VV114. The first thing I wanted to see is how big the jumps in the spectrum are when a 1d extraction is done.

I found a way to do a basic aperture-based extraction without having to loop over the wavelength slices. One can use photutils to make a SkyAperture circling the area of interest, which can then be converted into a pixel mask tailored to each cube. The extraction then comes down to a numpy multiplication of each cube with the mask, followed by summing the result over the spatial axes. See to_pixel() and to_image() in this explanation: https://photutils.readthedocs.io/en/stable/aperture.html.
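The loop-free extraction can be sketched like this (a minimal numpy-only illustration; in practice the mask would come from photutils via SkyCircularAperture → to_pixel(wcs) → to_mask() → to_image(shape), and the function and variable names here are hypothetical):

```python
import numpy as np

def aperture_spectrum(cube, mask):
    """Extract a 1D spectrum from a cube using a 2D aperture mask.

    cube: array of shape (n_wave, ny, nx)
    mask: array of shape (ny, nx), values in [0, 1]

    Broadcasting multiplies every wavelength slice by the same mask,
    so no explicit loop over slices is needed; we then sum over the
    two spatial axes.
    """
    return (cube * mask).sum(axis=(1, 2))

# Synthetic stand-ins: a flat cube and a circular pixel mask
# (in the real case, photutils produces the mask from a sky aperture).
ny = nx = 20
yy, xx = np.mgrid[:ny, :nx]
mask = ((yy - 10) ** 2 + (xx - 10) ** 2 <= 5 ** 2).astype(float)
cube = np.ones((100, ny, nx))

spec = aperture_spectrum(cube, mask)
print(spec.shape)  # (100,)
```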

Here is channel 4, collapsed over all the pixels in black, and collapsed over the aperture mask in red (and normalized to make comparing the jumps easier). The jumps are clearly smaller. So I think we can use this method as a quick way to get test spectra.

[bqplot screenshot: channel 4, full-cube collapse (black) vs. aperture-mask collapse (red)]

jdtsmith commented 2 years ago

This is great, thanks. On the call yesterday we discussed how the "12 cubes" output of the pipeline is preferable for us (@ThomasSYLai may know more about how easy these are to get hold of). It would be very, very helpful for the new model developments to have a set of 1D spectral extractions from your aperture approach for the 12 MIRI MRS cubes (+NIRSpec segments in its various modes). They don't have to be perfect to get started, and we can always replace them.

drvdputt commented 2 years ago

I was able to download the Stage 2 products for VV114 and run the Spec3 pipeline on them with the 'band' option for cube_build, giving me the 12-cube output.
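For reference, the per-band cube building can be requested roughly like this (a sketch assuming the jwst `strun` command-line interface; the association file name is a placeholder):

```
strun calwebb_spec3 my_spec3_asn.json --steps.cube_build.output_type=band
```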

However, there seem to be some holes in those cubes (pixels flagged as NON_SCIENCE, i.e. DQ = 512). Strangely enough, the per-channel cubes downloaded from MAST did not have this problem. I checked the metadata, and the files on MAST were reduced with an older version of the pipeline (1.5.3 instead of 1.6.2).
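For anyone checking their own cubes: since DQ is a bitmask, NON_SCIENCE pixels are best located with a bitwise test rather than an equality check (a minimal sketch; the helper name is hypothetical, and 512 is the NON_SCIENCE bit value mentioned above):

```python
import numpy as np

NON_SCIENCE = 512  # DQ bit value for NON_SCIENCE, as noted above

def non_science_mask(dq):
    """Boolean mask of pixels carrying the NON_SCIENCE bit.

    A bitwise AND is used because DQ values can combine several flags
    (e.g. 513 = NON_SCIENCE | DO_NOT_USE), so `dq == 512` would miss
    pixels that carry additional flags.
    """
    return (dq & NON_SCIENCE) != 0

dq = np.array([[0, 512],
               [513, 4]])
print(non_science_mask(dq))
```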

Bottom line is, I have a set of 12 cubes, with 1D extractions, but there are some issues.

Here's a quick preview of what I have so far (notice some of the jumps): Figure_1

And the extraction aperture (the highlighted circle in the image is just a visual aid), with the white pixels being the holes (some appear in the same place in every slice, and some differ from slice to slice): Figure_2

jdtsmith commented 2 years ago

This is great, thanks Dries. Questions: are these the full wavelength range for each segment? Are these background-subtracted? Are the low spikes in the spectra the "holes" you mention? You'd think dithering would fill them in.

Definitely some residual segment mismatches. I wonder what David Law's pipeline is doing to correct those. Also the fringing seems less than in the version @alexmaragko shared.

It would be great if you could stick them in data/ for now; we can always update that when we have an improved set of spectra.

drvdputt commented 2 years ago

Some answers:

  1. It is the full wavelength range (or at least, whatever wavelength grid the cube_build step generates)
  2. The background subtraction occurs in the master background step of Spec3.
  3. I thought I was using only half the exposures. The numbers in the file names are quite cryptic... I was using only those starting with jw01328020001, and not those with jw01328021001, because the latter were not listed in the association file I was using. But that is because the latter are background exposures (you can find this information in the association pool file).

drvdputt commented 2 years ago

Another update: running the whole pipeline starting from the raw data, and properly separating the background and object exposures yields good quality cubes!

When the rest of the reduction is finished, I'll be able to provide usable cubes (with fringing still present).

Afterwards, I will try to apply the residual fringe step.
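If it helps anyone else: my understanding from the jwst docs is that the residual fringe step lives in calwebb_spec2 and is skipped by default, so it can be switched on with a step override (a sketch; the input file name is a placeholder):

```
strun calwebb_spec2 my_exposure_rate.fits --steps.residual_fringe.skip=False
```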

jdtsmith commented 3 months ago

Lots of progress across multiple teams here, closing.