bjfultn opened this issue 1 year ago
Below are notes from a discussion between @RussLaher and @awhoward about implementation of Quality Control in the KPF DRP.
Notes:
Here is some (rough!) pseudocode showing how QC would be implemented. This is based on an analogy with QLP run in a Jupyter Notebook and would need to be translated into multiple steps as described below in the notes about recipe implementation.
myQCL0 = QCL0(L0_instance)
value1 = myQCL0.QC_function1()
myQCL0.write_to_header(QC_name1, value1)
myQCL0.write_to_db(QC_name1, value1)
... <additional pipeline operations in the L0 recipe> ...
myQCL0b = QCL0(L0_instance)  # open a new instance of QCL0 that includes updates to the L0 object
QC_result1_boolean = myQCL0b.read_from_header(QC_name1)
if QC_result1_boolean:
    value2 = myQCL0b.QC_function2()
    myQCL0b.write_to_header(QC_name2, value2)  # write to header of L0 object (not L0 file)
    myQCL0b.write_to_db(QC_name2, value2)  # a read_from_db() method is also needed
... <recipe continues on to 2D and L1 processing> ...
myQCL1 = QCL1(L1_instance)  # note that QC from L0 and 2D is inherited
QC_result_L0_name1 = myQCL1.read_from_header(QC_name1)
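The pseudocode above can be sketched as a runnable toy. Everything here is an assumption based on the pseudocode, not the actual KPF DRP API: the class name `QCL0`, the `write_to_header`/`read_from_header` methods, the `QC_` keyword prefix, and the dict-like `.header` attribute are all hypothetical.

```python
class QCL0:
    """Toy QC wrapper: records QC results in the header of a wrapped L0 object."""

    def __init__(self, l0):
        self.l0 = l0  # assumed to expose a dict-like .header attribute

    def write_to_header(self, qc_name, value):
        # A common prefix keeps QC keywords grouped together in the header.
        self.l0.header['QC_' + qc_name] = value

    def read_from_header(self, qc_name):
        return self.l0.header.get('QC_' + qc_name)


class FakeL0:
    """Stand-in for an L0 object, used only for this illustration."""
    def __init__(self):
        self.header = {}


l0 = FakeL0()
qc = QCL0(l0)
qc.write_to_header('NAME1', True)

# A later QCL0 instance wrapping the same L0 object sees the earlier result,
# mirroring the myQCL0b step in the pseudocode.
qc2 = QCL0(l0)
if qc2.read_from_header('NAME1'):
    qc2.write_to_header('NAME2', 42)
```

The point of the design is that QC results travel with the data product itself (in its header), so downstream recipe steps and later data levels can gate on earlier checks.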
Notes on the process of a developer creating a new QC function

Consider how Quicklook is executed in a recipe as an analogy to build on for QC: recipes/quicklook_match.recipe contains the line Quicklook(open_file, output_dir, level), which runs a module framework class called Quicklook that lives in modules/quicklook/src/quicklook_prim.py. That class in turn uses modules/Utils/analyze_l0.py, which has a method for each analysis and plotting task.

Exposures which have one (or both) of the CCDs "smeared" due to the readout async issue should be flagged/marked as junk. These can likely be identified by comparing total flux / inter-order flux / intra-order flux against a good frame. Example here: https://jump.caltech.edu/observing-logs/kpf/KP.20230829.71496.04
Another issue, which may be related to the DRP failing in /code/KPF-Pipeline/modules/radial_velocity/src/midpoint_photon_arrival.py with

[KP.20230829.67611.84.log][ERROR]: Failed executing primitive BaryCorrTable: index 5 is out of bounds for axis 0 with size 5

is negative flux in the exposure meter. Here's an example: https://jump.caltech.edu/observing-logs/kpf/KP.20230829.67611.84. I'm not sure whether this issue will go away with some change to how the exposure meter table is generated, but picking it up in a quality control check could help the DRP navigate how to proceed correctly and at least generate L2 files with the usable bits of the data. Currently ~25-50% of SoCal data from the last couple of weeks has this problem and fails L1->L2 processing as a result.
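A QC check along these lines might simply count negative samples in the exposure meter data. This is only a sketch: the function name, the array-based input, and the pass/fail return convention are assumptions, and the DRP's actual exposure meter table format may differ.

```python
import numpy as np

def exposure_meter_negative_flux_check(em_flux, threshold=0.0):
    """Return (passed, n_negative) for an exposure meter flux table.

    em_flux: 2D array of exposure meter fluxes (time samples x wavelength bins).
    passed is True only if no sample falls below the threshold.
    Sketch only; not the DRP's actual table interface.
    """
    em_flux = np.asarray(em_flux, dtype=float)
    n_negative = int(np.sum(em_flux < threshold))
    return n_negative == 0, n_negative


ok, n_bad = exposure_meter_negative_flux_check([[1.0, 2.0], [3.0, -0.5]])
# ok is False and n_bad is 1 for this toy input
```

Recording the count (not just a boolean) in the header would let downstream steps distinguish a single glitchy sample from a wholly corrupted table.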
Thanks, @rrubenza! I was thinking about a QC function to identify the 'stuck-at-start' frames (#1 in your list above). I don't think there are FITS headers that will reveal this, but the image should have enough information. I think the key is that the image has much more flux than a bias or dark (at least for an exposure with a light source), but that light is spread out over many pixels and is not concentrated in the spectral orders. So the check could compare the total flux in the image to the fraction of that flux falling within the spectral orders.
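One way the idea above might be sketched in code: flag frames that are much brighter than a bias/dark but whose light is not concentrated in the orders. The function name, the boolean order mask input, and both thresholds are placeholders to be tuned on real data, not anything from the DRP.

```python
import numpy as np

def stuck_at_start_check(image, order_mask, flux_per_pix_min=10.0, order_frac_min=0.5):
    """Return True if the frame looks normal, False if it looks 'stuck-at-start'.

    image: 2D array of pixel values (bias-subtracted).
    order_mask: boolean array, True on pixels inside the spectral orders.
    Thresholds are illustrative placeholders.
    """
    image = np.asarray(image, dtype=float)
    total = image.sum()
    bright = total > flux_per_pix_min * image.size  # much more flux than a bias/dark
    if not bright:
        return True  # bias/dark-like frame; this check does not apply
    in_orders = image[np.asarray(order_mask, dtype=bool)].sum()
    return bool(in_orders / total > order_frac_min)  # is the light in the orders?


# Toy example: a normal frame concentrates flux in the order mask,
# a stuck/smeared frame spreads the same flux uniformly.
mask = np.zeros((10, 10), dtype=bool)
mask[4:6, :] = True
good = np.zeros((10, 10))
good[mask] = 1000.0
smeared = np.full((10, 10), 200.0)
```

The low-SNR concern raised below is exactly why the thresholds would need tuning against faint red-star exposures before deployment.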
Andrew, we should check that this works for very low SNR spectra. We don't want to throw anything out accidentally. Some of the very red stars have very low flux on the green CCD. Even checking for a 95% threshold on flux may not be sufficient.
I think a check on header keywords should be done for all images. This can start as a simple count of header keywords; often a missing header keyword is an indication of bigger problems. A more complex version would compare against a set of master keywords that never change, and a set of master keywords that must fall within some acceptable range.
One way to quality control wavelength solutions is to create a histogram of the differences between a test wavelength solution and some static wavelength solution. The x-axis should be presented in m/s rather than Angstroms or nanometers. The peak of the distribution would give a daily (or weekly) offset, while the tails of the distribution would be indicative of specific orders that are performing poorly.
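The unit conversion behind this is dv/c = dlambda/lambda. A minimal sketch of computing the per-pixel differences in m/s (function and variable names here are assumptions, not DRP code):

```python
import numpy as np

C_MPS = 299792458.0  # speed of light [m/s]

def wls_diff_mps(test_wls, ref_wls):
    """Per-pixel wavelength-solution difference expressed as a velocity in m/s.

    Uses dv/c = dlambda/lambda. Inputs are wavelength arrays in the same
    units (e.g. Angstroms). Sketch only; names are not from the DRP.
    """
    test_wls = np.asarray(test_wls, dtype=float)
    ref_wls = np.asarray(ref_wls, dtype=float)
    return C_MPS * (test_wls - ref_wls) / ref_wls


# Toy check: inject a uniform 100 m/s Doppler-like shift and recover it.
ref = np.array([5000.0, 6000.0, 7000.0])
test = ref * (1.0 + 100.0 / C_MPS)
```

Feeding `wls_diff_mps(test, ref)` into `np.histogram` then gives the distribution described above: the peak is the bulk offset, and extended tails point at poorly performing orders.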
There are several observing scenarios in which the sky flux can be problematic in determining precise RVs. Twilight observations, those near the moon, and those taken in heavy clouds could all be corrupted by sky. A quality control check for sky could be a ratio of stellar flux to sky flux for specific wavelengths, or an absolute amount of sky flux present that exceeds some threshold. A flag could be thrown for observations when the ratio exceeds some threshold to be determined. There is already a QLP plot that shows this ratio, so the code to do this already exists.
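A sketch of the flux-ratio version of this check. The function name, the integrated-flux inputs, and the 0.1 default threshold are all placeholders (the comment above notes the threshold is to be determined, and that the QLP already computes this ratio):

```python
import numpy as np

def sky_contamination_check(sci_flux, sky_flux, max_sky_to_sci=0.1):
    """Return (passed, ratio) comparing integrated sky flux to science flux.

    sci_flux, sky_flux: arrays of extracted flux from the SCI and SKY fibers
    (could be restricted to specific wavelength ranges). The threshold is an
    illustrative placeholder to be tuned on real data.
    """
    sci = float(np.sum(sci_flux))
    sky = float(np.sum(sky_flux))
    ratio = sky / sci if sci > 0 else float('inf')
    return ratio <= max_sky_to_sci, ratio


ok, r = sky_contamination_check([100.0, 100.0], [5.0, 5.0])
# ok is True; r == 0.05 for this toy input
```

Writing the ratio itself to the header (not just pass/fail) would let the threshold be revisited later without reprocessing.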
Another issue that has appeared is some files missing RED CCD data. In some cases, the system failed to take a RED exposure and as a result the L0 extension names include: GREEN_AMP1, GREEN_AMP2, RED (instead of GREEN_AMP1, GREEN_AMP2, RED_AMP1, RED_AMP2), where hdu['RED'].data is empty. A simple check that the extension names containing 'RED' include '_AMP*', or that the data array is non-empty, would easily catch and flag these cases.
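This check might look something like the following. It operates on plain names and sizes rather than an HDUList so it is self-contained; the function name and inputs are hypothetical, and a real implementation would inspect `hdu[name].data` directly.

```python
def red_extensions_check(extension_names, data_sizes):
    """Return True if RED CCD data is present as non-empty per-amplifier extensions.

    extension_names: list of L0 HDU names, e.g. ['GREEN_AMP1', 'RED_AMP1', ...].
    data_sizes: dict mapping extension name -> number of data elements (0 if empty).
    Sketch only; a real version would read these from the FITS HDUList.
    """
    red_amps = [name for name in extension_names if name.startswith('RED_AMP')]
    return bool(red_amps) and all(data_sizes.get(name, 0) > 0 for name in red_amps)


good = ['GREEN_AMP1', 'GREEN_AMP2', 'RED_AMP1', 'RED_AMP2']
bad = ['GREEN_AMP1', 'GREEN_AMP2', 'RED']  # failed RED exposure from the example above
sizes = {name: 1024 for name in good}
```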
@howardisaacson -- you commented above about having a basic set of L0 keywords that a QC check looks for. Below is a preliminary list (which will be in L0_header_keywords_present_check() in modules/quality_control/src/quality_control.py). Do you have some to add? We should also consider whether there are cases when it's okay not to have certain keywords (e.g., telescope-related keywords when we're only taking calibrations).
essential_keywords = [
'DATE-BEG', #Start of exposure from kpfexpose
'DATE-MID', #Halfway point of the exposure (unweighted)
'DATE-END', #End of exposure
'EXPTIME', #Requested exposure time
'ELAPSED', #Actual exposure time
'PROGNAME', #Program name from kpfexpose
'OBJECT', #Object name
'TARGRA', #Right ascension [hr] from DCS
'TARGDEC', #Declination [deg] from DCS
'TARGEPOC', #Target epoch from DCS
'TARGEQUI', #Target equinox from DCS
'TARGPLAX', #Target parallax [arcsec] from DCS
'TARGPMDC', #Target proper motion [arcsec/yr] in declination from DCS
'TARGPMRA', #Target proper motion [s/yr] in right ascension from DCS
'TARGRADV', #Target radial velocity [km/s]
'AIRMASS', #Airmass from DCS
'PARANTEL', #Parallactic angle of the telescope from DCS
'HA', #Hour angle
'EL', #Elevation [deg]
'AZ', #Azimuth [deg]
'LST', #Local sidereal time
'GAIAID', #GAIA Target name
'2MASSID', #2MASS Target name
'GAIAMAG', #GAIA G band magnitude
'2MASSMAG', #2MASS J band magnitude
'TARGTEFF', #Target effective temperature (K)
'OCTAGON', #Selected octagon calibration source (not necessarily powered on)
'TRIGTARG', #Cameras that were sent triggers
'IMTYPE', #Image Type
'CAL-OBJ', #Calibration fiber source
'SKY-OBJ', #Sky fiber source
'SCI-OBJ', #Science fiber source
'AGITSTA', #Agitator status
]
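A minimal sketch of what a presence check over a list like this might look like. The actual `L0_header_keywords_present_check()` in modules/quality_control/src/quality_control.py may differ in signature and behavior; the dict-based toy header below is purely illustrative.

```python
def l0_header_keywords_present_check(header, required_keywords):
    """Return (passed, missing) for a dict-like FITS header.

    Sketch only; the real DRP function may have a different interface.
    Reporting which keywords are missing (not just pass/fail) makes it
    easier to decide when an absence is acceptable, e.g. telescope
    keywords on calibration-only frames.
    """
    missing = [kw for kw in required_keywords if kw not in header]
    return len(missing) == 0, missing


# Toy header missing one required keyword.
header = {'DATE-BEG': '2023-08-29T18:00:00', 'EXPTIME': 60.0}
passed, missing = l0_header_keywords_present_check(header, ['DATE-BEG', 'EXPTIME', 'OBJECT'])
# passed is False; missing == ['OBJECT']
```

In practice the `essential_keywords` list above would be passed as `required_keywords`, perhaps with a reduced list for calibration frames.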
Here's a simple test for the clocking issue that leaves ~1% of 2D images "smeared" out: a chi^2 comparison against a uniform distribution for a vertical column cut of the data. Normal frames should be highly skewed, with a large number of pixels at very low flux (between the orders), whereas the affected frames have more even distributions of light because the CCD reads out while still recording photons.
Where hdu is an L0 file and q is one of ['GREEN_AMP1', 'GREEN_AMP2', 'RED_AMP1', 'RED_AMP2']:
import numpy as np

data = np.sum(hdu[q].data, axis=1)  # sum each row (vertical column cut)
O, bin_edges = np.histogram(data)   # observed counts per flux bin
nbins = len(O)                      # note: np.histogram returns nbins+1 bin edges
E = len(data) / nbins               # expected counts per bin if uniform
chi2 = np.sum((O - E)**2 / E)
chi2 /= len(data)                   # normalize by the number of rows
Then chi2 < 1 should pick out these frames. Just from spot-checking several examples, affected frames tend to be in the 0.1-0.8 range while good frames are in the 2.7-3.5 range.
I like this metric, Ryan. I used it to examine the fast/regular etalon dataset that was recently taken. These two histograms show the distribution of Ryan's chi^2 metric when doing the summation over the vertical axis (axis=1) versus the horizontal axis (axis=0). Since there are gaps between the etalon peaks, there are lots of low-flux pixels within each order (unlike for stars), making the histogram's departure from uniform less distinct for etalons.
The bad etalon frames are easy to pick out because the RVs are outliers. I will continue to explore some more edge cases such as very faint stellar exposures that may not pass this test, but have okay data quality.
The first plot, with the blue peak near 3, should be axis=0; the plot with the blue peak near 5.5 is axis=1.
Thanks for this, @rrubenza and @howardisaacson! How does the metric compare for all of the standard calibrations and a range of SNR spectra (including very red spectra)?
Keeping this open for the discussion about identifying 'stuck-at-start' frames. @rrubenza @howardisaacson The other parts of this issue have been addressed.
@howardisaacson -- here's the tutorial on developing a QC method: https://kpf-pipeline.readthedocs.io/en/docs-pipeline-info/develop/start.html
This is an open thread to discuss the implementation of quality control checks. To start, let's detect: