Keck-DataReductionPipelines / KPF-Pipeline

KPF-Pipeline
https://kpf-pipeline.readthedocs.io/en/latest/

Accurate DATE-MID in master files #850

Closed awhoward closed 2 months ago

awhoward commented 2 months ago

This is mostly a question/task for @RussLaher.

We need accurate timestamps in the masters so that we can drift-correct the wavelength solutions between the morning and evening masters. Currently, the timestamps do not appear to be accurate.

Here is an example of the morning and evening masters having the same timestamps. I also saw this behavior on other days that I examined in 2023.

shrek% fitsheader -e 0 /data/kpf/masters/20230607/kpf_20230607_master_arclamp_autocal-lfc-all-eve_L1.fits | grep DATE-MID
DATE-MID= '2023-06-07T02:30:29.441' / Halfway point of the exposure, unweighted 
shrek% fitsheader -e 0 /data/kpf/masters/20230607/kpf_20230607_master_arclamp_autocal-lfc-all-morn_L1.fits | grep DATE-MID
DATE-MID= '2023-06-07T02:30:29.441' / Halfway point of the exposure, unweighted 

I examined some more recent masters (from March 2024), and the DATE-MID values are more sensible.

shrek% fitsheader -e 0 /data/kpf/masters/20240307/kpf_20240307_master_arclamp_autocal-lfc-all-morn_L1.fits | grep DATE-MID
DATE-MID= '2024-03-07T18:33:07.804' / Halfway point of the exposure, unweighted 
shrek% fitsheader -e 0 /data/kpf/masters/20240307/kpf_20240307_master_arclamp_autocal-lfc-all-eve_L1.fits | grep DATE-MID
DATE-MID= '2024-03-07T03:05:44.211' / Halfway point of the exposure, unweighted 
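As a quick sanity check on the values quoted above, the gap between the evening and morning DATE-MID strings can be computed directly. This is only a sketch using the Python standard library; the function name `date_mid_gap_seconds` is hypothetical and not part of the KPF DRP, and it assumes both strings are in the same time scale with no time-zone suffix.

```python
from datetime import datetime


def date_mid_gap_seconds(eve: str, morn: str) -> float:
    """Return (morn - eve) in seconds for two DATE-MID strings.

    Assumes ISO 8601 strings without a zone suffix, as in the headers above.
    """
    t_eve = datetime.fromisoformat(eve)
    t_morn = datetime.fromisoformat(morn)
    return (t_morn - t_eve).total_seconds()


# DATE-MID values copied from the fitsheader output above (eve, morn).
gap_2023 = date_mid_gap_seconds("2023-06-07T02:30:29.441", "2023-06-07T02:30:29.441")
gap_2024 = date_mid_gap_seconds("2024-03-07T03:05:44.211", "2024-03-07T18:33:07.804")

print(gap_2023)  # 0.0 -> the 20230607 eve and morn masters share one timestamp
print(gap_2024)  # 55643.593 -> about 15.5 hours apart for 20240307
```

A gap of exactly zero between two masters built from different calibration sequences is the symptom reported here.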

Is this something that was fixed recently and the 2023 masters were just run with an old version of the DRP?

After some discussion with @bjfultn and @shalverson, we agreed that DATE-MID is the keyword that we would like to use to keep track of accurate times in the masters.

How is the average value of DATE-MID currently computed? A mean or median value seems like a sensible solution. (Also note that the FITS comment for DATE-MID, "Halfway point of the exposure, unweighted", is incorrect for masters that are composed from multiple exposures.)

@bjfultn and @shalverson, please comment if I forgot something from our discussion.

RussLaher commented 2 months ago

In the corresponding 2D master file (kpf_20240307_master_arclamp_autocal-lfc-all-morn.fits), in each GREEN_CCD, etc. extension, there is:

MINMJD = 60376.772659 / Minimum MJD of arclamp observations

MAXMJD = 60376.777688 / Maximum MJD of arclamp observations

The average of these would give the mid-point of the coadd. DATE-MID should be ignored since its value is representative of one of the frames in the coadd.
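To make the size of the discrepancy concrete, here is a small sketch that averages the MINMJD and MAXMJD values quoted above and compares the result with the DATE-MID from the matching L1 master. It uses only the standard library, treats MJD as a plain day count from 1858-11-17T00:00:00 with no leap-second or time-scale handling (an assumption), and compares a GREEN_CCD extension value against the L1 primary-header DATE-MID. Variable names are illustrative.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0.0

# Values quoted above for kpf_20240307_master_arclamp_autocal-lfc-all-morn
min_mjd = 60376.772659
max_mjd = 60376.777688
date_mid = "2024-03-07T18:33:07.804"

# Midpoint of the coadd and the time spanned by its input frames
mid = 0.5 * (min_mjd + max_mjd)
span_s = (max_mjd - min_mjd) * 86400.0

# Convert DATE-MID to MJD so the two can be compared directly
date_mid_mjd = (datetime.fromisoformat(date_mid) - MJD_EPOCH) / timedelta(days=1)
offset_s = (mid - date_mid_mjd) * 86400.0

print(f"{mid:.7f}")       # 60376.7751735
print(f"{span_s:.1f}")    # 434.5 (seconds spanned by the coadd)
print(f"{offset_s:.1f}")  # 187.2 (seconds by which the midpoint trails DATE-MID)
```

For this stack the existing DATE-MID sits roughly three minutes before the midpoint, which is consistent with it describing a single early frame rather than the coadd.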

I take it you would like in the L1 file (kpf_20240307_master_arclamp_autocal-lfc-all-morn_L1.fits) to have the DATE-MID co-opted to be the halfway point of the coadd, correct?

DATE-MID= '2024-03-07T18:33:07.804' / Halfway point of the coadd

But DATE-MID is already defined as something else (Halfway point of the exposure, unweighted) and overloading a keyword with another meaning is bad.

Also, I checked the coadd code, and the halfway point depends on the chip. Would it be acceptable to ignore DATE-MID and instead have AVGMJD in each chip extension?

awhoward commented 2 months ago

Okay, this is complicated. I'd say that the first priority is that we want to have at least one clear representation of the mid-time. Your suggestion is to use some variation on JD. This could be written to a keyword (AVGMJD?). I like DATE-MID because it's human-readable, but any time standard consistently implemented should be fine for this purpose. Personally, having DATE-MID serve the dual purpose of being in the middle of the exposure and the middle of the exposure stack seems natural to me. I understand your point though.

The second issue is that incorrect times in the header are confusing. Let's consider how to handle this issue (without creating an excessive amount of work). A starting point is to have documentation about the file formats for the master files. I'll add a GitHub Issue about this with a brief outline of where the descriptions should go (to parallel the descriptions of L0/2D/L1/L2 files). Let me know if there's documentation about this already.

Another possibility to avoid confusion is to strip out all keywords except for a defined subset that applies to masters. Alternatively, the values of these keywords could be set to empty strings if the keywords need to stay there for the data model. Thoughts on this?

@shalverson might comment, but my intuition is that the timing precision required for master stacks is not as significant as for other parts of the DRP. We are not computing barycentric corrections for master files (the usual reason for needing high timing precision). This means that a single mid-time for the stack should be fine, and the details of whether it is a weighted mean or median of the stack elements aren't that important.

RussLaher commented 2 months ago

Then I will do the following:

  1. Strip out all optional keywords from the PRIMARY header. We can repopulate it later with the necessary informative keywords, once we decide what they are, and they need to be populated methodically (not simply copied from the primary header of a representative individual frame in the coadd).
  2. Write both MIDMJD and DATE-MID to each relevant chip extension (same date/times in different formats), where MIDMJD = 0.5 * (MINMJD + MAXMJD). I am not strictly against repurposing DATE-MID as long as you approve it.

RussLaher commented 2 months ago

Here is an example from the GREEN_CCD extension of kpf_20240423_master_flat.fits:

EXTNAME = 'GREEN_CCD' / extension name
...
MINMJD = 60423.000322 / Minimum MJD of flat observations
MAXMJD = 60423.999378 / Maximum MJD of flat observations
MIDMJD = 60423.49985 / Middle MJD of flat observations
DATE-MID= '2024-04-23T11:59:47.040Z' / Middle timestamp of flat observations
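The header values in this example can be cross-checked with a short sketch that applies MIDMJD = 0.5 * (MINMJD + MAXMJD) and converts the result to an ISO timestamp. This is not the pipeline's code: `mid_mjd` and `mjd_to_iso` are hypothetical helpers, and the conversion assumes MJD is a plain day count from 1858-11-17T00:00:00 with no leap-second handling. The DRP may well use a dedicated time library instead.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0.0


def mid_mjd(min_mjd: float, max_mjd: float) -> float:
    """Midpoint of the coadd: MIDMJD = 0.5 * (MINMJD + MAXMJD)."""
    return 0.5 * (min_mjd + max_mjd)


def mjd_to_iso(mjd: float) -> str:
    """Format an MJD as an ISO 8601 string, rounded to the nearest millisecond."""
    ms = round(mjd * 86_400_000)  # whole milliseconds since the MJD epoch
    t = MJD_EPOCH + timedelta(milliseconds=ms)
    return t.isoformat(timespec="milliseconds") + "Z"


# MINMJD and MAXMJD from the GREEN_CCD extension of kpf_20240423_master_flat.fits
mid = mid_mjd(60423.000322, 60423.999378)

print(f"{mid:.5f}")     # 60423.49985, matching MIDMJD in the header
print(mjd_to_iso(mid))  # 2024-04-23T11:59:47.040Z, matching DATE-MID
```

Rounding to whole milliseconds before formatting avoids a value such as 47.0399999 s being truncated to 47.039 by floating-point error.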