geminiplanetimager / gpi_pipeline

Gemini Planet Imager Data Reduction Pipeline

Proper handling of exposure time sum for combined images #60

Closed dsavransky closed 8 years ago

dsavransky commented 8 years ago

If you combine multiple frames in some manner, we should be recording the summed exposure time somewhere in the FITS header. This is useful both for science data and for calibration data (e.g. a flat field that is the sum of 10 x 60 s flats is almost certainly a better flat than an individual 60 s flat). Right now this is getting recorded in FITS HISTORY, but is not being written as a numeric FITS keyword.

I discussed this in person with Fredrik. He concurs that it's reasonable to update the ITIME keyword to reflect the total integration time in a given output data file. Therefore, any routines which combine multiple images should update the ITIME keyword accordingly with the sum of all individual ITIMEs.

(Of course, this doesn't apply to darks, because we need to match those based on the original integration time)
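A minimal sketch of the rule described above, in Python rather than the pipeline's own language. Plain dicts stand in for FITS headers, and the function name `update_combined_itime` and its `is_dark` flag are hypothetical, not part of the pipeline. Rejecting darks with mixed ITIMEs is also an assumption made here for illustration.

```python
def update_combined_itime(headers, is_dark=False):
    """Return a header dict for a combined image.

    `headers` is a list of plain dicts standing in for FITS headers
    (an assumption for this sketch); each must carry an ITIME value.
    """
    if not headers:
        raise ValueError("need at least one input header")
    itimes = [float(h["ITIME"]) for h in headers]
    out = dict(headers[0])  # start from a copy of the first input header
    if is_dark:
        # Darks are matched on the original integration time, so ITIME
        # is left as the single per-frame value (assumed identical).
        if len(set(itimes)) != 1:
            raise ValueError("darks must share a single ITIME")
        out["ITIME"] = itimes[0]
    else:
        # Any other combination records the total integration time.
        out["ITIME"] = sum(itimes)
    return out


flats = [{"ITIME": 60.0} for _ in range(10)]
combined_flat = update_combined_itime(flats)              # ITIME becomes 600.0
combined_dark = update_combined_itime(flats, is_dark=True)  # ITIME stays 60.0
```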

dsavransky commented 8 years ago

Recovering rm journals for issue:

marois: And what do we do with the flux? Should the flux represent the new ITIME? As an example, if you do a median of 10 images, do we multiply the flux by 10 or not?

mperrin: Oh, good question. How do you think it should work? I always prefer to work in units of "counts/second" precisely because it avoids complications with different ITIMEs. But right now we're not writing outputs in those units, we're staying in the input unit of "ADU per coadd". Hmm.

marois: We could move to counts/s and just add in the ITIME, but then we lose some information about how long each image was. We could use ADU/coadd and just change the coadd number, but then what happens if we want to add images with various ITIMEs? I do not see an easy way out of this besides (1) changing the ITIME and losing some information or (2) adding keywords.

mperrin: Agreed, there is no perfect solution.

Gemini does not have a standard answer to this (because of course right now they don't distribute any reduced data products at all).

The standard adopted for HST, Spitzer, JWST, etc is that all their data products are distributed in flux per time units (either counts/second, electrons/second, MJy/sr, or similar units). I think this is the cleanest approach, because it gets closer to the physics that we really care about.

For instance, consider the case when we're looking at a given science target on two different nights, but we happen to choose different integration times (say, 45 s one time, 60 s the other). Wouldn't it make the most sense to have all the lenslet values be in counts/second so we can compare the files directly?
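To illustrate the comparison, here is a small Python sketch with made-up numbers. The source rate of 2 counts/second and the helper `to_counts_per_second` are assumptions for the example only; real lenslet values would also carry noise.

```python
def to_counts_per_second(pixels, itime):
    """Divide each pixel value (counts) by the integration time (seconds)."""
    return [p / itime for p in pixels]


rate = 2.0                      # hypothetical source rate, counts/second
night1 = [rate * 45.0] * 3      # three lenslet values from a 45 s exposure
night2 = [rate * 60.0] * 3      # the same lenslets from a 60 s exposure

# In raw counts the two nights differ (90 vs. 120 per lenslet)...
raw_match = night1 == night2

# ...but in counts/second they can be compared directly.
cps1 = to_counts_per_second(night1, 45.0)
cps2 = to_counts_per_second(night2, 60.0)
rate_match = cps1 == cps2
```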

Here's my suggestion. We should always write out combined files in units of counts/second. That way, it makes sense to sum all the ITIMEs, and if you are medianing or summing data, you get similar physical units either way. If you want to get back to the individual integration times, in the simple case where there is just one unique ITIME, you can get it from the ITIME0 keyword. In the more complicated case of multiple ITIMEs, you will have to look at the FITS history, but how often do you think that will be needed? I don't think that will be needed very often.
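A rough Python sketch of this suggestion, under stated assumptions: frames are plain lists of pixel values in counts, the output header is a plain dict, and the function `combine_in_counts_per_second` is hypothetical. ITIME0 is treated here simply as "the one unique per-frame integration time, in seconds", following the description above; how the pipeline actually defines that keyword is not assumed.

```python
from statistics import median


def combine_in_counts_per_second(frames, method="median"):
    """Combine frames after converting each one to counts/second.

    `frames` is a list of (pixels, itime) pairs, where `pixels` is a list
    of values in counts and `itime` is that frame's integration time in s.
    """
    if not frames:
        raise ValueError("need at least one frame")
    # Normalize every frame by its own ITIME first.
    rates = [[p / itime for p in pixels] for pixels, itime in frames]
    columns = list(zip(*rates))  # per-pixel stacks across frames
    if method == "median":
        combined = [median(col) for col in columns]
    elif method == "mean":
        combined = [sum(col) / len(col) for col in columns]
    else:
        raise ValueError("method must be 'median' or 'mean'")
    itimes = [itime for _, itime in frames]
    header = {"ITIME": sum(itimes)}  # total integration time
    if len(set(itimes)) == 1:
        # Simple case: a single unique per-frame integration time.
        header["ITIME0"] = itimes[0]
    return combined, header


same = [([120.0, 60.0], 60.0)] * 10
med, med_hdr = combine_in_counts_per_second(same, "median")
avg, avg_hdr = combine_in_counts_per_second(same, "mean")
mixed = [([90.0, 45.0], 45.0), ([120.0, 60.0], 60.0)]
mix, mix_hdr = combine_in_counts_per_second(mixed, "mean")
```

With equal ITIMEs, the mean of the per-frame rates equals the summed counts divided by the summed ITIME, which is why a median and a sum end up in the same units here.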

Bear in mind that the GPItv viewer can easily toggle between units of counts vs. counts/second vs. several other kinds of units, independently of what the file is actually written in.

mperrin: Can I get some comments/opinions on the following:

"The standard adopted for HST, Spitzer, JWST, etc is that all their data products are distributed in flux per time units (either counts/second, electrons/second, MJy/sr, or similar units). I think this is the cleanest approach, because it gets closer to the physics that we really care about.

For instance, consider the case when we're looking at a given science target on two different nights, but we happen to choose different integration times (say, 45 s one time, 60 s the other). Wouldn't it make the most sense to have all the lenslet values be in counts/second so we can compare the files directly?

Here's my suggestion. We should always write out combined files in units of counts/second. That way, it makes sense to sum all the ITIMEs, and if you are medianing or summing data, you get similar physical units either way. If you want to get back to the individual integration times, in the simple case where there is just one unique ITIME, you can get it from the ITIME0 keyword. In the more complicated case of multiple ITIMEs, you will have to look at the FITS history, but how often do you think that will be needed? I don't think that will be needed very often."

ingraham: this is stale and probably not going to be addressed.

The combination of images is generally done using the pyklip routines etc., and is not handled by the pipeline.

Anyone against rejecting it?

ingraham: No one spoke up. Rejecting this.