Open GoogleCodeExporter opened 9 years ago
> Hi Rob, Iain
>
> A while ago it was requested that the L2 modify the following WAV* keys. I'm
> a bit at a loss at what to change and what to change it to.
>
> WAVCENT
> WAVDISP
> WAVRESOL
> WAVSHORT
> WAVLONG
> WAVSET
> WAVCAL
> WAVERR
>
> These will all be different on the product extensions (not primary) from your
> L1 settings.
>
> WAVCENT - central wavelength presumably? should this just be set to the newly
> calibrated wavelength at half the distance across the dispersion axis?
> WAVDISP - should I set this to my new rebinned figure?
> WAVSHORT - rebinned lowest wavelength?
> WAVLONG - rebinned highest wavelength?
> WAVSET - = WAVCENT?
> WAVCAL - the frame used?
> WAVERR - I don't know how to reasonably estimate this in an autonomous
> fashion..
>
> Thoughts?
These are our internal wavelength keywords and are only intended as an
approximate guide. The full WCS which you create is of course the exact
specification. In a sense they only have any real meaning in the uncalibrated
data. Strictly speaking we cannot write a correct WCS into the raw/L1 image
before they have been aligned and scrunched. That's why we have these vaguer
keywords.
We have two options on how to handle them:
i) set them all correctly in each extension, in which case most of your guessed
definitions are correct
ii) include them in the primary extension only and delete them from all the
others.
I am currently inclined towards ii. They are only really useful to us since
third party software will not read them. They are used by the data archive
which looks only at the primary extension, so I think if you write them in any
other headers they will never be looked at. In the proper calibrated extensions
they are superseded by the full WCS and no longer relevant? (That is unless
Iain has some plan for them that I am unaware of?)
In which case, my suggestion would be to correct the values in the primary
extension and not include them in the others. The values only need to be
approximate and I think you can calculate what the values should be from your
arc fit. I would just take the central fibre and derive the values for that. Of
course the answer is different for each fibre, but only by a few Angstroms.
First few all relate to the raw/L1 data and how they were observed:
WAVCENT - Yes. Wavelength for the centre of the CCD. Angstrom.
WAVDISP - Dispersion. Angstrom/pixel. If we are applying only to the primary
then it needs to be for the original pixel size in the L1/raw image. You can
get an approx value from your arc fit.
WAVSHORT,WAVLONG - Again, assuming we only write into primary extension then it
only applies to the L1 image, so this is approx wavelengths for pixels 1 and
4096 (assuming an unbinned raw image).
WAVSET - I think I would leave this alone as it is in the raw/L1 file. It is
the requested setting so even if the real data are far off, it should stay as
what was asked for.
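As a rough illustration of the definitions above, the sketch below derives the
approximate values from one fibre's arc fit. It assumes the fit is available as
polynomial coefficients giving wavelength in Angstrom as a function of
1-indexed pixel number. The function name, the coefficient ordering and the
example numbers are all hypothetical and not taken from the pipeline.

```python
def approx_wav_keywords(coeffs, npix=4096):
    """Approximate WAV* values for one fibre (e.g. the central one).

    coeffs: hypothetical arc-fit polynomial, lowest order first, so that
            wavelength(pixel) = sum(c * pixel**i), pixel 1-indexed, Angstrom.
    npix:   length of the dispersion axis (4096 assumes an unbinned raw image).
    """
    def wavelength(pixel):
        return sum(c * pixel ** i for i, c in enumerate(coeffs))

    first, last = wavelength(1), wavelength(npix)
    centre = wavelength((1 + npix) / 2.0)
    return {
        "WAVCENT": centre,                          # Angstrom
        "WAVDISP": abs(last - first) / (npix - 1),  # mean Angstrom/pixel
        "WAVSHORT": min(first, last),               # Angstrom
        "WAVLONG": max(first, last),                # Angstrom
    }

# Made-up linear fit: 4000 Angstrom at pixel 0, 0.5 Angstrom/pixel.
keys = approx_wav_keywords([4000.0, 0.5])
```

WAVSET is deliberately left out, since it should stay at the requested setting.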
These last couple could arguably be written into all extensions? They seem to
have more relevance to calibrated than L1 data?
WAVCAL - Yes. The name of the arc file I guess. This is a bit vague. I am not
really sure what it is to be used for, but you may as well stash that
information.
WAVERR - How to measure. Aha. Now there is a big question. This is the big
remaining question we have had for a long time. Several times I have found
files in which the calibration had clearly gone wrong and we currently have no
way to detect or flag it automatically. Let me make a cup of tea and I will
then try to write a bit more on this....
Original comment by robbarns...@gmail.com
on 1 Nov 2011 at 9:42
> WAVERR - I don't know how to reasonably estimate this in an autonomous
fashion..
> >
> > Thoughts?
Right. I've not been drinking tea all that time. Honest!
What we do with this seems to depend on which headers we write it into.
If we only write in primary:
In this case I think it is fairly easy. We want an estimate of how far off are
the values you have written. It is possible that you got the fits totally
wrong, but that is not really relevant. That's a complete failure and the
"error estimate" is something different. Presumably (?) the biggest deviation
from the WAV values is not the error in your fits at all, but rather the real
physical variation in fibres since this is before you align and scrunch fibres.
How about a value based on the RMS between your 144 fits? That might be the RMS
value of your 144 "delay" shifts or the RMS in your best estimate of the
wavelength at pixel 4096 (assuming unbinned) or some combination of them both.
You would know better than me, but I would guess that the largest deviation is
the fibre to fibre delay shifts which is a few pixels?
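A minimal sketch of the RMS idea, assuming you already have one number per
fibre (for example the fitted wavelength at the last pixel, or the "delay"
shift). The function name and the three example values are made up; the real
case would have 144 entries.

```python
import math

def rms_scatter(values):
    """RMS deviation of per-fibre values about their mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Made-up wavelengths (Angstrom) at the last pixel for three fibres.
waverr = rms_scatter([6047.0, 6048.0, 6049.0])
```

If the delay shifts are in pixels, they would need multiplying by the
dispersion before being compared with a wavelength-based value.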
If we write this into all extensions:
The primary extension is as above. In the others which are after calibration
this becomes a proper error estimate. As I said earlier, that's something we
have been sorely missing anyway. As with all error estimates, it is down to you
to figure out where you think the significant sources are. All this probably
needs to be done back at the fitting stage and is not trivial.
What are the likely largest causes of errors?
* Accuracy of centroiding the arc lines themselves. Should be very good.
* Flexibility of the fit between arc lines. Presumably this can be derived
analytically during the fitting process. The fewer the lines, the larger the
possible divergence is going to be.
* Since they are all supposed to be aligned in the final data product, you
could centroid a strong arc line in the linearized arc frame. In theory the RMS
in the centroids ought to be zero, but we know from past experiments that when
the fit gets unstable, the arc lines jiggle about.
* Instead of a per-frame derived value, you could just get an estimate of our
overall reliability and write it as a constant.
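The arc-line centroid check in the third bullet could look something like the
sketch below. It assumes each fibre's linearized arc spectrum is a plain list
of background-subtracted counts and that the pixel window around the chosen
line is known in advance. The names and the toy spectra are hypothetical, and
an intensity-weighted centroid is just one simple choice of estimator.

```python
import math

def line_centroid(spectrum, lo, hi):
    """Intensity-weighted centroid (0-indexed pixel) of spectrum[lo:hi]."""
    window = spectrum[lo:hi]
    total = sum(window)
    return sum((lo + i) * flux for i, flux in enumerate(window)) / total

def centroid_rms(spectra, lo, hi):
    """RMS scatter, in pixels, of one arc line's centroid across fibres."""
    cents = [line_centroid(s, lo, hi) for s in spectra]
    mean = sum(cents) / len(cents)
    return math.sqrt(sum((c - mean) ** 2 for c in cents) / len(cents))

# Two toy fibres in which the same line sits one pixel apart.
toy = [[0, 1, 4, 1, 0], [0, 0, 1, 4, 1]]
scatter = centroid_rms(toy, 0, 5)
```

For perfectly aligned fibres the result is zero, so a value well above the
centroiding noise would point to an unstable fit.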
Anyway, whatever method you use to estimate the errors:
* Estimating errors is something we have not yet done convincingly in the
pipeline and needs to be better understood irrespective of WAVERR
* I do not mean to belittle its importance and error analysis may be something
your thesis examiners decide to lay into in great detail (I don't know!) but
given that it is something we have not addressed yet, it does not seem to me
that it should hold up software deployment just so you can write something in
WAVERR. I would be inclined to not write anything until such time as you
understand it well enough to actually write something robust and useful.
* If written into the WCS calibrated extensions there may be a better
standardized WCS way of expressing the error than using our made up WAV
keywords.
Original comment by robbarns...@gmail.com
on 1 Nov 2011 at 9:43
On 05/04/2011 10:06 PM, Robert Smith wrote:
>> WAVERR - I don't know how to reasonably estimate this in an autonomous
fashion..
>>
>> Thoughts?
> Right. I've not been drinking tea all that time. Honest!
>
> What we do with this seems to depend on which headers we write it into.
>
> If we only write in primary:
> In this case I think it is fairly easy. We want an estimate of how far off
are the values you have written. It is possible that you got the fits totally
wrong, but that is not really relevant. That's a complete failure and the
"error estimate" is something different. Presumably (?) the biggest deviation
from the WAV values is not the error in your fits at all, but rather the real
physical variation in fibres since this is before you align and scrunch fibres.
How about a value based on the RMS between your 144 fits? That might be the RMS
value of your 144 "delay" shifts or the RMS in your best estimate of the
wavelength at pixel 4096 (assuming unbinned) or some combination of them both.
You would know better than me, but I would guess that the largest deviation is
the fibre to fibre delay shifts which is a few pixels?
>
> If we write this into all extensions:
> The primary extension is as above. In the others which are after calibration
this becomes a proper error estimate. As I said earlier, that's something we
have been sorely missing anyway. As with all error estimates, it is down to you
to figure out where you think the significant sources are. All this probably
needs to be done back at the fitting stage and is not trivial.
>
> What are the likely largest causes of errors?
> * Accuracy of centroiding the arc lines themselves. Should be very good.
> * Flexibility of the fit between arc lines. Presumably this can be derived
analytically during the fitting process. The fewer the lines, the larger the
possible divergence is going to be.
> * Since they are all supposed to be aligned in the final data product, you
could centroid a strong arc line in the linearized arc frame. In theory the RMS
in the centroids ought to be zero, but we know from past experiments that when
the fit gets unstable, the arc lines jiggle about.
> * Instead of a per-frame derived value, you could just get an estimate of our
overall reliability and write it as a constant.
>
> Anyway, whatever method you use to estimate the errors:
> * Estimating errors is something we have not yet done convincingly in the
pipeline and needs to be better understood irrespective of WAVERR
> * I do not mean to belittle its importance and error analysis may be
something your thesis examiners decide to lay into in great detail (I don't
know!) but given that it is something we have not addressed yet, it does not
seem to me that it should hold up software deployment just so you can write
something in WAVERR. I would be inclined to not write anything until such time
as you understand it well enough to actually write something robust and useful.
> * If written into the WCS calibrated extensions there may be a better
standardized WCS way of expressing the error than using our made up WAV
keywords.
Following my logic from the previous email, I'd suggest setting WAVERR in the
L1 image to a predefined constant, and setting it in the L1 since it only
really relates to the approximate fitting error. I can create my own "L2"
prefixed key, e.g. L2WAVERR, and store the culmination of errors from L2
calibration in that value. As you suggested, I will consider this more after
the release.
So briefly, do you see any problems with the following?
WAVCENT
WAVDISP
WAVRESOL
WAVSHORT
WAVLONG
WAVSET
WAVERR
All set before L2.
WAVCAL removed.
WAVERR set to some appropriate constant relating to the approximate fits.
L2 doesn't touch these at all in primary, but removes them all in further
extensions and replaces them with equivalent keys including:
L2WAVERR
L2ARC
L2WAVDIS
L2WAVCEN
pertaining to the L2 calibrations only.
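To make the proposal concrete, here is a small sketch of the header handling.
It models each header as a plain dict rather than using a real FITS library,
so the function name, the data layout and the example values are assumptions
for illustration only.

```python
WAV_KEYS = ("WAVCENT", "WAVDISP", "WAVRESOL", "WAVSHORT",
            "WAVLONG", "WAVSET", "WAVCAL", "WAVERR")

def apply_l2_keys(headers, l2_values):
    """Return new headers with WAV* swapped for L2* keys in the extensions.

    headers:   list of dicts; headers[0] stands in for the primary header.
    l2_values: dict of L2* keys to write into every non-primary header.
    """
    result = [dict(headers[0])]  # primary header is copied unchanged
    for hdr in headers[1:]:
        new = {k: v for k, v in hdr.items() if k not in WAV_KEYS}
        new.update(l2_values)
        result.append(new)
    return result

# Made-up example: one primary header and one product extension.
example = [
    {"WAVCENT": 5000.0, "WAVERR": 3.0},
    {"WAVCENT": 5000.0, "WAVCAL": "arc.fits", "NAXIS": 2},
]
updated = apply_l2_keys(example, {"L2ARC": "arc.fits"})
```

Any WAVCAL still present in an extension is dropped along with the other WAV*
keys, and the primary header passes through untouched.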
Original comment by robbarns...@gmail.com
on 1 Nov 2011 at 9:43
>
> I think I'm more inclined towards ii) also but this leaves me with a few
questions!
>
> I currently haven't been updating the WCS in the primary HDU with my L2
calibrated fit. I can very easily do this, but opted not to. I see the L2
extensions as a sequential series of reductions, and by writing the WCS
obtained at a later stage into the primary HDU, I thought it was compromising
the integrity of the data.
I follow your logic but I don't think I would worry about it. We have the
original files backed up should we want to return to them. Second, this change
would not modify any actual data. It is only correcting metadata which was
originally written wrongly (or at least guessed at).
> I'd argue that WAVCAL really shouldn't be there at all in the L1 image
headers. If this data is to be kept wouldn't it be better if I wrote an additional
key for this e.g. L2ARC in only the extensions using a L2 calculated WCS fit? I
can do this for any other data you wanted retaining, such as the L2 equivalent
of WAVCENT and WAVDISP, but prefixing them with an "L2" extension.
I agree with that. I'm not sure what the plans for WAVCAL were. I suspect it
was (at least in part) just Chris making guesses at what might be worth storing
and I am sure at the time he did that, we had not decided on this multi
extension strategy.
RJS
Original comment by robbarns...@gmail.com
on 1 Nov 2011 at 9:44
I think I prefer the WAV* keywords in the primary header to be correct. They
are the ones folk are most likely to look at and currently the ones used by the
data archive, though that could be changed.
I agree with your statement that the earlier extensions should not describe the
later reduction products, but I don't think that is what would happen if you
were to update the keywords. You are simply using information derived later in
the analysis process to go back and correct values which were previously only
guessed at. By updating the WAV* keywords you are only making them correctly
describe the data to which they are a header.
I don't think there is any confusion or ambiguity since these will be *_2.fits
files. They are clearly after processing.
> > Following my logic from the previous email, I'd suggest setting WAVERR in
the L1 image to a predefined constant, and setting it in the L1 since it only
really relates to the approximate fitting error. I can create my own "L2"
prefixed key, e.g. L2WAVERR, and store the culmination of errors from L2
calibration in that value. As you suggested, I will consider this more after
the release.
> >
> > So briefly, do you see any problems with the following?
> >
> > WAVCENT
> > WAVDISP
> > WAVRESOL
> > WAVSHORT
> > WAVLONG
> > WAVSET
> > WAVERR
> >
> > All set before L2.
> >
> > WAVCAL removed.
> > WAVERR set to some appropriate constant relating to the approximate fits.
That is all fine except that it would be my vote to update the values to the
best available information, not leave them as written by the ICS.
> > L2 doesn't touch these at all in primary, but removes them all in further
extensions and replaces them with equivalent keys including:
> >
> > L2WAVERR
> > L2ARC
> > L2WAVDIS
> > L2WAVCEN
> > pertaining to the L2 calibrations only.
L2ARC is a good idea.
L2WAVERR is a good idea, but we ought to also look into whether there is a
proper WCS standard format. I'll do that.
The other two are probably of questionable use. They certainly do no harm if you
want to include them, but don't they just replicate the WCS headers?
Original comment by robbarns...@gmail.com
on 1 Nov 2011 at 9:44
On 5 May 2011, at 17:04, Robert Smith wrote:
>>
>> I think I'm more inclined towards ii) also but this leaves me with a few
questions!
>>
>> I currently haven't been updating the WCS in the primary HDU with my L2
calibrated fit. I can very easily do this, but opted not to. I see the L2
extensions as a sequential series of reductions, and by writing the WCS
obtained at a later stage into the primary HDU, I thought it was compromising
the integrity of the data.
>
> I follow your logic but I don't think I would worry about it. We have the
original files backed up should we want to return to them. Second, this change
would not modify any actual data. It is only correcting metadata which was
originally written wrongly (or at least guessed at).
>
>> I'd argue that WAVCAL really shouldn't be there at all in the L1 image
headers. If this data is to be kept wouldn't it be better if I wrote an additional
key for this e.g. L2ARC in only the extensions using a L2 calculated WCS fit? I
can do this for any other data you wanted retaining, such as the L2 equivalent
of WAVCENT and WAVDISP, but prefixing them with an "L2" extension.
>
> I agree with that. I'm not sure what the plans for WAVCAL were. I suspect it
was (at least in part) just Chris making guesses at what might be worth storing
and I am sure at the time he did that, we had not decided on this multi
extension strategy.
>
> RJS
>
>
This is outlined in Fault log comment number 21 of bug 1279...
http://telescope.livjm.ac.uk/Fault/Bugzilla/show_bug.cgi?id=1279
Original comment by robbarns...@gmail.com
on 1 Nov 2011 at 9:45
Original issue reported on code.google.com by
robbarns...@gmail.com
on 1 Nov 2011 at 9:42