Closed carloshorn closed 4 years ago
Next step is the user interaction. I would add another section into the pygac config which gives the possibility to add the path to a user defined calibration.json
file.
I think it would be good if the user could give a reduced json file which is used to update the default coefficients, or do you think that this makes it harder to track the source?
I would of cause log any changed coefficient.
Being able to just override a small subset of coefficients would be a nice feature! Users focused on traceability can still download an entire set of coefficients via DOI or so.
I'm ok with the config section. Can you please also make this an argument of Reader.__init__
? Then it can be used in satpy like so: https://github.com/pytroll/satpy/blob/master/satpy/readers/avhrr_l1b_gaclac.py#L130
Good idea @sfinkens. Regarding the user friendliness, we should also talk about the coefficient names. These names should be more self-explanatory. Suggestions are welcome! I will try to come up with a list to start the iteration.
Regarding the coeffs names, I would also vote for having a documentation page explaining each coeff and the formulas they are applied in.
And again a feature missing in python 2.... In ancient times, the ConfigParser.get
method did not know fallback values....
With the last commits, the user can provide different default coefficients via the config file in the section "calibration" as option "coeffs_file", e.g.
[calibration]
coeffs_file = /path/to/user/default/coeffs.json
If coeffs_file is not given, it will fallback to the pygac defaults in data/calibration.json
. Furthermore, it is possible to overwrite the defaults by passing custom coefficients directly to the calibrate_solar
, and calibrate_thermal
functions.
This covers the three use cases:
So far the implementation of this feature... Now, we should decide for a good naming convention of these parameters. How about using the naming convention from https://github.com/rsgb-neuhaus/viscal_db/blob/master/csv/header.patmos.csv? This would also avoid list type objects in the json file.
Great job. Thanks a lot @carloshorn for this adaptation.
Nice work @carloshorn ! The viscal_db
names look good to me.
Having in mind that I wanted to avoid list types in the json to give every item a specific name, I came up with the following naming convention, which makes use of the hierarchical json structure.
"metopb": {
"channel_1": {
"dark_count": 39.7,
"gain_high_s0": 0.166,
"gain_high_s1": 2.019,
"gain_high_s2": -0.201,
"gain_low_s0": 0.055,
"gain_low_s1": 2.019,
"gain_low_s2": -0.201,
"gain_switch": 501.12
},
"channel_2": {
"dark_count": 40.0,
"gain_high_s0": 0.183,
"gain_high_s1": 1.476,
"gain_high_s2": -0.137,
"gain_low_s0": 0.061,
"gain_low_s1": 1.476,
"gain_low_s2": -0.137,
"gain_switch": 500.82
},
"channel_3a": {
"dark_count": 40.3,
"gain_high_s0": 0.201,
"gain_high_s1": 1.748,
"gain_high_s2": -0.033,
"gain_low_s0": 0.029,
"gain_low_s1": 1.748,
"gain_low_s2": -0.033,
"gain_switch": 501.32
},
"channel_3b": {
"centroid_wavenumber": 2664.3384,
"radiance_correction_c0": 0.0,
"radiance_correction_c1": 1.0,
"radiance_correction_c2": 0.0,
"space_radiance": 0.0,
"to_eff_blackbody_intercept": 0.9970158319134996,
"to_eff_blackbody_slope": 1.7711318
},
"channel_4": {
"centroid_wavenumber": 933.71521,
"radiance_correction_c0": 5.44,
"radiance_correction_c1": 0.89848,
"radiance_correction_c2": 0.00046964,
"space_radiance": -4.98,
"to_eff_blackbody_intercept": 0.9986240957209157,
"to_eff_blackbody_slope": 0.51860807
},
"channel_5": {
"centroid_wavenumber": 839.72764,
"radiance_correction_c0": 3.84,
"radiance_correction_c1": 0.93751,
"radiance_correction_c2": 0.00025239,
"space_radiance": -3.4,
"to_eff_blackbody_intercept": 0.9988311677674785,
"to_eff_blackbody_slope": 0.40059787
},
"date_of_launch": 2012.77,
"thermometer_1": {
"c0": 276.6194,
"c1": 0.050919,
"c2": 1.471e-06,
"c3": 0.0,
"c4": 0.0
},
"thermometer_2": {
"c0": 276.6511,
"c1": 0.050892,
"c2": 1.489e-06,
"c3": 0.0,
"c4": 0.0
},
"thermometer_3": {
"c0": 276.6597,
"c1": 0.050845,
"c2": 1.521e-06,
"c3": 0.0,
"c4": 0.0
},
"thermometer_4": {
"c0": 276.3685,
"c1": 0.050992,
"c2": 1.482e-06,
"c3": 0.0,
"c4": 0.0
}
}
Even if some coefficients have the same prefix, I think that two levels provide a good readability. The only thing that I would like to change is the launch date format. A fraction of year is not very intuitive. I would prefer a classical timestamp like YYYY-mm-dd
. What do you think?
Edit: Maybe someone can come up with a better name for to_eff_blackbody, which should point out that this coefficients should be used for the transformation from thermometer temperature to effective black body temperature. as defined in (7.1.2.4-3) KLM Users Guide.
Looks good! for the thermometer coeffs, why not skip c3
and c4
as they don't seem to be relevant ?
I would prefer to keep c3
and c4
, even if always set to zero, because it is equation (7.1.2.4-1) in the KLM guide which defines them. They are used in the code in the following lines
https://github.com/pytroll/pygac/blob/2d7fbafbf03ae59d14b1fffdd33154358faee139/pygac/calibration.py#L495-L499
If we default them to zero in the code, I would not see the point of having a default value file.
Good point, I agree, thank you for explaining this :)
I will start the implementation of the new structure tomorrow. I would also like to put more comment lines in the calibration functions and maybe rename some inner variables, but it should neither chnage the results, nor should it change the interface (function argument names). Are you okay with adding it into this PR?
Go ahead.
Even comment lines can break python 2.7....
@helgaweb, thank you very much for the comments. Especially for giving the source to the averaging method that we use.
At all, I would like to propose a change in the coefficients that we are using. In my opinion, we do not need to distinguish between high and low gain slope coefficients like PATMOS-x does, because there is a very simple conversion between both as shown in the following lines:
While writing the documentation on the solar calibration, I went again through the paper of Heidinger et al 2010. In the appendix, they describe how to convert dual gain counts to single gain counts.
Based on casual communication with ITT engineers, the dual gain operation was accomplished for channels 1 and 2 by decreasing the calibration slope by 50% in the low end and increasing it by 150% in the end [...]
After a short google search, I found the following information in the book "Remote Sensing Time Series: Revealing Land Surface Dynamics"
Another change in design of the AVHRR/3 instrument was the introduction of a dual-gain feature for the reflective channels 1, 2 and 3A. In order to improve the radiometric resolution of the instrument for low reflectance targets, the dynamic range of the instrument was divided equally in two ranges, i.e. nominally from 0 to 500 counts and from 500 to 1,000 counts. For channels 1 and 2 half of the available Digital Number (DN) range is assigned to the low albedo range from 0 to 25% with the other half to the high albedo range from 26 to 100%. This allows for an increase in the radiometric resolution for dark targets. For channel 3A, the split between low and high albedo range is set at 12.5% albedo (Rao and Sullivan 2001)
Therefore, the equations (A1) and (A2) from Heidinger et al 2010 can be written as
where for channel 1, 2 and for channel 3A.
If we insert this into equation (1) in Heidinger et al 2010
we see, that the first dark count D (note the typo in the paper C_d=D) in equation (A1) vanishes and we get a similar equation
Note that S is the single gain slope. The dual-gain slopes are simply defined by
Looking at equation (6) in Heidinger et al 2010
we can see, that only is affected by the multiplication with to get high and low slope parameters. The other ones remain the same as you can verify in all PATMOS-x calibration files. To recover the single slope parameter from the dual gain parameters, we simply need to divide by . You can also assure yourself from the PATMPOS-x data that (+/- rounding error at last digit)
Example coefficients for NOAA15
0.060 -0.069 0.002 ! ch1 low gain S0,S1,S2
0.179 -0.069 0.002 ! ch1 high gain S0,S1,S2
0.069 0.339 -0.010 ! ch2 low gain S0,S1,S2
0.206 0.339 -0.010 ! ch2 high gain S0,S1,S2
0.025 0.000 0.000 ! ch3a low gain S0,S1,S2
0.175 0.000 0.000 ! ch3a low gain S0,S1,S2
I would propose to keep only single-gain slope parameters and a boolean flag called something like dual-gain
. That would be sufficient to calibrate the visible channels.
Furthermore, I would like to revert to the original definition of b1 == radiance_correction_c1
(linear term in the non-linear correction for channel 4 and 5). This improve the traceability of the code, see #62.
@abhaydd any comments here?
Note, that the last commit may change results, because it corrects a typo in the NOAA-7 coefficients The correct value for channel 1 is 'bh' == 'bl' == 3.792
according to the PATMPS-x coefficents.
https://github.com/pytroll/pygac/blob/2d7fbafbf03ae59d14b1fffdd33154358faee139/pygac/calibration.py#L126-L127
I would propose to keep only single-gain slope parameters and a boolean flag called something like
dual-gain
. That would be sufficient to calibrate the visible channels.
I noticed, that we do not need the extra boolean flag, because we are already setting "gain_switch": null
in the json file in case of single-gain calibration.
Summary of the current status:
For the visible calibration, we use dark_count
, gain_switch
, and single-gain slope coefficients s0, s1, s2
for each channel. In case of a single-gain instrument, the value of gain_switch
is set to null
. The detector ageing is calculated relative to the date_of_launch
parameter, which is now set as a timestamp instead of a float in the unit of years anno Domini. Note, that the conversion from float to timestamp reveals that these numbers do not necessarily match the actual date of launch as known from other sources, e.g. Metop-B was launched on the 17th of September 2012, but the float given in the PATMOS-x file 2012.77
corresponds to the timestamp 2012-10-08 19:40:48
.
For the thermal calibration, we use centroid_wavenumber, radiance_correction_c0, radiance_correction_c1, radiance_correction_c2, space_radiance, to_eff_blackbody_intercept, to_eff_blackbody_slope
for each channel. Note, that the radiance_correction_c1
is defined like the coefficients for the non-linear correction, that means not added by one like in the previous PyGAC coefficents. Furthermore, we use five coefficients c0, c1, c2, c3, c4
for each platinum resistance thermometer.
Hidden (not accounted for in the json file) parameters in our implementation are the prt_threshold
which is used to mask erroneous values for interpolation (set to 50), and two other thresholds for the in-orbit calibration target (ICT) counts and space counts (set to 100), and the window length (set to 51) used for the smoothing. On top there is also the valid range for brightness temperature (set from 170 to 350). If we do not think that anyone is interested in playing with these hidden parameters, I am fine with fixing them inside the code, but I wanted to mention them for awareness and completeness.
Please have a look at the following example parameters:
"metopb": {
"channel_1": {
"dark_count": 39.7,
"gain_switch": 501.12,
"s0": 0.11066666666666668,
"s1": 2.019,
"s2": -0.201
},
"channel_2": {
"dark_count": 40.0,
"gain_switch": 500.82,
"s0": 0.122,
"s1": 1.476,
"s2": -0.137
},
"channel_3a": {
"dark_count": 40.3,
"gain_switch": 501.32,
"s0": 0.11485714285714287,
"s1": 1.748,
"s2": -0.033
},
"channel_3b": {
"centroid_wavenumber": 2664.3384,
"radiance_correction_c0": 0.0,
"radiance_correction_c1": 0.0,
"radiance_correction_c2": 0.0,
"space_radiance": 0.0,
"to_eff_blackbody_intercept": 1.7711318,
"to_eff_blackbody_slope": 0.9970158319134996
},
"channel_4": {
"centroid_wavenumber": 933.71521,
"radiance_correction_c0": 5.44,
"radiance_correction_c1": -0.10152000000000005,
"radiance_correction_c2": 0.00046964,
"space_radiance": -4.98,
"to_eff_blackbody_intercept": 0.51860807,
"to_eff_blackbody_slope": 0.9986240957209157
},
"channel_5": {
"centroid_wavenumber": 839.72764,
"radiance_correction_c0": 3.84,
"radiance_correction_c1": -0.062490000000000046,
"radiance_correction_c2": 0.00025239,
"space_radiance": -3.4,
"to_eff_blackbody_intercept": 0.40059787,
"to_eff_blackbody_slope": 0.9988311677674785
},
"date_of_launch": "2012-10-08 19:40:48",
"thermometer_1": {
"c0": 276.6194,
"c1": 0.050919,
"c2": 1.471e-06,
"c3": 0.0,
"c4": 0.0
},
"thermometer_2": {
"c0": 276.6511,
"c1": 0.050892,
"c2": 1.489e-06,
"c3": 0.0,
"c4": 0.0
},
"thermometer_3": {
"c0": 276.6597,
"c1": 0.050845,
"c2": 1.521e-06,
"c3": 0.0,
"c4": 0.0
},
"thermometer_4": {
"c0": 276.3685,
"c1": 0.050992,
"c2": 1.482e-06,
"c3": 0.0,
"c4": 0.0
}
}
There is still the chance to change some names. I usually use c
for any kind of coefficient, but we could also use the variable name used in the corresponding equation, e.g. the thermometer parameters have the name d0, ..., d4
in equation (7.1.2.4-1) KLM Guide. Any suggestions?
@helgaweb also asked to add a version number of these coefficients inside the file. I guess you mean something like "version": "v2017r1"
for the current set of parameters? I do not know, if this is beneficial. Imagine that for example at some point we use thermal calibration parameters from another source, but keep the solar calibration as it is, which version tag should we use then? Maybe an entry at top level called description
with a plain text containing explanations and citations could do the job. Any suggestions?
@carloshorn great work so far! I haven't looked at the implementation yet, but from what you describe it looks great. A couple of things:
c...
Indeed. It was just an idea, to make this note in the json file to improve the traceability. I like the idea about the top-level description with the source of the default coeffs, however, there is also the pygac documentation Maybe I missed it, but a note in the calibration.py that the the current default time-dependent PATMOS-x coefficients (version: y2017rf; PATMOS-x User guide) are stored in the json.txt file, would be good.
Awesome @carloshorn! I'd also stick to the coeff names from the equations. And I think a traceable versioning would be very beneficial. So users can say: "I used pygac-1.2.3 with coefs v2017". What if we embed a list of "official" coefs in the code:
[{'md5sum': ..., 'version': 'v2017', 'DOI': 'DOI:1234/5678'},
{'md5sum': ..., 'version': 'v2015', 'DOI': None}]
At runtime we could check if the current file matches one of these versions and update the metadata accordingly. Otherwise, print a warning.
Many thanks for the input:
@mraspaud: I changed the date format to an iso format timestamp, e.g.
"2006-10-19T19:37:12.000000Z"
, changed fit coefficient names to the variable names in the corresponding equations, i.e. coefficients having an integer index like the thermometer coefficients "d0", ..., "d4"
, and introduced a md5 check of the json file.
Coefficients with a physical meaning keep a hopefully meaningful name, the only ones I still would like to change are the to_effective_blackbody_...
, unfortunately, their names in the equations are A
and B
, which are often used in the reverse transformation as well. Furthermore, I liked slope and intercept because it is more instructive for a linear equation. Any suggestions?
@helgaweb: I added a description to the json file. Could you have a quick look at it. The idea is to keep the description of what PyGAC expects inside the json file, because this file can be used as template for a custom coefficient file written by a user or for a new releases. The version is not included, because it would not be static and someone changing the file might not update it. The version info comes in a similar way like @sfinkens described.
I'm good with your proposal @carloshorn
Cleanup and finalisation plan: if we agree on the coefficients, I would remove the old2new
and new2old
functions and do the correct mapping inside the Calibrator
class. I used these functions to create the json file from the original coeffs
dictionary. I also used them to make sure that the results never changed by reproducing the original coefficients with the reverse call.
My question here: Should I keep all the rounding calls used in new2old
to exactly reproduce the original results? Note, that the results might have slightly changed for NOAA-7 in any case due to the corrected typo.
I introduced a deprecation warning in the solar calibration when using the correction factor argument corr
in order to keep the units of the output clear. I could make sure that the transformation from scaled radiance to isotropic radiance is done outside in the reader
class. The other possibility would be a change of interface by adding the sun-earth distance correction factor and solar-zenith-angle to the argument list and directly output isotropic radiance. Any preference?
@carloshorn Looks and sounds very good! Thanks! Any specific reason why you did not include the KLM User's guide in the method reference for the thermal calibration in the json file?
Any specific reason why you did not include the KLM User's guide in the method reference for the thermal calibration in the json file?
No... I simply forgot it...
Nice! I would prefer the optional correction for Sun-earth-distance in the reader
class. Makes the code cleaner.
I think it's good if the coefs are identical, so I'd say keep the rounding. I don't have an opinion about the scaled/isotropic radiance thing :)
I agree with keeping the rounding. Is the scaled radiance being output at the moment to hdf files?
Yes.
def calibrate_solar(counts, chan, year, jday, spacecraft, corr=1)
In pygac 1.2 (CLARA-A2 L1C data) the scaled radiance output was corrected for changing Sun-Earth-distance, but not anymore in the updated pygac Version for the test CLARA-A3 L1C data.
Last thing to do are tests:
corr
is used.Anything I forgot?
I implemented the tests into test_calibrate_klm.py
even if they are not klm specific. Are you okay with it, or should I move them into a separate calibration coefficient specific test case, something like test_calibration_coefficients.py
?
@sfinkens, you once mentioned that in order to use coefficients of older PyGAC versions, you cherry-picked them from previous commits. If wanted, you could tell me which commits you used, and which version tag I should give them, so I can produce json files for them and add them together with their md5 hash. I could also provide a tool to convert PATMOS-x coefficients into PyGAC coefficients so we can build internal version tags for all of them, but I guess this is beyond the scope of this PR.
@carloshorn That's very kind of you, but there is no need to do this for our past activities. That's frozen anyway. But for the future it will be very helpful!
Haven't looked at the tests in detail, buy your proposal sounds good. And a separate test case is a good choice in my opinion!
Great work @carloshorn ! I agree with @sfinkens for both points.
Alright, I hope that's it. In summary:
With this PR, the calibration coefficients will no longer be a static part of calibration.py
. The default values are stored in the external file data/calibration.json
, which can be used as template for user defined defaults. The user can add his/her defaults through the PyGAC configuration file, e.g.
[calibration]
coeffs_file = /path/to/user/default/coeffs.json
If the user passes an official PyGAC defaults file (currently there is only one), the code will recognise it. At runtime, it is possible to check the calibration file version using the class method Calibrator.get_version()
.
Furthermore, it is possible to change individual coefficients by passing the satellite specific changes to the reader
class using the keyword calibration
in the constructor, e.g.
my_calibration= {"channel_1": {"dark_count": 43.21}}
filepath = "/path/to/file"
Reader = get_reader_class(filepath)
reader = Reader(calibration=my_calibration)
reader.read(filepath)
In addition, the functions of calibration.py
will have more comment lines explaining each step and giving references to the applied methods, which should be beneficial for the readability and future development.
In favour of giving more control over the calibration coefficients to the user, this PR will export the coefficients from
calibration.py
to an external data file.Closes #62