Closed PennyHow closed 7 months ago
Nice catch!
My main concern is that good data could be filtered out before we have a chance to fix it or, more worryingly, before it is converted to the units stated in `variable.csv`. `usr` and `dsr` are an example of this.
Since there is no calibration or unit conversion on temperature, and I have never seen that variable need adjustment (unlike pressure, for example), it would be safest to clip only the temperature variables.
And as a note for later, we need to check how often the air temperature is missing while the SR50 is still working. If that happens frequently, we may want to gap-fill the air temperature before it is used to correct the SR50 measurements. But that is more of an enhancement.
I had spoken to @ladsmund about doing some form of interpolation for the temperature data so that we avoid losing valid SR50 measurements. If there is concern about clipping all variables too early in `L0toL1`, then we could make a routine that only clips temperature data. We could also add an interpolation routine, either now or later. I'll make these changes now.
I've moved the temperature correction to a separate routine. It is only used to clip and interpolate `t_u`/`t_l` for correcting `z_boom_u`/`z_boom_l`, so the clipped and interpolated temperature data is not carried forward in `t_u`/`t_l`. The clipping is essentially lifted from the `pypromice.process.value_clipping` routine, but only clips temperature data.
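To make the temperature-only clipping concrete, here is a minimal plain-Python sketch. It does not reproduce `pypromice.process.value_clipping`; the function name `clip_temperature`, the `TEMPERATURE_LIMITS` table and the limit values are assumptions for illustration only (the real limits come from the variables table).

```python
import math

# Hypothetical limits in degrees C, for illustration only.
TEMPERATURE_LIMITS = {"t_u": (-50.0, 40.0), "t_l": (-50.0, 40.0)}


def clip_temperature(values, lo, hi):
    """Return a copy of `values` with readings outside [lo, hi] set to NaN.

    NaN inputs stay NaN because every comparison with NaN is False.
    """
    return [v if lo <= v <= hi else math.nan for v in values]


# Only the temperature variables are touched; everything else is left alone.
data = {
    "t_u": [-10.0, -75.3, 5.0],
    "usr": [120.0, 130.0, 125.0],  # not clipped at this stage
}
clipped = {
    name: clip_temperature(vals, *TEMPERATURE_LIMITS[name])
    if name in TEMPERATURE_LIMITS
    else vals
    for name, vals in data.items()
}
```

Writing the result to a separate mapping mirrors the point above: the original `t_u`/`t_l` values are left as they are for the later QC steps.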
I think it is a good idea to do the interpolation in order to preserve as many valid `z_boom_u`/`z_boom_l` values as possible. I've chosen a maximum gap of 12 hours to interpolate across. If you think this should be smaller or bigger, I am open to suggestions. I chose a temporal window rather than a number of time steps to account for the different temporal resolutions of the `raw` (10 minutes) and `tx` (hourly) data.
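The reason for a time-based window can be sketched as follows. This is a plain-Python illustration, not the routine in the PR: the name `interpolate_short_gaps` is hypothetical, and it assumes a "gap" is measured between the valid samples on either side of a run of missing values.

```python
import math
from datetime import datetime, timedelta


def interpolate_short_gaps(times, values, max_gap=timedelta(hours=12)):
    """Linearly fill NaN runs whose bounding valid samples are at most
    `max_gap` apart. Longer gaps and leading/trailing NaNs are left as NaN."""
    filled = list(values)
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    for left, right in zip(valid, valid[1:]):
        if right - left < 2:
            continue  # neighbouring valid samples, nothing to fill
        span = times[right] - times[left]
        if span > max_gap:
            continue  # gap too long to trust an interpolated temperature
        for i in range(left + 1, right):
            frac = (times[i] - times[left]) / span  # timedelta / timedelta -> float
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled


# The same call works for hourly and 10-minute data because the limit is a
# duration, not a count of time steps.
t0 = datetime(2023, 1, 1)
hourly = [t0 + timedelta(hours=h) for h in range(3)]
print(interpolate_short_gaps(hourly, [-10.0, math.nan, -14.0]))
```

With a step-count limit, the same number would mean very different durations for `raw` and `tx` data, which is what the temporal window avoids.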
The output is a `t_u_interp`/`t_l_interp` variable, which is retained through the processing routine but dropped at the end when making the netcdf/csv output file. Therefore we can inspect it if needed.
Bug found where `z_boom_u`/`z_boom_l` measurements jump where `t_u`/`t_l` measurements are below -50. This is because of an error in the lufft sensor, which reports bad readings below -50. Here is an example with station HUM:
I found that a simple solution is to run the range thresholding QC at the beginning of `pypromice.process.L0toL1`. Right now, this is only done at the end of `pypromice.process.L0toL1`, AFTER `z_boom_u`/`z_boom_l` is corrected with `t_u`/`t_l`. Performing the range threshold at the beginning instead removes errors in the temperature data before they are applied to correct the boom height.
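The effect of the ordering can be illustrated with a small sketch. It assumes the boom height correction follows the usual SR50 speed-of-sound scaling, `z * sqrt((T + 273.15) / 273.15)`. The function names, the limits and the bad reading of -150 are made up for illustration and are not taken from the pypromice code or the HUM data.

```python
import math

T_0 = 273.15  # 0 degrees C in kelvin


def correct_boom_height(z_raw, t_air):
    """Scale raw SR50 distances by sqrt(T_K / T_0). A NaN temperature
    propagates to a NaN height rather than a wrong one."""
    return [z * math.sqrt((t + T_0) / T_0) for z, t in zip(z_raw, t_air)]


def clip(values, lo, hi):
    """Set readings outside [lo, hi] to NaN."""
    return [v if lo <= v <= hi else math.nan for v in values]


z_raw = [2.60, 2.60, 2.60]
t_u = [-20.0, -150.0, -21.0]  # middle value: made-up sensor error

# Thresholding only at the end: the bad temperature has already been used,
# so the corrected height jumps while still looking like a plausible value.
late = correct_boom_height(z_raw, t_u)

# Thresholding first: the bad temperature becomes NaN, and so does the
# corrected height for that time step.
early = correct_boom_height(z_raw, clip(t_u, -50.0, 40.0))
```

A jump like the one in `late` can pass a range check on the boom height itself, which is why clipping the temperature first is the safer order.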