In the hourly files, need to pass for every hour the last 10 min value as 'var_i' to replace instantaneous values previously in tx files

BaptisteVandecrux commented 2 months ago

For older logger programs, instantaneous values were only transmitted and not saved in the logger file, because they are the same as the 10 minute averages, which are saved in the logger files.

When trimming the tx files, these transmitted instantaneous values were also trimmed, and are therefore missing from the new AWS data files. If we want to continue having (all) the instantaneous data in the hourly files, we need to extract the 10 min values at the end of each hour (this needs to be checked) and assign these values to the corresponding instantaneous variables.

Here's an illustration for CP1: CP1_16

PennyHow commented 2 months ago

I think we decided to pass the 10-minute raw data, as it is identical to the instantaneous values. We have not yet implemented this in the bufr re-processing though. Is this correct, @ladsmund?

In the case of CP1 above, we should have the corresponding raw data, therefore we will use the 10-minute raw data.

BaptisteVandecrux commented 2 months ago

> We have not yet implemented this in the bufr re-processing though.

Some users might also be interested in hourly instantaneous values in the level 3 files on dataverse and THREDDS, so it should be addressed within L0toL1 or L1toL2.

PennyHow commented 2 months ago

@ladsmund and I just had a discussion about this. In order to include the hourly instantaneous values in the Level 3 files, we would need to either:

Process all Level 0 tx files --> This will drastically slow down operational processing
Incorporate 10-minute raw data into the instantaneous variables at L0toL1 or L1toL2--> This alters the definition of what the instantaneous variables are, which might become confusing

So we had another idea: we distribute all instantaneous values (i.e. 10-minute raw data AND hourly tx instantaneous values) as a separate Level 3 instantaneous data product. This would be beneficial because:

@ladsmund could perform BUFR re-processing from this product
Enables total transparency to DMI and other users of the instantaneous values
Provides a clear difference between instantaneous values and the averaged values (i.e. the current Level 3 product)

I'm not sure where we would implement this at the moment. I don't think it needs to be operational on an hourly level. But perhaps a weekly or monthly routine that we call to make/update a Level 3 instantaneous product.

BaptisteVandecrux commented 2 months ago

I agree that in the future, splitting the data into different files (one for core + derived data, one for instantaneous data, one for quality flags...) might be the way to go.

> Incorporate 10-minute raw data into the instantaneous variables at L0toL1 or L1toL2--> This alters the definition of what the instantaneous variables are, which might become confusing

In a way, that is already what the logger program does:

it calculates 10 min averages every 10 min
every round hour, it takes the last 10 min value and places it under the <var>_i variable in the 60 min table

So, as an intermediate solution, we could have a function that does the same in pypromice.
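A minimal sketch of what such a function could look like, using pandas. This is not the pypromice implementation: the function name `add_instantaneous_from_10min`, its arguments, and the example data are hypothetical. It also assumes that each 10 min record is stamped at the end of its averaging interval, so that the record stamped on the round hour is the "last 10 min value" of that hour (this still needs to be checked against the logger files).

```python
import pandas as pd


def add_instantaneous_from_10min(df_10min, variables, suffix="_i"):
    """Hypothetical helper: build hourly <var>_i columns from 10 min data.

    Assumes df_10min has a DatetimeIndex and that the record stamped
    HH:00 holds the average of the 10 minutes ending at HH:00.
    """
    idx = df_10min.index
    # Keep only records stamped exactly on a round hour
    on_the_hour = (idx.minute == 0) & (idx.second == 0)
    out = df_10min.loc[on_the_hour, variables].copy()
    # Rename e.g. t_u -> t_u_i; the real naming scheme may differ
    out.columns = [v + suffix for v in variables]
    return out


# Made-up example: two hours of 10 min data for one variable
index = pd.date_range("2023-01-01 00:00", periods=13, freq="10min")
ten_min = pd.DataFrame({"t_u": [float(i) for i in range(13)]}, index=index)

inst = add_instantaneous_from_10min(ten_min, ["t_u"])
print(inst)
```

With this made-up input, only the records at 00:00, 01:00 and 02:00 are kept, one per round hour. How the suffix maps onto the existing variable names (for example `t_u` to `t_i`) is left open here.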

BaptisteVandecrux commented 1 month ago

Some important info related to #300:

for each hourly timestamp, the value of t_u is the average of the following hour, whereas the value of t_i corresponds to the average of the last 10 minutes
When fetching instantaneous values from the 10 minute file, the upper boom is always used. This causes slight changes for some GC-Net stations at which, during the first years, instantaneous values were taken from the lower boom. Now all AWS should use the upper boom for transmitted instantaneous values.
Instantaneous data transmission is relatively recent. Now that we extract 10 min data as "instantaneous" for all round hours, there are many years where new values for t_i, wspd_i, etc. are now available.
In a similar way, instantaneous values have sometimes been transmitted daily, sometimes 6-hourly, and more recently, hourly. Now every round hour having a 10 min value will have an instantaneous value. See illustration below where orange is new instantaneous values extracted from 10 minute files while the blue triangles are the 6h transmissions previously used:
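To make the timestamp convention in the first point concrete (this is separate from the illustration mentioned above), here is a small pandas sketch with made-up numbers. It is only an illustration under assumptions, not the pypromice code: it assumes 10 min records are stamped at the end of their averaging interval, and the `closed`/`label` choices are one way to reproduce "average of the following hour".

```python
import pandas as pd

# Made-up 10 min series: values 0, 1, 2, ... from 00:00 to 03:00
idx = pd.date_range("2023-01-01 00:00", "2023-01-01 03:00", freq="10min")
ten_min = pd.DataFrame({"t_u": [float(i) for i in range(len(idx))]}, index=idx)

# Hourly average labelled with the START of the hour: the value at 01:00
# averages the records stamped 01:10 ... 02:00 (assumed end-of-interval stamps)
hourly_mean = ten_min["t_u"].resample("60min", closed="right", label="left").mean()

# "Instantaneous" value: the 10 min record stamped on the round hour,
# i.e. the average of the 10 minutes BEFORE that timestamp
t_i = ten_min.loc[ten_min.index.minute == 0, "t_u"].rename("t_i")

hourly = pd.concat([hourly_mean, t_i], axis=1)
print(hourly)
```

At the 01:00 timestamp, `t_u` describes 01:00 to 02:00 while `t_i` describes 00:50 to 01:00, so the two columns on the same row do not cover the same period.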

BaptisteVandecrux commented 1 month ago

fixed in main in #302 and #304

GEUS-Glaciology-and-Climate / pypromice

In the hourly files, need to pass for every hour the last 10 min value as 'var_i' to replace instantaneous values previously in tx files #285