GEUS-Glaciology-and-Climate / pypromice

Process AWS data from L0 (raw logger) through Lx (end user)
https://pypromice.readthedocs.io
GNU General Public License v2.0

Level 3 monthly resampling timestamp is ambiguous #103

Open PennyHow opened 1 year ago

PennyHow commented 1 year ago

We use pandas.DataFrame.resample for resampling our Level 3 hourly product to daily and monthly products.

https://github.com/GEUS-Glaciology-and-Climate/pypromice/blob/89fe4e44b274a04b775b343e617f2144a375e03c/src/pypromice/aws.py#L783

It appears that the resampling for the monthly product produces ambiguous timestamps. This is an example from CEN1:

time       | p_u      | t_u      | rh_u    | rh_u_cor ...
2017-07-31 | 804.4458 | -2.8221  | 90.0261 | 92.4869 ...
2017-08-31 | 799.1244 | -7.9172  | 88.4559 | 94.814 ...
2017-09-30 | 793.0522 | -16.8246 | 84.3948 | 98.4254 ...
2017-10-31 | 789.4513 | -21.8854 | 81.7442 | 99.7256 ...
2017-11-30 | 792.0762 | -30.5804 | 74.9108 | 99.7344 ...

Each value appears to be averaged over the month that follows the stated date (i.e. 2017-07-31 is an average from 2017-07-31 to 2017-08-31). Ideally, though, we would like the resampling to be anchored on the first of each month. Then we can produce timestamps that look more like this:

time       | p_u      | t_u      | rh_u    | rh_u_cor ...
2017-08-01 | 804.4458 | -2.8221  | 90.0261 | 92.4869 ...
2017-09-01 | 799.1244 | -7.9172  | 88.4559 | 94.814 ...
2017-10-01 | 793.0522 | -16.8246 | 84.3948 | 98.4254 ...
2017-11-01 | 789.4513 | -21.8854 | 81.7442 | 99.7256 ...
2017-12-01 | 792.0762 | -30.5804 | 74.9108 | 99.7344 ...
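
As a rough sketch of how month-start labels could be produced, the snippet below resamples a synthetic hourly series with a month-begin offset. The data are made up (each sample simply stores its own calendar month number, under the borrowed column name t_u), and the variable names are illustrative, not taken from pypromice. Offset objects are used instead of string aliases only because the aliases differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic hourly index standing in for a Level 3 hourly product (assumption).
idx = pd.date_range("2017-07-01", "2017-09-30 23:00", freq=pd.Timedelta(hours=1))

# Each value is the calendar month of its own timestamp, so a monthly mean
# reveals which calendar month a given label actually covers.
hourly = pd.DataFrame({"t_u": np.asarray(idx.month, dtype=float)}, index=idx)
hourly.index.name = "time"

# Month-end labels (what a month-end resample gives by default).
month_end = hourly.resample(pd.offsets.MonthEnd()).mean()

# Month-start labels: each bin is labelled with the first day of its month.
month_start = hourly.resample(pd.offsets.MonthBegin()).mean()

print(month_end)
print(month_start)
```

Because the values encode the month, the printed means show directly which interval each label stands for. On this synthetic series the month-end label carries the mean of its own calendar month, so it may be worth running the same check on the CEN1 data before deciding how the labels should shift.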

Or, to make it even less ambiguous, maybe something like this:

time    | p_u      | t_u      | rh_u    | rh_u_cor ...
2017-08 | 804.4458 | -2.8221  | 90.0261 | 92.4869 ...
2017-09 | 799.1244 | -7.9172  | 88.4559 | 94.814 ...
2017-10 | 793.0522 | -16.8246 | 84.3948 | 98.4254 ...
2017-11 | 789.4513 | -21.8854 | 81.7442 | 99.7256 ...
2017-12 | 792.0762 | -30.5804 | 74.9108 | 99.7344 ...
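
One possible way to get year-month labels like these is to group the hourly data by monthly period rather than by a timestamp. This is only a sketch on made-up data (a constant p_u of 800.0); the variable names are illustrative and not part of pypromice.

```python
import pandas as pd

# Synthetic hourly series with a constant value (assumption, for illustration).
idx = pd.date_range("2017-08-01", "2017-12-31 23:00", freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"p_u": 800.0}, index=idx)

# Group by calendar month; the resulting index is a PeriodIndex whose
# entries represent whole months rather than single instants.
monthly = hourly.groupby(hourly.index.to_period("M")).mean()
monthly.index.name = "time"

# String form of the labels, e.g. for writing to a text file.
labels = monthly.index.astype(str).tolist()
print(labels)
print(monthly)
```

A period label names the whole month, so it avoids the question of whether a date marks the start or the end of the averaging window. The trade-off is that downstream readers expecting full dates would need to parse the year-month form.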