GEUS-Glaciology-and-Climate / pypromice

Process AWS data from L0 (raw logger) through Lx (end user)
https://pypromice.readthedocs.io
GNU General Public License v2.0

Need to remove daily transmissions from hourly data, or add a field indicating the timestep duration or bounds #244

Open BaptisteVandecrux opened 1 month ago

BaptisteVandecrux commented 1 month ago

Raw data from loggers have quite often been lost or corrupted. For those periods, transmitted data are used instead. However, transmissions used to be daily in winter, so daily values have been inserted as hourly values at many sites.

Here is an example from KAN_U, where all of the non-empty lines are daily averages found in the KAN_U_hour.csv file:

2011-03-22 22:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-22 23:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-23 00:00:00,800.0,-37.52,66.39,95.567,0.1259,6.562,129.6002,5.0561,-4.1828,179.8978,0.0,135.8603,0.0,,124.7662,169.0805,0.5448,-38.9946,0.8544,31.9487,1.4489,0.6641,,,,,,-18.03,-15.54,-16.72,-15.12,-12.91,-10.01,-9.29,-8.18,1.2293,-1.4273,,67.000269,-47.019392,1842.0,55.6,,0.96,13.8,131.6,,-37.28,,,,,,,,
2011-03-23 01:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...
2011-03-23 23:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-24 00:00:00,808.0,-34.92,67.9,95.3489,0.1653,9.54,107.4997,9.0985,-2.8687,188.3305,0.0,139.2675,0.0,,129.8552,177.5869,0.4749,-36.0914,1.0826,35.8358,1.4466,0.6519,,,,,,-18.18,-15.7,-16.81,-15.05,-12.94,-10.03,-9.31,-8.19,1.2293,-1.4273,,67.000241,-47.019418,1839.0,40.8,,0.99,13.73,130.1,,-34.93,,,,,,,,
2011-03-24 01:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...
2011-03-24 22:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-24 23:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-25 00:00:00,803.0,-19.16,79.95,96.3603,0.8349,15.78,132.0001,11.7268,-10.5589,147.8705,0.0,128.9608,0.0,,191.2945,231.452,0.5085,-20.0451,4.9209,40.3835,1.403,0.6615,,,,,,-18.27,-15.82,-16.94,-15.22,-13.03,-10.09,-9.33,-8.2,1.2293,-1.4273,,67.000206,-47.019256,1843.0,40.8,,1.29,13.73,137.7,,-19.34,,,,,,,,
2011-03-25 01:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...
2011-03-25 23:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-26 00:00:00,808.0,-7.73,85.9,92.5955,2.2639,10.34,163.5001,2.9367,-9.9142,100.5111,0.0,89.0119,0.0,,266.794,276.0827,0.9102,-8.9231,5.3005,36.5443,1.3672,0.6693,,,,,,-18.37,-15.96,-17.13,-15.41,-13.11,-10.13,-9.37,-8.22,1.2293,-1.4273,,67.000195,-47.019263,1846.0,40.8,,1.65,13.83,129.0,,-7.58,,,,,,,,
2011-03-26 01:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...
2011-03-26 23:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-27 00:00:00,808.0,-11.61,72.8,81.5026,1.4125,6.959,138.3996,4.6203,-5.2039,167.0358,0.0,133.816,0.0,,196.0347,253.6582,0.2105,-14.0743,1.6127,52.4767,1.365,0.6683,,,,,,-18.26,-15.99,-17.13,-15.5,-13.18,-10.18,-9.4,-8.24,1.2293,-1.4273,,67.000247,-47.019239,1852.0,40.6,,1.66,13.78,128.7,,-11.32,,,,,,,,
2011-03-27 01:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...
2011-03-27 23:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-28 00:00:00,801.0,-11.86,80.0,89.7822,1.5347,6.422,163.9001,1.7809,-6.1701,171.0392,0.0,132.368,0.0,,197.3682,253.3392,0.2388,-14.1682,8.2197,45.4432,1.3536,0.6621,,,,,,-17.97,-15.85,-17.02,-15.53,-13.25,-10.23,-9.44,-8.24,1.2293,-1.4273,,67.000278,-47.019405,1840.0,40.8,,1.34,13.73,129.9,,-11.39,,,,,,,,
2011-03-28 01:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2011-03-28 02:00:00,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
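
Rows like the ones above could be located with a simple heuristic. The following is a minimal sketch, not pypromice code: the function name `flag_isolated_midnight_rows` is hypothetical, and it assumes the hourly file has been loaded into a pandas DataFrame with a DatetimeIndex on whole hours. It flags a row when it holds data at 00:00 while the rows one hour before and one hour after are completely empty, which is the pattern visible in the KAN_U excerpt.

```python
import pandas as pd


def flag_isolated_midnight_rows(df: pd.DataFrame) -> pd.Series:
    """Mark rows suspected to be daily values stored in an hourly table.

    Heuristic (an assumption for illustration, not pypromice logic):
    a row is flagged when it has data at 00:00 and the rows one hour
    before and one hour after contain no data at all.
    """
    has_data = df.notna().any(axis=1)
    # Put the rows on a strict hourly grid so "neighbour" means +/- 1 hour,
    # even if some timestamps are missing from the file.
    grid = pd.date_range(df.index.min(), df.index.max(), freq=pd.Timedelta(hours=1))
    hourly = has_data.reindex(grid, fill_value=False)
    prev_empty = ~hourly.shift(1, fill_value=False)
    next_empty = ~hourly.shift(-1, fill_value=False)
    at_midnight = hourly.index.hour == 0
    flagged = hourly & prev_empty & next_empty & at_midnight
    # Return the flags aligned with the original rows.
    return flagged.reindex(df.index)
```

A real hourly record that happens to be surrounded by gaps would also be flagged, so the result is best treated as a list of candidates to review rather than a definitive classification.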

This also causes problems, because those daily values are filtered out when they are interpreted as hourly values: https://github.com/GEUS-Glaciology-and-Climate/pypromice/issues/243

So either: 1) daily values should only appear in the daily file, and the corresponding period in the hourly file should be left blank, or 2) duration or time bound fields could be added to indicate that there are mixed time steps.

I believe data users do not always read the whole documentation, and will use an "hourly" file as hourly values without looking for more details. So I'd opt for option 1.
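
To make the two options concrete, here is a minimal sketch assuming the hourly data are in a pandas DataFrame and that a boolean Series `is_daily` already marks the rows that came from daily transmissions. The function names and the `timestep_hours` column are hypothetical and are not part of pypromice. Because the issue does not state which 24-hour window a 00:00 daily value covers, the second function records only a duration, not time bounds.

```python
import pandas as pd


def blank_daily_rows(hourly: pd.DataFrame, is_daily: pd.Series) -> pd.DataFrame:
    """Option 1: leave the hourly table blank where a daily value was inserted."""
    out = hourly.copy()
    out.loc[is_daily] = float("nan")
    return out


def add_timestep_column(hourly: pd.DataFrame, is_daily: pd.Series) -> pd.DataFrame:
    """Option 2: keep the values and record each row's averaging period in hours."""
    out = hourly.copy()
    # "timestep_hours" is a hypothetical column name used for illustration.
    out["timestep_hours"] = is_daily.map({True: 24, False: 1})
    return out
```

With option 1 the hourly file stays purely hourly and the daily values would have to be written to the daily file separately. With option 2 every user of the hourly file has to check the extra column before treating a row as an hourly value.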