danpower101 / crspy

crspy is a Python package for the processing and calibration of cosmic-ray neutron sensors.

Resampling MOD to the hour #1

Open danpower101 opened 4 years ago

danpower101 commented 4 years ago

If a reading is stored in the database with a timestamp of 11:12, its values cover the period 10:12 to 11:12. We want to adjust this so that the reading is assigned to 11:00, which makes matching to external products much easier.
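A minimal sketch of the adjustment described above, using only the standard library. The helper name `floor_to_hour` and the example date are hypothetical and not taken from crspy.

```python
from datetime import datetime


def floor_to_hour(ts: datetime) -> datetime:
    """Return ts with minutes, seconds and microseconds set to zero."""
    return ts.replace(minute=0, second=0, microsecond=0)


# Hypothetical reading stamped 11:12 (the date is arbitrary).
reading_time = datetime(2020, 1, 1, 11, 12)
hourly_time = floor_to_hour(reading_time)  # assigned to 11:00
```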

The problem is that errors occur. Occasionally two readings are given within the same hour, and both look reasonable and do not deviate from what we expect. Can we presume the reading has been provided twice in error? For example, a reading of 2,200 at 11:12 and one of 2,100 at 11:32. Here it can be presumed that each reading covers its own preceding hour, so we want the average of the two.

Sometimes the error is a summation error: a reading is given at 11:12 and another at 11:22, and the second is low (e.g. MOD = 57). One can presume that the second reading is accurate but covers a shorter interval than the usual hour. Here we would want the sum of the two.
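One possible way to tell the two cases apart is sketched below. This is not what crspy does; it is only an illustration. The function name `combine_hourly` and the cutoff `PARTIAL_FRACTION` are assumptions: when several readings fall in the same hour, a reading much smaller than the largest one is treated as a partial count and the values are summed, otherwise they are averaged.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical cutoff: a count below this fraction of the largest count in the
# same hour is treated as a partial interval. Not a value from crspy.
PARTIAL_FRACTION = 0.5


def combine_hourly(readings, partial_fraction=PARTIAL_FRACTION):
    """Group (timestamp, MOD) pairs by floored hour and combine duplicates.

    Hours with one reading keep it unchanged. Hours with several readings are
    summed if any reading looks like a partial count, otherwise averaged.
    """
    by_hour = defaultdict(list)
    for ts, mod in readings:
        by_hour[ts.replace(minute=0, second=0, microsecond=0)].append(mod)

    combined = {}
    for hour, mods in sorted(by_hour.items()):
        if len(mods) == 1:
            combined[hour] = mods[0]
        elif min(mods) < partial_fraction * max(mods):
            combined[hour] = sum(mods)   # summation case
        else:
            combined[hour] = mean(mods)  # duplicate case
    return combined


# Illustrative values only; the 12:xx pair is made up for this sketch.
readings = [
    (datetime(2020, 1, 1, 11, 12), 2200),
    (datetime(2020, 1, 1, 11, 32), 2100),
    (datetime(2020, 1, 1, 12, 12), 2143),
    (datetime(2020, 1, 1, 12, 22), 57),
]
hourly = combine_hourly(readings)
```

A fixed fraction is a crude test. A real implementation would probably need to compare against neighbouring hours as well, since a genuine drop in counts could otherwise be mistaken for a partial reading.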

Currently, timestamps are floored to the hour and duplicates are discarded. This may remove "good" data.
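To see how much data this removes, the discarded readings can be collected rather than silently dropped. The sketch below assumes that the first reading in each hour is the one kept; the issue does not say which duplicate survives, and `floor_and_drop_duplicates` is a hypothetical name.

```python
from datetime import datetime


def floor_and_drop_duplicates(readings):
    """Floor timestamps to the hour, keeping only the first reading per hour.

    Returns (kept, dropped) so the discarded readings can be inspected.
    Keeping the first reading is an assumption made for this sketch.
    """
    kept = {}
    dropped = []
    for ts, mod in readings:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        if hour in kept:
            dropped.append((ts, mod))
        else:
            kept[hour] = mod
    return kept, dropped


# The two readings from the example above (the date is arbitrary).
kept, dropped = floor_and_drop_duplicates([
    (datetime(2020, 1, 1, 11, 12), 2200),
    (datetime(2020, 1, 1, 11, 32), 2100),
])
```

Here the 11:32 reading ends up in `dropped`, even though it looks like valid data.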

In the future we need to decide which method is best: the one that removes as little data as possible while remaining the most accurate.

danpower101 commented 4 years ago

image

Example where the resample has averaged two values that should probably have been summed.