GEUS-Glaciology-and-Climate / pypromice

Process AWS data from L0 (raw logger) through Lx (end user)
https://pypromice.readthedocs.io
GNU General Public License v2.0

Percentile qc - test #162

Open RasmusBahbah opened 11 months ago

patrickjwright commented 10 months ago

@RasmusBahbah I left a few comments, mostly minor. Overall, I am curious what kind of testing and plotting was done to determine both the limit thresholds (the values in var_threshold) and the percentiles used as baselines for applying the limits (i.e. 0.005 and 0.995).
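
For readers following the thread, here is a minimal sketch of the kind of check being discussed: take the 0.005 and 0.995 quantiles of a series, widen that range by a per-variable limit, and flag anything outside it. The function name `flag_percentile_outliers`, the `limit` argument, the value 9.0 and the synthetic data are all hypothetical. This is not the code in the PR, and `var_threshold` there may be structured differently.

```python
import numpy as np
import pandas as pd


def flag_percentile_outliers(series, limit, lo=0.005, hi=0.995):
    """Return a boolean mask that is True where a value lies more than
    `limit` outside the [lo, hi] quantile range of the whole series."""
    lower = series.quantile(lo) - limit
    upper = series.quantile(hi) + limit
    return (series < lower) | (series > upper)


# Synthetic air temperatures (deg C) with one obvious spike (assumed data).
t_air = pd.Series(np.linspace(-20.0, 5.0, 1000))
t_air.iloc[500] = 60.0

# limit=9.0 is an illustrative value, not one taken from the PR.
mask = flag_percentile_outliers(t_air, limit=9.0)
cleaned = t_air.where(~mask)  # flagged values become NaN
```

The quantiles set the baseline and the limit sets the tolerance around it, so a larger limit makes the check more conservative.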

I remember in our initial testing, we were certainly finding many outliers that could be removed with the percentiles check, but we were also finding instances where good data would be removed. I assume there is no tolerance for removing any good data from the historical dataset, so were you able to do enough testing for each station to determine that the limits in this PR are conservative enough to preserve all good data?

Is air temperature the only variable that uses seasonal distributions? Do all other variables use the full dataset to calculate percentiles?

Due to my limited time for this review, I would highly encourage you to get a review from @PennyHow as well. Great that this is being pushed forward!

RasmusBahbah commented 10 months ago

Thanks for your feedback, @patrickjwright, much appreciated. @ladsmund and I will do some testing of the QC on the pipeline, to determine if we are removing any good data, and tweak the percentiles and thresholds accordingly.

And yes, air temperature is the only variable with seasonal percentiles; the other variables use the whole time series. It would be interesting in a future update to test whether monthly percentiles or another configuration could improve the QC, and perhaps to apply it to the other variables as well.
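
To illustrate why seasonal percentiles matter for air temperature, here is a rough sketch, assuming meteorological seasons and a daily synthetic series. The names `SEASON_OF_MONTH` and `flag_seasonal_outliers`, the season mapping and the limit of 9.0 are hypothetical and are not taken from the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical month-to-season mapping (meteorological seasons).
SEASON_OF_MONTH = {12: "winter", 1: "winter", 2: "winter",
                   3: "spring", 4: "spring", 5: "spring",
                   6: "summer", 7: "summer", 8: "summer",
                   9: "autumn", 10: "autumn", 11: "autumn"}


def flag_seasonal_outliers(series, limit, lo=0.005, hi=0.995):
    """Flag values more than `limit` outside the [lo, hi] quantile
    range of their own season. Expects a DatetimeIndex."""
    months = pd.Series(series.index.month, index=series.index)
    season = months.map(SEASON_OF_MONTH)
    grouped = series.groupby(season)
    # transform broadcasts each season's quantile back to its rows
    lower = grouped.transform(lambda s: s.quantile(lo)) - limit
    upper = grouped.transform(lambda s: s.quantile(hi)) + limit
    return (series < lower) | (series > upper)


# Two years of synthetic daily temperatures: cold winters, mild summers.
index = pd.date_range("2020-01-01", "2021-12-31", freq="D")
doy = index.dayofyear.to_numpy()
t_air = pd.Series(-10.0 - 15.0 * np.cos(2 * np.pi * doy / 365.25), index=index)

# 3 deg C is plausible in summer but not in mid-January.
t_air.loc[pd.Timestamp("2021-01-15")] = 3.0

mask = flag_seasonal_outliers(t_air, limit=9.0)
```

In this synthetic case the January value falls inside the full-series range, so only a seasonal baseline catches it. Swapping the grouping key for the month would give the monthly variant mentioned above.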

I really appreciate your time on this! @ladsmund and I will make some improvements on the structure, the staticQC, and how to test it. After that, we will definitely include @PennyHow for the final review.

PennyHow commented 8 months ago

Can this PR be closed now that #183 is merged? @ladsmund @RasmusBahbah