Open wwieder opened 2 years ago
@negin513 not critical, but did you ever try applying these masks to the plots of NEON data?
Thanks @wwieder for the reminder. I had not seen this before. I will work on applying these filters, and I am wondering what the best way to do this would be. We eventually want these filters for both the Bokeh and matplotlib plots, so writing a function such as remove_outliers and calling it during pre-processing probably makes the most sense.
What I originally had in mind for filtering outliers was using the standard deviation instead of fixed values. I am not sure which method (fixed values for each variable vs. automatic outlier detection) works better or is easier.
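To make the standard-deviation option concrete, here is a minimal sketch in plain Python. The function name mask_sigma and the default of 3 sigma are assumptions for illustration, not code from the existing scripts.

```python
import math
import statistics


def mask_sigma(values, n_sigma=3.0):
    """Return a copy of values with points farther than n_sigma
    sample standard deviations from the mean replaced by NaN.

    Hypothetical helper: name and defaults are illustrative only.
    """
    # Ignore values that are already missing when computing statistics.
    finite = [v for v in values if not math.isnan(v)]
    if len(finite) < 2:
        return list(values)  # not enough data to estimate a spread

    mean = statistics.fmean(finite)
    std = statistics.stdev(finite)
    if std == 0:
        return list(values)  # constant series, nothing to mask

    return [
        v if math.isnan(v) or abs(v - mean) <= n_sigma * std else math.nan
        for v in values
    ]


# Example: twenty ordinary readings followed by one spike.
cleaned = mask_sigma([1.0] * 20 + [100.0])
```

One caveat with this approach is that large spikes inflate the mean and standard deviation they are tested against, so in short or very noisy records a spike can fall inside the 3-sigma band. Fixed per-variable limits do not have that dependence on the data.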
For automatic outlier detection, there are other options available as well:
An example of using one-class classification for outlier detection: https://blogs.sap.com/2020/12/29/outlier-detection-with-one-class-classification-using-python-machine-learning-client-for-sap-hana/
I like the idea of a remove_outliers function. At this stage I'd keep it simple and make it really obvious what we're doing. Using fixed values or the 3-sigma threshold will hopefully catch the bulk of the crazy spikes in the measurements.
To mask out absurd measurements from NEON data @ddurden recommended using these min and max thresholds that are used in Ameriflux data processing.
@negin513, it's not urgent, but can you bring these thresholds into the scripts that plot NEON observations?
Flags used for Ameriflux data:

    Rng <- list()  # container for the limits (assumed not to be defined elsewhere)

    # Lower limits
    Rng$Min <- data.frame(
      "FC" = -100,          # [umol m-2 s-1]
      "SC" = -100,          # [umol m-2 s-1]
      "NEE" = -100,         # [umol m-2 s-1]
      "LE" = -500,          # [W m-2]
      "H" = -500,           # [W m-2]
      "USTAR" = 0,          # [m s-1]
      "CO2" = 200,          # [umol mol-1]
      "H2O" = 0,            # [mmol mol-1]
      "WS_1_1_1" = 0,       # [m s-1]
      "WS_MAX_1_1_1" = 0,   # [m s-1]
      "WD_1_1_1" = -0.1,    # [deg]
      "T_SONIC" = -55.0     # [C]
    )

    # Upper limits
    Rng$Max <- data.frame(
      "FC" = 100,           # [umol m-2 s-1]
      "SC" = 100,           # [umol m-2 s-1]
      "NEE" = 100,          # [umol m-2 s-1]
      "LE" = 1000,          # [W m-2]
      "H" = 1000,           # [W m-2]
      "USTAR" = 5,          # [m s-1]
      "CO2" = 800,          # [umol mol-1]
      "H2O" = 100,          # [mmol mol-1]
      "WS_1_1_1" = 50,      # [m s-1]
      "WS_MAX_1_1_1" = 50,  # [m s-1]
      "WD_1_1_1" = 360,     # [deg]
      "T_SONIC" = 45.0      # [C]
    )
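For the Python plotting scripts, the same limits could be carried over as a dictionary and applied in one pre-processing step. The sketch below is only an illustration: the remove_outliers signature, the VALID_RANGE name, and the assumption that the data are keyed by the Ameriflux variable names are all hypothetical, and only a subset of the variables is shown.

```python
import math

# (min, max) limits copied from the Ameriflux ranges above; subset only.
VALID_RANGE = {
    "NEE": (-100.0, 100.0),    # umol m-2 s-1
    "LE": (-500.0, 1000.0),    # W m-2
    "H": (-500.0, 1000.0),     # W m-2
    "USTAR": (0.0, 5.0),       # m s-1
    "T_SONIC": (-55.0, 45.0),  # C
}


def remove_outliers(data, valid_range=VALID_RANGE):
    """Return a copy of data with out-of-range values set to NaN.

    data is assumed to map variable names to lists of floats.
    Variables without an entry in valid_range are copied unchanged.
    """
    cleaned = {}
    for name, values in data.items():
        if name not in valid_range:
            cleaned[name] = list(values)
            continue
        lo, hi = valid_range[name]
        # NaN fails the comparison, so missing values stay NaN.
        cleaned[name] = [v if lo <= v <= hi else math.nan for v in values]
    return cleaned


# Example usage with made-up numbers.
obs = {"NEE": [5.0, -250.0, 99.0], "LE": [1200.0, 300.0]}
obs_clean = remove_outliers(obs)
```

Masking with NaN rather than dropping points keeps the time axis intact, so the cleaned data can be passed to either the Bokeh or the matplotlib plotting code without realigning anything.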