ioos / ioos_qc

:ballot_box_with_check: :ocean: IOOS QARTOD and other Quality Control tests implemented in Python
https://ioos.github.io/ioos_qc/
Apache License 2.0

Add Neighbor test #37

Open jessicaaustin opened 3 years ago

jessicaaustin commented 3 years ago

This test compares sensor A to a co-located or sufficiently close sensor B measuring the same variable.

Note: There is also the Multi-variate test, which compares nearby sensors measuring different but related variables, such as sea water temperature and salinity.

From the QARTOD pH Manual:

[Image: Selection_891]

And the QARTOD T/S Manual:

[Image: Selection_892]

In summary:

jessicaaustin commented 3 years ago

I've added a proposed test algorithm above that's based on the description in the QARTOD manual.

That said, I think there's also a need for a "neighbor test" that compares the actual values from the neighbor to the values from the sensor being tested. For example, here's a use case from @kbailey-noaa:

A NOS/CO-OPS rep just alerted me that 2 AOOS partner stations in Valdez (Valdez Duck Flats and Valdez Marine Terminal) were measuring air pressure ~ 10 mb lower than nearby stations.

Turns out the partner hadn't corrected the data to sea level, and it was subsequently fixed at the station ~10/6. But I'm concerned because data were riding 10 mb low for however long, yet showing as 'passed' in AOOS QC tests: https://sensors.ioos.us/#metadata/100911/station/9/sensor

I assume this is because those QC tests don't include neighbor checks. I know there isn't a QARTOD manual for air pressure, and the neighbor test is usually only 'suggested' anyway. But for baro this neighbor test is sometimes the only way to catch suspect data. Is there any consideration for implementing neighbor checks?

Baro data tend to drift if the sensor isn't periodically calibrated, and it's concerning to think we'd have no way of catching that. For these 2 stations, the nearby CO-OPS Valdez station should provide a useful comparison.

In this case, it sounds like the offset was constant, so the neighbor check described in the manual wouldn't catch it. However, if we had a check where the algorithm was something like avg(abs(s1 - s2)) > thresh over some time range, then that would catch it, right?

This proposed variant of the neighbor test would have similar params to the attenuated signal test.
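
To make the idea concrete, here is a rough sketch of the avg(abs(s1 - s2)) > thresh check in plain Python. It is not an existing ioos_qc function: the name `neighbor_offset_flags`, the parameters (`suspect_threshold`, `fail_threshold`, `window`), and the choice of a trailing window counted in samples are all assumptions made for illustration. It also assumes both series are already aligned to the same timestamps, and it uses the QARTOD flag numbering (1 good, 2 not evaluated, 3 suspect, 4 fail, 9 missing).

```python
from statistics import fmean

# QARTOD flag numbering
GOOD, NOT_EVALUATED, SUSPECT, FAIL, MISSING = 1, 2, 3, 4, 9


def neighbor_offset_flags(s1, s2, suspect_threshold, fail_threshold, window):
    """Flag s1 where its mean absolute difference from neighbor s2 is too large.

    Hypothetical helper, not part of ioos_qc.
    s1, s2: equal-length sequences sampled at the same times (None = missing).
    window: number of trailing samples to average the difference over.
    """
    if len(s1) != len(s2):
        raise ValueError("s1 and s2 must have the same length")
    flags = []
    for i in range(len(s1)):
        if s1[i] is None:
            flags.append(MISSING)
            continue
        # Trailing window ending at sample i; skip pairs where either value is missing.
        start = max(0, i - window + 1)
        diffs = [
            abs(a - b)
            for a, b in zip(s1[start:i + 1], s2[start:i + 1])
            if a is not None and b is not None
        ]
        if not diffs:
            # No neighbor data in the window, so nothing to compare against.
            flags.append(NOT_EVALUATED)
            continue
        mean_diff = fmean(diffs)
        if mean_diff > fail_threshold:
            flags.append(FAIL)
        elif mean_diff > suspect_threshold:
            flags.append(SUSPECT)
        else:
            flags.append(GOOD)
    return flags


# Made-up air pressure values (mb): the tested station reads about 10 mb low.
station = [1003.1, 1002.8, 1002.5, 1002.9]
neighbor = [1013.0, 1012.9, 1012.6, 1012.8]
print(neighbor_offset_flags(station, neighbor,
                            suspect_threshold=2.0, fail_threshold=5.0, window=3))
# [4, 4, 4, 4]
```

Because the check looks at the size of the difference rather than how the two signals change over time, a constant offset like the Valdez one gets flagged. The thresholds above are arbitrary; sensible values would depend on the variable and on how far apart the two stations are.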