This algorithm will need to be able to cater for burst data which show variability between bursts that we don't want to be flagging. @NickMortimer you suggested you have some idea on how to do this, please feel free to contribute here.
From Ken:
The matlab code is attached in ANMN_timeseries_despiking.zip. There are actually 2 files (despiking1, despiking2) with slightly different algorithms. From memory I found that despiking1 produced the best result but I think some further testing would be required before it is implemented in the toolbox. Further details are in the attached paper.
Comment in the context of applying https://github.com/aodn/imos-toolbox/issues/543#issuecomment-489494959 to burst WQM data.
I have taken the Mori Matlab code, applied the comments made on it, and modified the ellipsoid function to just return a logical flag; I then created two versions of despiking algorithms based on it.
Modified code here
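For reference, the common core of these algorithms (Goring and Nikora 2002; Mori et al. 2007) is a phase-space test: a point is a spike candidate when it falls outside an ellipsoid in (value, first difference, second difference) space whose semi-axes scale with the universal threshold lambda_U = sqrt(2 ln n) times the standard deviation of each coordinate. A minimal Python sketch of that idea, not the toolbox code: it omits the principal-axis rotation and iteration of the published method, and `phase_space_spikes` is my name for it.

```python
import numpy as np

def phase_space_spikes(u):
    """Simplified Goring & Nikora (2002) phase-space spike test.

    Flags points outside an ellipsoid in (u', du, d2u) space with semi-axes
    lambda_U * std of each coordinate, lambda_U = sqrt(2 ln n).  The
    principal-axis rotation of the full method is omitted for brevity.
    """
    u = np.asarray(u, dtype=float)
    n = len(u)
    lam = np.sqrt(2.0 * np.log(n))   # universal threshold
    up = u - np.median(u)            # centred series (median for robustness)
    du = np.gradient(up)             # first-difference surrogate
    d2u = np.gradient(du)            # second-difference surrogate
    a = lam * up.std()
    b = lam * du.std()
    c = lam * d2u.std()
    # points outside the ellipsoid are candidate spikes
    return (up / a) ** 2 + (du / b) ** 2 + (d2u / c) ** 2 > 1.0
```

Note that the differences also flag the immediate neighbours of a large spike, which is typical of this family of methods.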
So, applied per burst to WQM data (PSAL shown; the top plot is the application of despike1, the bottom plot the application of the modified Mori code; black circles are raw data, coloured symbols are what is flagged as a spike by the algorithms). Note the despike1 algorithm was applied with a window of 5.
And closer burst view.
You can see that where readings are very tight (varying only by sensor resolution), both algorithms will flag data, Parsheh more so, but this doesn't look like it would affect a calculated burst average (needs to be tested).
As always, there is no perfect algorithm that does everything; for example, when applied to DOXY, it doesn't flag those large vertical drops.
The plots shown were typical of spikes found for the WQM data. Some thoughts on:
Goring
Parsheh
Things to do
Interested to hear what people think.
References

Goring, D. G., and Nikora, V. I. 2002. "Despiking acoustic Doppler velocimeter data." J. Hydraul. Eng., 128(1), 117–126. DOI: 10.1061/(ASCE)0733-9429(2002)128:1(117).
Wahl, T. L. 2002. Discussion of "Despiking Acoustic Doppler Velocimeter Data" by Derek G. Goring and Vladimir I. Nikora. J. Hydraul. Eng., Vol. 128, No. 1, pp. 117–126.
Goring, D. G., and Nikora, V. I. 2002. Closure to "Despiking Acoustic Doppler Velocimeter Data." J. Hydraul. Eng., Vol. 128, No. 1, pp. 117–126.
Mori, N., Suzuki, T., and Kakuno, S. 2007. "Noise of acoustic Doppler velocimeter data in bubbly flows." J. Eng. Mech., 133(1), 122–125. DOI: 10.1061/(ASCE)0733-9399(2007)133:1(122).
Parsheh, M., et al. 2010. "Estimation of Power Spectra of Acoustic-Doppler Velocimetry Data Contaminated with Intermittent Spikes." J. Hydraul. Eng. DOI: 10.1061/(ASCE)HY.1943-7900.0000202.
Question: Are the acceptance criteria for de-spiking set by the user, automatically, or both?
Some points/actions:
The Goring/Wahl method appears appealing. However, it may not be applicable to all variables (say Chla), or may even be overkill, since river measurements are a bit different. Nonetheless, the decision should rest strictly on simplicity, quality, and the smallest set of tunable parameters.
The Wahl and Parsheh methods are for acoustic data; the spikes there are very different from those in PSAL and TEMP.
What is the source of the spikes in the WQM (or a salinity sensor)?
One needs to understand the source of the spikes in the signal before removing them; they may be real. It depends on how the sensor is mounted, where in the water column it is, and which part of the ocean the sensor is in.
Are the spikes in the oxygen data also in the salinity? Salinity is needed to calculate the oxygen, so spikes in salinity appear in oxygen; if they come from a freshwater outflow, the spikes may be real.
@petejan, I'm not choosing any champion, but the methods described in Goring (his and probably the wavelet threshold) should apply to all variables. Moreover, these methods were tested in river velocity measurements (more turbulence, less SNR). I doubt the SNR of their cases would be higher than T/S/Oxygen or even coastal velocity. Maybe for chlA or reflectance/acoustic/oceanic optical measurement, a different method should be used.
For example, I still think the wavelet threshold was not considered correctly in Goring. This method has a strong statistical underpinning, and their weak points, using std instead of MAD (as Wahl pointed out) and their spike substitution, point to weak assumptions.
The important thing is to define the typical spikes for each variable, have examples of that (cases), and test some.
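Wahl's std-versus-MAD point is easy to demonstrate numerically: the spikes themselves inflate a plain standard deviation used to set the threshold, while a MAD-based scale estimate stays close to the true noise level. A quick Python illustration with synthetic numbers (the 0.1 noise level and 5.0 spike size are made up for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.1, 1000)   # clean sensor noise, true sigma = 0.1
x[::100] += 5.0                  # inject ten large spikes

std_est = x.std()                                       # inflated by the spikes
mad_est = 1.4826 * np.median(np.abs(x - np.median(x)))  # robust sigma estimate

# the plain std is several times the MAD-based estimate of sigma
print(f"std = {std_est:.3f}, MAD-based sigma = {mad_est:.3f}")
```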
The issue with applying any sort of spike test is that it's not just variable dependent, but site and depth dependent. Some spikes are real (like rain on surface salinity) and some are sensor related.
@sspagnol,
@petejan, from your comment, I think you are concerned about the filtering aspect of this de-spiking process. Could you provide time-series cases of your specific concerns? Examples would be handy, such as what the precipitation signal looks like and an instrument spike pattern you have encountered.
general question: Are we here talking about tagging, filtering, or both?
AFAIK, this task is related to tagging spikes in time series of high-frequency data so we can put a QC flag on them. In this case, spikes here are impulse-like functions of short duration (tightly bounded in time).
Should we also be doing filtering? If so, it is better to consider uncertainty, since filtering can change the entire signal. If we consider uncertainty, we can filter within the uncertainty bounds somehow. An automatic noise level should also be defined.
Hence, there are more potential specific things to be raised as requirements:
@ocehugo the toolbox only does flagging on FV01 files. Averaging (FV02) should only happen later in the workflow when data has been QC'd.
Some files:
- a deep CTD with a 'drop out' in it; it's an unpumped sensor, and something probably fell through the sensor,
- a file from the surface in the deep ocean which probably has rain spikes (or they could be air bubbles),
- a deep unpumped sensor with some spikes which may or may not be real,
- a pumped sensor in the deep ocean, with spike-looking events that are probably just changes in water masses.
Sorry about the silence, got stuck in other tasks!
@petejan, before I start digging, could you also provide the rationale for the solution and any corrected versions of the problems (if available)? ATM, I don't care which software is used or whether the correction is manual; I only care about raising requirements and registering possible solutions.
I quickly checked the second file last week. What should be rain spikes looks like an easy fix. I didn't have time to look into the others, though.
What petejan is showing are spikes from an SBE37 and not spikes from a WQM, which are, to me, two distinct matters because of their sampling schemes.
Nevertheless, both instruments may show spikes that may need to be flagged by an automatic QC procedure. Below is an example from the Heron Island South mooring (GBRHIS), showing data from the SBE37 for the 24/03/2013 - 04/04/2014 period, and from the WQM for the following deployment period (05/04/2014 - 17/10/2014) at the same location. Rain data from a nearby weather station is provided below.
There is no specific reason for the presence of such PSAL spikes due to rain here. Also, the dilution of salinity in the ocean due to rainfall might not be seen below the surface, and a single one-second spike in one-minute burst data from the WQM is, to me, unlikely to be due to rain. What I observed, though, is the drop of salinity due to a cyclone in April 2014; but this is not seen as a spike, rather as a continued drop in salinity that then recovered.
Testing distinct despiking methods on the above WQM>PSAL dataset from the GBRHIS-1404 deployment shows that, overall, a simple statistical threshold method could be used to automatically QC spikes from burst data. Further details will follow in a document.
Still, I would certainly suggest recording the formula, or at least the name, of the despiking method used (if any) in the corresponding metadata and the NetCDF file. This will be much more objective and traceable for any further reprocessing or for the next product level.
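One common form of such a per-burst statistical threshold is a median/MAD test. The Python sketch below is illustrative only, not the method actually tested for GBRHIS; `burst_spike_flags`, the `k` parameter, and the example values are mine:

```python
import numpy as np

def burst_spike_flags(burst, k=3.0):
    """Flag samples further than k robust standard deviations from the burst median.

    Illustrative sketch: the MAD-based sigma makes the threshold itself
    insensitive to the very spikes being hunted.
    """
    burst = np.asarray(burst, dtype=float)
    med = np.median(burst)
    sigma = 1.4826 * np.median(np.abs(burst - med))  # MAD scaled to a sigma
    if sigma == 0.0:
        # burst varying by less than the sensor resolution: flag nothing,
        # avoiding over-flagging of very tight bursts
        return np.zeros(burst.shape, dtype=bool)
    return np.abs(burst - med) > k * sigma
```

For example, a 60-sample PSAL burst around 35.2 with one fresh spike flags only the spike (plus, at most, a few samples in the tails of the noise).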
@ocehugo I'm not sure that the rain needs 'fixing', unless you're saying that you can create a spike filter which will not be triggered by these spikes.
@VvanDongenV The data look like valid salinity measurements to me; it's not an instrument error but fresher water passing and being mixed into the higher-salinity water. I would be marking these as probably good data, if not good data.
@VvanDongenV - thanks for the example information - I will use this period for testing the algorithms. Would you say this SBE37 data is a typical case, a 2std case or a 10std case based on your experience?
The spikes between Oct and Dec are, IMO, simple to remove with several automatic procedures: the acceleration in salinity is unrealistic even for surface measurements in the tropics. The same goes for the end of March [SBE37, top figures]. I doubt it's physical in any useful way; a simple 10-std range or a regional range would filter these easily.
On the other hand, the more continuous and sequential spikes in your last plots (GBRHIS1404 after May 2014) appear real to me. The daily acceleration is high, but the sensor is continuously measuring the data in a decreasing and increasing manner. The range is also acceptable, and the data are measured within a low-frequency event (a dip in PSAL).
All the other singleton spikes appear to be noise but could be limited representations of some physical process; for example, the distribution of spikes is higher in Apr-Jun. Maybe there is some rapid flow around the mooring (tides?) that can flush accumulated freshwater? This appears only partially captured by the sampling scheme.
Nonetheless, I doubt that these spikes would be useful in 95% of the cases. The other 5% of cases are more specialised stuff - people planning better sampling or digging in. I would go with 95% of the users in this case.
@petejan, the question is, where are they useful? Any examples of conclusions based on these isolated spikes? Most people would remove them anyway. It's good to remember that we are not removing anything, just tagging them. I would happily flag them as "rain" and others "instrument noise/spike" if we can robustly do it.
PS1: The above is one case where several layers of QC flags would be useful. PS2: despiking2 appears good but misses some of them. Maybe it's just a matter of 'windowing' it.
@ocehugo for me the question is: why remove them if they are a measure of the quantity you're trying to measure? Are you just trying to remove them to make the data 'look good'? One reason not to remove them is that it will upset the averages, especially in the case where salinity spikes are due to rain and are always towards zero. Surface rain on the ocean is an important dampener of mixing in the ocean.
@petejan,
@ocehugo for me the question is: why remove them if they are a measure of the quantity you're trying to measure? Are you just trying to remove them to make the data 'look good'?
You are not measuring a process if the only thing you can get is a single intermittent measurement within some burst. To separate the "surface rain" process, you have to locate it and sample it accordingly, which is not the case for the singleton spikes.
The spiky, low-frequency occurrence indicates an inadequate sampling scheme. The spike is a systematic error/side-effect of physical or instrumental nature. In this case, tagging allows separation and corrections to be made. It allows us to be diligent.
One reason not to remove them is that it will upset the averages, especially in the case where salinity spikes are due to rain and are always towards zero.
By taking the average with the spikes included, you are disregarding the skewness created by the spikes. If the distribution is skewed, the mean is no longer robust; it doesn't represent the typical value.
Do you want to account for spikes as real variability? Use the standard deviation of the burst but the median of the burst as the typical value; this accounts for the skewness. If you use the mean, the computed value will have an uncertainty higher than any individual measurement of the burst.
Another way is to remove the spike and then take the mean. This corrects the skewness, and the standard deviation of the burst will be lower. This is usually what we expect when we decide to average data: to reduce instrumentation error, variability, and outlier influence.
Smooth data is a good thing - it means the sampling scheme (and QC) was appropriate to the main objective in question.
Any modelling framework also assumes smoothness of some sort - normal, non-skewed distributions.
Finally, it's computationally and practically cheaper to add more points to the mean/std estimates.
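The mean-versus-median point above is easy to check numerically. A Python illustration with made-up burst values (a 35.2 PSU background and one fresh spike toward zero):

```python
import numpy as np

# A one-minute burst: 59 good samples near 35.2 plus one fresh spike
rng = np.random.default_rng(1)
burst = 35.2 + 0.01 * rng.standard_normal(60)
burst[30] = 30.0

mean_all = burst.mean()                      # dragged low by the single spike
median_all = np.median(burst)                # robust to the single spike
mean_despiked = np.delete(burst, 30).mean()  # mean after removing the spike

print(f"mean={mean_all:.3f}  median={median_all:.3f}  despiked mean={mean_despiked:.3f}")
```

Here the burst mean is biased low by about 5.2/60, roughly 0.087 PSU, while the median and the despiked mean both stay at the background value.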
The rationale to support spike tagging is simple:
The QC procedure needs to go into a higher level to improve outreach - it's not my call but a community one.
Please continue the discussion/requests regarding despiking data for TimeSeries in #661.
PS: Ken's routines and new despiking tests are included in the 2.6.6 release. At the moment, this is only for non-burst data, but we are trying to sort it out for burst data in the next release.
An interactive tool for viewing the effect of the parameters on the spikes:
```matlab
%cd /Users/pete/ABOS/imos-toolbox
%addpath(genpath('.'))
file = '~/cloudStor/SOTS-Temp-Raw-Data/SOFS-2-2011/netCDF/IMOS_ABOS-SOTS_CSTIP_20111104_SOFS_FV01_SOFS-2-2011-SBE37SM-RS485-03707409-2m_END-20120807_C-20200521.nc';
temp = ncread(file, 'TEMP');
time = ncread(file, 'TIME') + datetime(1950,1,1);

% data figure
f = figure(1); clf
%ax = axes('Parent',f,'position',[0.13 0.39 0.77 0.54]);
plot(time, temp);
global h stdfactor wind
stdfactor = 4.0;
wind = 10;
qc = imosSpikeClassifierHampel(temp, wind, stdfactor);
hold on
h = plot(time(qc), temp(qc), 'or');

% slider figure
fc = figure(2); clf
b = uicontrol('Parent',fc,'Style','slider','Position',[81,54,419,23], 'value', stdfactor, 'min', 0, 'max', 20);
bgcolor = f.Color;
bl1 = uicontrol('Parent',fc,'Style','text','Position',[50,54,23,23], 'String','0','BackgroundColor',bgcolor);
bl2 = uicontrol('Parent',fc,'Style','text','Position',[500,54,23,23], 'String','20','BackgroundColor',bgcolor);
bl3 = uicontrol('Parent',fc,'Style','text','Position',[240,35,100,23], 'String','stdfactor','BackgroundColor',bgcolor);
% pass [] for the unchanged parameter so the current global value is used;
% an anonymous function would otherwise capture the value at creation time
b.Callback = @(es,ed) update_slider(time, temp, es.Value, []);
c = uicontrol('Parent',fc,'Style','slider','Position',[81,114,419,23], 'value', wind, 'min', 1, 'max', 20);
cl1 = uicontrol('Parent',fc,'Style','text','Position',[50,114,23,23], 'String','1','BackgroundColor',bgcolor);
cl2 = uicontrol('Parent',fc,'Style','text','Position',[500,114,23,23], 'String','20','BackgroundColor',bgcolor);
cl3 = uicontrol('Parent',fc,'Style','text','Position',[240,85,100,23], 'String','window','BackgroundColor',bgcolor);
c.Callback = @(es,ed) update_slider(time, temp, [], es.Value);

function update_slider(time, temp, new_stdfactor, new_wind)
global h stdfactor wind
if isempty(new_stdfactor), new_stdfactor = stdfactor; end
if isempty(new_wind), new_wind = wind; end
disp([new_stdfactor floor(new_wind)])
h.delete()
qc = imosSpikeClassifierHampel(temp, floor(new_wind), new_stdfactor);
figure(1);
h = plot(time(qc), temp(qc), 'or');
stdfactor = new_stdfactor;
wind = floor(new_wind);
end
```
[Screenshots: data figure and slider figure]
@petejan - thanks. Yes, that's the idea of the "preview window" I mentioned. I just need to change the current one and adapt a button to reach this kind of thing.
At the last QC summit, Ken Ridgway mentioned he had successfully used an automated despiking method in his processing/QC of T and S from the NRSYON and NRSDAR mooring data.
I have contacted Ken today to see if he could share his algorithm for inclusion in the toolbox.