twitter / AnomalyDetection

Anomaly Detection with R
GNU General Public License v3.0

Trivial anomalies are NOT detected #67

Open AmitaiPerlstein opened 8 years ago

AmitaiPerlstein commented 8 years ago

x = 1:5000
x[4900:4910] = 3000
AnomalyDetectionVec(x, period=1440, direction = 'both', e_value = T, plot = T)

I get the following disappointing result:

$anoms
data frame with 0 columns and 0 rows

AmitaiPerlstein commented 8 years ago

Even a single straightforward deviation from a completely flat signal produces unexpected, made-up anomalies:

y = rep(0, 5000)
y[4900] = 10
AnomalyDetectionVec(y, period=1440, direction = 'both', e_value = T, plot = T)

$anoms
  index anoms expected_value
1   580     0              4
2  2020     0              4
3  3460     0              5
4  4900    10              5

AmitaiPerlstein commented 8 years ago

Most surprising: even when the expected values are requested (via the parameter e_value = T), $anoms returns "anomalies" whose expected values are _identical_ to the originally supplied ones:

z = rep(0,3000)
z[2900] = -5
AnomalyDetectionVec(z, period=1440, direction = 'both', e_value = T, plot = T)

$anoms
  index anoms expected_value
1  1291     0              0
2  1292     0              0
3  1293     0              0
4  1294     0              0
...

JasonAizkalns commented 8 years ago

@AmitaiPerlstein perhaps you can consolidate the examples into one post and format the code as you did in your SO post? In addition, you may want to change your title to "Trivial anomalies are not being detected."

samuel-liyi commented 8 years ago

I think this is due to the test this algorithm uses, which does not handle data with zero sigma: https://github.com/twitter/AnomalyDetection/blob/master/R/detect_anoms.R

# protect against constant time series
data_sigma <- func_sigma(data[[2L]])
if(data_sigma == 0) 
       break
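
To illustrate why a nearly flat series can trip the guard quoted above, here is a minimal sketch in base R. It assumes that `func_sigma` behaves like the median absolute deviation (`mad`), which is an assumption rather than something this thread confirms, and it applies the estimate to the raw series from the second example instead of to whatever residuals the package actually tests.

```r
# Assumption: the scale estimate is the median absolute deviation (MAD).
# The package may compute it on decomposed residuals; this uses the raw series.
y <- rep(0, 5000)
y[4900] <- 10

sigma_mad <- mad(y)  # 1.4826 * median(|y - median(y)|)
sigma_sd  <- sd(y)   # classical standard deviation, for comparison

# More than half of the points equal the median, so the MAD collapses to zero
# even though the series is not strictly constant.
mad_is_zero <- sigma_mad == 0
sd_is_positive <- sigma_sd > 0

print(c(mad = sigma_mad, sd = sigma_sd))
```

A robust scale estimate ignores a single spike by design, so under this assumption the spike cannot be scored against a zero sigma and the loop exits early.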

Maybe in this case there should be a more intuitive result instead of an empty data frame.
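
One way such a "more intuitive result" could look is a warning raised before detection is attempted. The sketch below is only an illustration: `warn_if_zero_sigma` is a hypothetical helper that does not exist in the AnomalyDetection package, and using `mad` as the default scale estimate is an assumption.

```r
# Hypothetical helper, NOT part of the AnomalyDetection package.
# Returns TRUE when the series has a usable (non-zero) scale estimate,
# otherwise emits a warning and returns FALSE.
warn_if_zero_sigma <- function(values, func_sigma = mad) {
  data_sigma <- func_sigma(values, na.rm = TRUE)
  if (isTRUE(data_sigma == 0)) {
    warning("Scale estimate is zero (series is constant or nearly so); ",
            "no anomalies can be tested.")
    return(FALSE)
  }
  TRUE
}

# Third example from this thread: flat series with one negative spike.
z <- rep(0, 3000)
z[2900] <- -5

# Capture the warning so the outcome can be inspected programmatically.
outcome <- tryCatch(warn_if_zero_sigma(z), warning = function(w) "warned")
print(outcome)
```

A caller could run a check like this before `AnomalyDetectionVec` and decide whether to fall back to a simpler rule for flat signals, rather than interpreting an empty `$anoms` as "no anomalies".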