twitter / AnomalyDetection

Anomaly Detection with R
GNU General Public License v3.0

Removing leading NA's and subtracting the median #19

Open asstergi opened 9 years ago

asstergi commented 9 years ago

Hi guys,

I came across the package, which looks great. I have two questions about the code in 'detect_anoms.R':

1) In line 51, any leading NA's are replaced by 1. Shouldn't it be 0 (zero)?

2) In line 37, the median is subtracted from the data. In lines 72-80, the median is subtracted again. Is this correct?

I don't know the details of the S-H-ESD algorithm, so excuse me if I'm wrong!

Thanks!

owenvallis commented 9 years ago

Thanks for checking out the package.

Line 37: We use a modification to STL which removes the median and the STL seasonal component from the original data. This is how we derive the residual used in the ESD section.

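Lines 72-80: This is how we calculate Grubbs' test statistic. We replaced the mean and standard deviation with the median and the median absolute deviation (MAD), so the median subtracted here is the median of the residual, not of the original data.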

Line 51 was done a while back, so we'll go back and look at it again.

owenvallis commented 9 years ago

@asstergi The 1s were being inserted to avoid plotting errors when we log-transformed the y-axis. However, @ahardjasa submitted a patch that now allows us to support the log transform with 0s in the data.

I updated the code to reflect this, and will submit a patch soon.
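For anyone wondering why 1 was a convenient fill value, here is a small Python sketch of the problem. The helper names and the sample series are hypothetical, and this does not show how the patch itself handles zeros. In R, `log(0)` evaluates to `-Inf`; in Python, `math.log10(0)` raises `ValueError`, so the sketch returns `None` where the log is undefined.

```python
import math

def fill_leading_missing(values, fill):
    """Replace only the leading None entries with `fill`."""
    out = list(values)
    for i, v in enumerate(out):
        if v is not None:
            break
        out[i] = fill
    return out

def log10_or_none(v):
    """log10 for positive v; None where the log is undefined."""
    return math.log10(v) if v > 0 else None

series = [None, None, 100.0, 1000.0]   # made-up counts with leading gaps

with_ones = fill_leading_missing(series, 1)
with_zeros = fill_leading_missing(series, 0)

# Filling with 1 gives a finite value (log10(1) == 0) on a log axis;
# filling with 0 leaves points that a plain log transform cannot place.
print([log10_or_none(v) for v in with_ones])
print([log10_or_none(v) for v in with_zeros])
```

Filling with 0 keeps the data honest (no counts were observed) but requires the plotting code to cope with zeros on a log scale, which is what the thread says the patch made possible.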

Cheers,