twitter / AnomalyDetection

Anomaly Detection with R
GNU General Public License v3.0

Setting e_value=T causes "differing number of rows" error #46

Open MarkEdmondson1234 opened 9 years ago

MarkEdmondson1234 commented 9 years ago

Hi, great package.

However, when trying to extract the expected values from my dataset, I get this error:

## a_data holds daily count observations
> str(a_data)
'data.frame':   30 obs. of  2 variables:
 $ date  : POSIXct, format: "2013-01-15 01:00:00" "2013-01-16 01:00:00" "2013-01-17 01:00:00" ...
 $ metric: num  192 123 196 193 172 195 123 158 103 115 ...

## works
> AnomalyDetectionTs(a_data, max_anoms=0.02, direction='both')
$anoms
   timestamp anoms
1 2013-01-20   195

$plot
NULL

## error
> AnomalyDetectionTs(a_data, max_anoms=0.02, direction='both', e_value = T)
Error in data.frame(timestamp = all_anoms[[1]], anoms = all_anoms[[2]],  : 
  arguments imply differing number of rows: 1, 0
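
For context, this message is raised by base R's data.frame() when the columns passed to it have different lengths. The sketch below reproduces the same message outside the package, assuming the expected-value column comes back empty while one anomaly was found. The variable names are illustrative and are not the package's internals.

```r
# Illustrative stand-ins only; these are NOT the package's internal objects.
ts       <- as.POSIXct("2013-01-20", tz = "UTC")  # one detected anomaly
anoms    <- 195
expected <- numeric(0)  # assumption: no expected value was matched

# data.frame() refuses to combine columns of length 1 and length 0.
msg <- tryCatch(
  data.frame(timestamp = ts, anoms = anoms, expected_value = expected),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If that assumption holds, a length mismatch between the anomaly rows and the expected values would account for the "1, 0" in the error.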

The same command works fine with the demo raw_data in the package:

> AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', e_value=T)
$anoms
              timestamp    anoms expected_value
1   1980-09-25 16:05:00  21.3510            129
2   1980-09-29 06:40:00 193.1036             97
3   1980-09-29 21:44:00 148.1740             96
...

> str(raw_data)
'data.frame':   14398 obs. of  2 variables:
 $ timestamp: POSIXlt, format: "1980-09-25 14:01:00" "1980-09-25 14:02:00" "1980-09-25 14:03:00" ...
 $ count    : num  182 176 184 178 165 ...

Here is a copy of the data used above (limited to 30 rows). The original data has 900 observations.

                  date metric
1  2013-01-15 01:00:00    192
2  2013-01-16 01:00:00    123
3  2013-01-17 01:00:00    196
4  2013-01-18 01:00:00    193
5  2013-01-19 01:00:00    172
6  2013-01-20 01:00:00    195
7  2013-01-21 01:00:00    123
8  2013-01-22 01:00:00    158
9  2013-01-23 01:00:00    103
10 2013-01-24 01:00:00    115
11 2013-01-25 01:00:00    138
12 2013-01-26 01:00:00     95
13 2013-01-27 01:00:00    121
14 2013-01-28 01:00:00    143
15 2013-01-29 01:00:00    118
16 2013-01-30 01:00:00    110
17 2013-01-31 01:00:00    107
18 2013-02-01 01:00:00    120
19 2013-02-02 01:00:00     91
20 2013-02-03 01:00:00     93
21 2013-02-04 01:00:00    149
22 2013-02-05 01:00:00    112
23 2013-02-06 01:00:00    109
24 2013-02-07 01:00:00    109
25 2013-02-08 01:00:00     90
26 2013-02-09 01:00:00     74
27 2013-02-10 01:00:00     85
28 2013-02-11 01:00:00    113
29 2013-02-12 01:00:00    107
30 2013-02-13 01:00:00    110
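
To make the report easier to reproduce, the 30 rows above can be rebuilt as a data frame roughly like this. It is a minimal sketch: the UTC time zone is an assumption (the original time zone is not shown), and the column names follow the str(a_data) output.

```r
# Rebuild the 30-row sample; tz = "UTC" is an assumption.
a_data <- data.frame(
  date = seq(as.POSIXct("2013-01-15 01:00:00", tz = "UTC"),
             by = "day", length.out = 30),
  metric = c(192, 123, 196, 193, 172, 195, 123, 158, 103, 115,
             138,  95, 121, 143, 118, 110, 107, 120,  91,  93,
             149, 112, 109, 109,  90,  74,  85, 113, 107, 110)
)
str(a_data)

# With the AnomalyDetection package loaded, the failing call from above is:
# AnomalyDetectionTs(a_data, max_anoms = 0.02, direction = 'both', e_value = TRUE)
```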

odinuv commented 9 years ago

I also ran into this and sent the pull request above. @MarkEdmondson1234, it works with your sample data too.