Netflix / Surus

Apache License 2.0

Error in AnomalyDetection.rpca #7

Open punsiitg opened 9 years ago

punsiitg commented 9 years ago

Hi,

I am using the R version of Anomaly Detection and am coming across an error. Here are the details:

Error:

> AnomalyDetection.rpca(ts)
Error in data.frame(X.transform = X.transform, L.transform = L.transform, :
  arguments imply differing number of rows: 231, 227
In addition: Warning message:
In matrix((j - j.global.mean)/j.global.sd, nrow = frequency) :
  data length [227] is not a sub-multiple or multiple of the number of rows [7]

Here's how to reproduce it:

ts <- c(54,28,56,98,105,3,96,70,56,56,30,48,42,42,70,2,48,63,70,66,99,1,54,112,21,56,3,9,4,8,14,15,2,20,8,8,1,3,8,6,6,10,1,8,9,7,6,9,6,18,8,16,28,30,6,4,32,20,16,16,88,2,6,16,12,12,20,4,2,16,18,14,12,18,2,8,24,30,8,10,81,196,6,32,21,27,8,7,9,9,40,8,40,24,10,6,18,52,120,196,180,27,32,100,112,96,484,2,18,88,78,78,120,18,16,108,91,72,108,10,654,2169,1252,2480,4704,4995,612,174,3328,2940,2424,2504,8008,104,666,2184,1806,1866,2930,380,87,1736,2817,2135,1896,2736,187,18,90,40,80,140,150,30,20,160,100,80,80,132,3,9,24,18,18,30,6,3,24,27,21,18,27,3,117,76,152,252,285,39,4,256,190,144,144,572,2,36,104,102,114,170,34,13,136,162,140,126,171,15,258,513,404,808,1344,1425,270,152,912,1000,736,720,4312,86,168,776,564,576,990,188,86,472,918,728,576,819,89)

AnomalyDetection.rpca(ts)

Kindly let me know how the issue can be resolved. I am stuck on a project due to the above issue.

Thanks.

mmolaro commented 9 years ago

The full signature for AnomalyDetection.rpca is given below. It assumes the data has a periodicity (called frequency) of 7 unless another value is passed, so the length of your data must be divisible by 7 if you don't pass a value for frequency when calling the function. Your series has 227 points, which is not a multiple of 7; that is where the 231 vs. 227 row mismatch comes from. I'd recommend using frequency = 1 if you don't have an expectation of periodicity in your data.

AnomalyDetection.rpca(X, frequency = 7, dates = NULL,
                      autodiff = T, forcediff = F, scale = T,
                      L.penalty = 1,
                      s.penalty = 1.4/sqrt(max(frequency, ifelse(is.data.frame(X), nrow(X), length(X))/frequency)),
                      verbose = F)
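As a rough sketch of the workaround described above, one option is to check the series length against the frequency and drop the oldest few points so that a whole number of periods remains. `trim_to_frequency` is a hypothetical helper written for this illustration, not part of the package, and the stand-in series only mimics the length of the data in the report.

```r
# Hypothetical helper (not part of the package): keep only the most recent
# observations so that length(x) is a whole multiple of `frequency`.
trim_to_frequency <- function(x, frequency = 7) {
  remainder <- length(x) %% frequency
  if (remainder == 0) return(x)
  # Drop the oldest `remainder` points from the front of the series.
  x[(remainder + 1):length(x)]
}

x <- seq_len(227)                 # stand-in series, same length as in the report
length(x) %% 7                    # 3, so 227 is not a multiple of 7
x_trimmed <- trim_to_frequency(x, 7)
length(x_trimmed)                 # 224, i.e. 32 full weeks

# With the package loaded, either call below should sidestep the length check
# (assumption based on the explanation above, not verified here):
# AnomalyDetection.rpca(x_trimmed)           # default frequency = 7
# AnomalyDetection.rpca(x, frequency = 1)    # no assumed periodicity
```

Trimming discards a few observations, while frequency = 1 keeps all of them but gives up the weekly structure, so the better choice depends on whether the data really has a 7-point cycle.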

elsahxh commented 6 years ago

According to the documentation, 'If X is a vector it will be cast to a matrix of dimension frequency by length(X)/frequency'. Based on my understanding, setting frequency = 1 will convert the input into a 1-d array. Does it make sense to perform PCA on a 1-d array?
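To make the shapes in question concrete, here is a small base-R sketch of the cast the documentation describes, using `matrix()` directly rather than the package's internals (which may do more than this). It only shows the dimensions involved; it does not settle whether the decomposition is meaningful for a single row.

```r
x <- as.numeric(seq_len(227))     # stand-in series of the same length as in the report

# frequency = 1: a single row with one column per observation.
m1 <- matrix(x, nrow = 1)
dim(m1)                           # 1 227

# A 1 x n matrix has only one singular value, so any low-rank part is rank <= 1.
length(svd(m1)$d)                 # 1

# frequency = 7 with 227 points: matrix() recycles values to fill 7 x 33 = 231
# cells and emits the "not a sub-multiple or multiple" warning from the report.
m7 <- suppressWarnings(matrix(x, nrow = 7))
dim(m7)                           # 7 33
```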

mathurtx commented 6 years ago

Can you pad the values with the median of the previous weeks if the data is not divisible by 7 or the periodicity? Or can you follow a sliding window approach with days as rows and day, week, month values as columns?
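One way to read the padding idea is to fill the missing slots of the last, incomplete period with the median of the same slot across the earlier complete periods. The sketch below assumes that reading; `pad_to_frequency` is a hypothetical helper, not something the package provides, and the padded points are synthetic, so any anomalies flagged on them should be ignored.

```r
# Hypothetical helper: extend x to a whole number of periods by filling the
# missing trailing slots with the per-slot median of the complete periods.
pad_to_frequency <- function(x, frequency = 7) {
  stopifnot(length(x) >= frequency)        # need at least one complete period
  remainder <- length(x) %% frequency
  if (remainder == 0) return(x)
  n_full <- length(x) - remainder
  # One column per complete period, one row per slot within the period.
  full <- matrix(x[seq_len(n_full)], nrow = frequency)
  slot_median <- apply(full, 1, median)
  missing_slots <- (remainder + 1):frequency
  c(x, slot_median[missing_slots])
}

x <- as.numeric(seq_len(227))     # stand-in series of the same length as in the report
x_padded <- pad_to_frequency(x, 7)
length(x_padded)                  # 231, i.e. 33 full weeks

# With the package loaded (call shown for illustration only):
# AnomalyDetection.rpca(x_padded, frequency = 7)
```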