twitter / AnomalyDetection

Anomaly Detection with R
GNU General Public License v3.0

period problem with AnomalyDetectionTs #45

Open blatoo opened 9 years ago

blatoo commented 9 years ago

Hi everybody,

After successfully running the example, I created my own data set, myData, which has the same structure as raw_data. However, it still differs slightly in two places.

It looks like:

    1 1970-01-01 01:00:55 NA
    2 1970-01-01 01:00:10 NA
    3 1970-01-01 01:00:25 2.871
    4 1970-01-01 01:00:40 2.654
    5 1970-01-01 01:00:55 3.060
    6 1970-01-01 01:00:10 9.074

After I ran the same command as in the example:

res = AnomalyDetectionTs(myData, max_anoms=0.02, direction='both', plot=TRUE)

I got the error message:

Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, : must supply period length for time series decomposition

How can I fix this problem?

If I don't know the period, can I still find the anomalies?

Thanks very much for the great work!

Best Regards

Conny

owenvallis commented 9 years ago

Hi Conny,

I would suggest trying the AnomalyDetectionVec function instead of the TS function. At the moment, the TS function aggregates secondly data into minutely data. The Vec function simply takes a list of values, and then treats them as a time series without the timestamp column. A few things might help when using the Vec function:

Hope that helps.
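A minimal sketch of the suggestion above, assuming the `AnomalyDetection` package is installed. The data frame here is a synthetic stand-in for `myData` (15-second sampling is inferred from the excerpt, and the hourly `period` of 240 observations is an assumption, not something stated in the thread). The call is guarded so the data preparation still runs without the package.

```r
# Synthetic stand-in for myData: timestamp + value, with two leading NAs.
set.seed(1)
n <- 2000
period <- 240                      # assumed: one hour of 15-second samples
signal <- sin(2 * pi * seq_len(n) / period) + rnorm(n, sd = 0.1)
signal[1500] <- 9                  # inject one obvious spike
signal[1:2] <- NA                  # mimic the leading NA rows in the issue

myData <- data.frame(
  timestamp = as.POSIXct("1970-01-01 01:00:00", tz = "UTC") + 15 * (0:(n - 1)),
  count = signal
)

# AnomalyDetectionVec takes a plain numeric vector, so drop the timestamp
# column and the leading NAs (dropping NAs in the middle would shift the
# seasonal alignment, so this only suits leading or trailing gaps).
values <- myData$count[!is.na(myData$count)]

if (requireNamespace("AnomalyDetection", quietly = TRUE)) {
  res <- AnomalyDetection::AnomalyDetectionVec(
    values, max_anoms = 0.02, period = period,
    direction = "both", plot = FALSE
  )
  print(res$anoms)
}
```

Unlike the Ts function, the Vec function needs `period` passed explicitly, and the series should cover at least two full periods.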

blatoo commented 9 years ago

Hi Owenvallis,

Thanks very much for the answer! But I still have another stupid question: what is ESD?

terrytangyuan commented 9 years ago

@blatoo ESD stands for Extreme Studentized Deviate. The primary algorithm of this package is Seasonal Hybrid ESD (S-H-ESD), which builds on the generalized ESD test.
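For readers unfamiliar with the underlying test, here is a rough base-R sketch of the plain generalized ESD (Extreme Studentized Deviate) test. It is not the package's implementation: as far as I understand, S-H-ESD first removes a seasonal component and uses the median and MAD in place of the mean and standard deviation. The function name `generalized_esd` is made up for this illustration.

```r
# Generalized ESD test: flags up to k outliers in a roughly normal sample.
generalized_esd <- function(x, k, alpha = 0.05) {
  n <- length(x)
  idx <- seq_along(x)          # original positions of the remaining values
  removed <- integer(0)        # positions removed so far, in removal order
  n_out <- 0                   # largest i whose statistic exceeds its cutoff

  for (i in seq_len(k)) {
    dev <- abs(x - mean(x))
    j <- which.max(dev)
    r_i <- dev[j] / sd(x)      # test statistic R_i

    # Critical value lambda_i from the t distribution.
    p <- 1 - alpha / (2 * (n - i + 1))
    t_crit <- qt(p, df = n - i - 1)
    lambda_i <- (n - i) * t_crit /
      sqrt((n - i - 1 + t_crit^2) * (n - i + 1))

    removed <- c(removed, idx[j])
    if (r_i > lambda_i) n_out <- i

    # Drop the most extreme point and repeat on the rest.
    x <- x[-j]
    idx <- idx[-j]
  }
  removed[seq_len(n_out)]
}

x <- c(10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 25.0)
generalized_esd(x, k = 2)      # flags position 11, the value 25.0
```

The `max_anoms` argument in the package plays the role of `k` here, expressed as a fraction of the data rather than a count.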

blatoo commented 9 years ago

Hi @terrytangyuan , Thanks very much!!!

terrytangyuan commented 8 years ago

Could anyone close this? Thanks.

tintojames commented 8 years ago

Hey, I also ran into the same issue, even though my input time series has a regular interval. Please note that it does not contain any NA values. It looks like this issue is still open.

evanhenry commented 7 years ago

Hello, I am experiencing a similar error with 1 Hz data. Have there been any developments on this issue since February?

`> str(data)
'data.frame': 3600 obs. of 2 variables:
 $ V1: POSIXct, format: "2016-10-29 07:00:00" "2016-10-29 07:00:01" "2016-10-29 07:00:02" ...
 $ V2: num 28.7 28.7 28.7 28.7 28.7 ...
> head(data)
                   V1    V2
1 2016-10-29 07:00:00 28.69
2 2016-10-29 07:00:01 28.69
3 2016-10-29 07:00:02 28.70
4 2016-10-29 07:00:03 28.70
5 2016-10-29 07:00:04 28.70
6 2016-10-29 07:00:05 28.71

> data_anomaly = AnomalyDetectionTs(data, max_anoms=0.02, direction="pos", plot=TRUE, e_value = T)
Error in detect_anoms(all_data[[i]], k = max_anoms, alpha = alpha, num_obs_per_period = period, : must supply period length for time series decomposition`

jj7353 commented 7 years ago

This worked for me.

res = AnomalyDetectionVec(group_prof_10252016[,2], max_anoms=0.02, period=1440, direction='both', only_last=FALSE, plot=TRUE)

On Sat, Oct 29, 2016 at 3:38 PM, Evan Henry notifications@github.com wrote:

I also noticed that changing the time period in the sample data and code here results in the same error: https://github.com/pablo14/anomaly_detection_post


aaishaosman commented 7 years ago

Hi all, I am quite new to this package and would like to use it for some analysis I am doing. My data is not regular, i.e. it is trading data. Would I be able to use AnomalyDetection to identify, say, irregular prices charged? If so, what would I set the "period" to, given that on some days there might be a trade every second or every hour, and on some days none? I have data for roughly a year.

Any help will be greatly appreciated! Thanks!

guiyang882 commented 6 years ago

Hi jj7353, I'd like to understand the period parameter. Why did you choose period = 1440, and how should this parameter be chosen correctly? @jj7353

thx.

Maryoda2 commented 6 years ago

In the raw_data example, the data is per minute, so the period is 1440: 24 hours * 60 minutes = 1440 observations per day.
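The same arithmetic can be written as a small helper for other sampling rates. This is only an illustrative sketch: `obs_per_period` is a hypothetical name, and choosing one day as the seasonal cycle is an assumption that has to match the actual data.

```r
# Number of observations in one seasonal cycle, given the sampling interval.
# Both arguments are in seconds; the default season is one day.
obs_per_period <- function(sample_interval_secs, season_secs = 24 * 60 * 60) {
  season_secs / sample_interval_secs
}

obs_per_period(60)        # per-minute data, daily cycle: 1440
obs_per_period(1)         # 1 Hz data, daily cycle: 86400
obs_per_period(1, 3600)   # 1 Hz data, hourly cycle: 3600
```

The result is the value passed as `period` to AnomalyDetectionVec, as in the `period=1440` call earlier in the thread.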