Marcnuth / AnomalyDetection

Twitter's Anomaly Detection in Pure Python
Apache License 2.0
304 stars 76 forks source link

Best way to handle sec and ms level of granularity #12

Closed hokiegeek2 closed 5 years ago

hokiegeek2 commented 5 years ago

I see in the anomaly_detect_ts the period is not set when the granularity is either 'sec' or 'ms':

elif timediff.seconds > 0:
    granularity = 'sec'
    # Aggregate data to minutely if secondly
    data = data.resample('60s', label='right').sum()
else:
    granularity = 'ms'

This causes the error reported in #11 Questions: should data at the sec and ms level (1) be resampled to 'min' level of granularity and set the period=1440 (2) be unsupported with ValueError being thrown to interrupt processing, (3) provide both as a configurable option

I personally vote for (3). @Marcnuth @triciascully what do you think?

triciascully commented 5 years ago

I vote for (3), and thanks for debugging this!

On a separate but related note, I've been manually running the code and setting the period to period = 96 because the data I'm working with is at a 15min interval - can you build in an option for configuration of the period as well, since people might be working with data that's every 10min, 15min, 30min, 2hrs, etc.?

hokiegeek2 commented 5 years ago

Yep, I can do that! Please put in a issue so it can be tracked.

Thanks

—John

Sent from my iPhone

On Sep 20, 2018, at 4:55 PM, triciascully notifications@github.com wrote:

I vote or (3), and thanks for debugging this!

On a separate but related note, I've been manually running the code and setting the period to period = 96 because the data I'm working with is at a 15min interval - can you build in an option for configuration of the period as well, since people might be working with data that's every 10min, 15min, 30min, 2hrs, etc.?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

hokiegeek2 commented 5 years ago

Adding in code to do the following:

  1. enable resampling of both sec and ms to minute
  2. render resampling of both sec and ms to be configurable so that either the resampling to a minute granularity occurs or a ValueError is thrown
triciascully commented 5 years ago

Ah thank you, John! Sorry for the delay, I've been traveling - do you still need this in an issue?

hokiegeek2 commented 5 years ago

No worries! Yes, please do that so it can be tracked. I am currently testing the fix that resamples data with ms and sec granularity. It's taking awhile because I need to generate it from metricbeat.

hokiegeek2 commented 5 years ago

Okay, pushed pull request.

hokiegeek2 commented 5 years ago

@triciascully this is implemented and tested in the master branch of my fork

hokiegeek2 commented 5 years ago

merged, thanks @Marcnuth