Closed vitaly-krugl closed 8 years ago
:+1:
One update to the note above: before switching NAB over, we should make sure
nupic.frameworks.opf.common_models.cluster_params.getScalarMetricWithTimeOfDayAnomalyParams()
contains the latest parameters from NAB! Is there a NuPIC issue for that?
:+1: hope to see an example in nupic! I got a bit lost: how do I feed the metricData
if I don't have any? I want the best set of params for a generic (black-box) use case. I was under the assumption
that hotgym is sort of that.
@subutai - regarding
One update to the note above: before switching NAB over, we should make sure nupic.frameworks.opf.common_models.cluster_params.getScalarMetricWithTimeOfDayAnomalyParams() contains the latest parameters from NAB! Is there a NuPIC issue for that?
The intention behind this issue is to do both things:
Since these are tightly-coupled changes, do we need a separate issue in NuPIC for it as well?
@breznak, regarding
hope to see an example in nupic! I got a bit lost how do I feed the metricData - if I don't have any? I want the best set of params for a generic (black box) usecase. I was under assumption hotgym is sort of that.
There is an example of getScalarMetricWithTimeOfDayAnomalyParams()
use here: https://github.com/numenta/numenta-apps/blob/9a02bf984272721f94b1255b81146841bc890433/htmengine/htmengine/runtime/scalar_metric_utils.py#L80-L84
do we need a separate issue in NuPIC for it as well?
@vitaly-krugl Not sure - up to @rhyolight . As long as both issues get taken care of in the right order, I don't really care!
@vitaly-krugl thank you! I'm still trying to understand the concept - is this useful for me if I have black-box data (any generic use case), don't do swarming (actually, does this code just run some small swarming to get the "best" params?), and don't use time in my model? What params does this set - encoder, anomaly, all of the HTM model params?
@breznak, getScalarMetricWithTimeOfDayAnomalyParams
is using this file as the template: https://github.com/numenta/nupic/blob/master/src/nupic/frameworks/opf/common_models/anomaly_params_random_encoder/best_single_metric_anomaly_params.json
and augments it with values based on the args of the function.
The static template was initially derived from swarming over IT data, and we found that it worked well for anomaly detection in time series data for HTM for IT (formerly Grok for IT) as well as HTM for stocks ("Taurus").
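To make the template-plus-augmentation idea concrete, here is a minimal, self-contained sketch in plain Python. The toy `TEMPLATE` dict, the `make_anomaly_params` name, and the exact resolution rule (`max(minResolution, range / numBuckets)` with 130 buckets) are illustrative assumptions, not nupic's verbatim implementation; the real template is the JSON file linked above.

```python
import copy

# Illustrative sketch only: a toy stand-in for
# best_single_metric_anomaly_params.json. The structure and the augmentation
# rule below are assumptions for illustration, not nupic's exact code.
TEMPLATE = {
    "modelConfig": {
        "modelParams": {
            "sensorParams": {
                "encoders": {
                    "value": {"type": "RandomDistributedScalarEncoder",
                              "resolution": None},
                }
            }
        }
    }
}


def make_anomaly_params(min_val, max_val, num_buckets=130, min_resolution=0.001):
    """Return a copy of the template with the encoder resolution filled in
    from the caller's value range (hypothetical helper)."""
    params = copy.deepcopy(TEMPLATE)
    resolution = max(min_resolution, (max_val - min_val) / float(num_buckets))
    encoder = params["modelConfig"]["modelParams"]["sensorParams"]["encoders"]["value"]
    encoder["resolution"] = resolution
    return params
```

For example, `make_anomaly_params(0.0, 130.0)` yields an encoder resolution of 1.0, while a degenerate range (min == max) falls back to the minimum resolution.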
Thank you @vitaly-krugl !
Just one last checklist :wink: before we freeze these values for https://github.com/breznak/neural.benchmark/issues/14 (CC @wattik please TODO)
@wattik Probably we'll settle with these values; at least we'll have a comparison with NAB for real data. Please fix this before running the tests!
Hi @breznak, answering the questions I can:
is it still better to use plain Scalar where possible?
Yes, if you know the min/max and they don't change.
do the epsilon differences (0.10000000000000001) have some smart meaning?
No, just rounding effects.
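For context, that value is just how the double nearest to 0.1 prints at full precision; a one-liner shows it:

```python
# 0.1 has no exact binary floating-point representation; asking for 17
# significant digits prints the nearest representable double.
print(format(0.1, ".17g"))  # -> 0.10000000000000001
```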
Thanks @chetan51 :+1:
Perhaps @subutai can answer the rest?
looks like these values have better foundation than the "hotgym" (NuPIC will switch anyway)
Yes, I believe so. Everyone should start with these parameters unless they have a good reason.
this should be quite good settings even if we use different encoder(s) (diff input layer), right?
Yes, I think these SP and TM parameters are good starting points even if you have a different set of initial encoders.
have you included the "tm" implementation in your testing? And still chose (cpp=)TP?
No we haven't. That would be an interesting thing to test!
@rhyolight is there a nupic issue (and PR) for this?
@BoltzmannBrain I don't believe so, no.
The task here is to change NuPIC's best anomaly params to match the current NAB params (which are believed to be generally the best), and then have the NAB algorithm use the NuPIC json directly rather than keeping a copy in the NAB repo. @subutai does this sound correct?
We still would not require NuPIC in order to install NAB - only if you want to run the HTM detector.
Yes, that sounds correct (in that order too). @vitaly-krugl mentioned this too earlier in this thread. I guess someone just needs to create the NuPIC issue for this? :smile:
Thanks, will do.
I've updated the nupic params with NAB's, and set up the HTM detector in NAB to pull them in from nupic. I'm finding differences in the resolutions calculated from numBuckets, resulting in (small) score changes. For example, here is the resolution for "realTraffic/TravelTime_387.csv" under a few setups:

- the HTM detector using the NAB params and nupic.frameworks.opf.common_models.cluster_params.getScalarMetricWithTimeOfDayAnomalyParams(), res = 38.8461538462
- passing metricData, and not the min and max, to the get method (so the padding is the std dev), res = 44.9919262275
- metricData=[0], res = 38.8461538462

The logic in 4 makes the most sense. @subutai a nice result is that it increases the NAB scores :smile: I'll follow up with PRs soon.
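To illustrate why such setups can disagree, here is a hedged sketch of two ways a resolution might be derived. The bucket count of 130 and the exact padding rule are assumptions for illustration; this reproduces the kind of divergence described above, not the exact nupic code path.

```python
import statistics

NUM_BUCKETS = 130  # assumption: illustrative bucket count, not read from nupic


def resolution_from_range(min_val, max_val, min_resolution=0.001):
    # Resolution from an explicitly supplied min/max range.
    return max(min_resolution, (max_val - min_val) / float(NUM_BUCKETS))


def resolution_from_data(metric_data, min_resolution=0.001):
    # Resolution from observed samples, padding the observed range by one
    # standard deviation on each side (assumed padding rule).
    lo, hi = min(metric_data), max(metric_data)
    padding = statistics.pstdev(metric_data)
    return max(min_resolution, ((hi + padding) - (lo - padding)) / float(NUM_BUCKETS))
```

In this sketch, `resolution_from_data([0])` collapses to `min_resolution`, while an explicit range keeps the range-based value - the same kind of difference that produces small score changes between setups.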
NAB PR #206 recently made improvements to its own copy of the model parameters.
Htmengine is relying on nupic.frameworks.opf.common_models.cluster_params.getScalarMetricWithTimeOfDayAnomalyParams() for its model parameters.
NAB should be using and updating nupic.frameworks.opf.common_models.cluster_params.getScalarMetricWithTimeOfDayAnomalyParams instead of maintaining its own copy of the parameters. Similar idea to "Switch over to using the Anomaly Likelihood class in NuPIC" (https://github.com/numenta/NAB/pull/184).
Per email from Subutai: