Define, create, and include synthetic datasets for different kinds of anomalies.
This is important for regression testing, as simple data can stress (at different difficulty levels) specific properties of HTM. It will also help to define concrete advantages and weak spots of HTM.
[ ] review papers on (synthetic) anomaly datasets/classes of anomalies
[ ] reach out for reviews/ideas:
  [ ] ResearchGate
  [ ] HTM mailing list
  [ ] other forums (which?)
[ ] collect and describe all "theoretical classes" of anomalies (a 1D generator sketch follows this list):
1D:
  [ ] point anomaly
  [ ] amplitude shift
  [ ] phase shift
  [ ] frequency shift
  [ ] noise (change in noise level)
  [ ] combination of the above
  [ ] change of the generating distribution
nD:
  [ ] de/correlated variables (multimodal input); see the nD sketch after the "generate data" list
by input:
  [ ] data with "holes" (missing values)
  [ ] "tricky" data, designed to look similar (overlapping sequences, ...)
  [ ] auto-tuning on "far apart" data, e.g. every 1000th value is A, but value number 121000 is B instead of A
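Below is a minimal sketch, in Python with NumPy, of generators for some of the 1D classes above. All function names and parameters are illustrative placeholders, not an agreed-on API:

```python
import numpy as np

def base_signal(n=4000, freq=0.01, amp=1.0):
    """Clean sine wave used as the 'normal' regime."""
    t = np.arange(n)
    return amp * np.sin(2 * np.pi * freq * t)

def point_anomaly(sig, at, magnitude=5.0):
    """Single-sample spike at index `at`."""
    out = sig.copy()
    out[at] += magnitude
    return out

def amplitude_shift(sig, start, factor=2.0):
    """Amplitude changes from index `start` onward."""
    out = sig.copy()
    out[start:] *= factor
    return out

def frequency_shift(n=4000, start=2000, f1=0.01, f2=0.03):
    """Frequency changes at `start`; the phase is kept continuous at the switch."""
    t = np.arange(n)
    phase = 2 * np.pi * np.where(t < start, f1 * t, f1 * start + f2 * (t - start))
    return np.sin(phase)

def add_noise(sig, sigma=0.1, seed=42):
    """Additive Gaussian noise; raising `sigma` mid-stream gives a 'noise' anomaly."""
    rng = np.random.default_rng(seed)
    return sig + rng.normal(0.0, sigma, size=sig.shape)

# "combination of the above": stack several anomaly types into one stream
sig = add_noise(amplitude_shift(point_anomaly(base_signal(), at=1500), start=3000))
np.savetxt("synthetic_1d.csv", sig, header="value", comments="")
```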
[ ] generate data:
  [ ] synthetic data for each of the classes (1D sketch above, nD sketch below)
  [ ] a well-known published dataset for each class, for comparison with other algorithms
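And a similar hedged sketch for the nD "de/correlated variables" class: two variables start strongly correlated and decorrelate at a change point, so the anomaly exists only in the joint distribution. Again, names and parameters are illustrative:

```python
import numpy as np

def correlated_then_decorrelated(n=4000, change=2000, rho=0.95, seed=0):
    """Rows are (x, y) samples; the correlation disappears at index `change`."""
    rng = np.random.default_rng(seed)
    cov_normal = np.array([[1.0, rho], [rho, 1.0]])
    cov_anomalous = np.eye(2)
    normal = rng.multivariate_normal([0.0, 0.0], cov_normal, size=change)
    anomalous = rng.multivariate_normal([0.0, 0.0], cov_anomalous, size=n - change)
    return np.vstack([normal, anomalous])

data = correlated_then_decorrelated()
# Neither column alone changes its marginal distribution, so a 1D detector
# per column sees nothing; only a model of the joint input can flag this.
```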
Theoretical challenges
[ ] Different kinds of anomalies
We want to detect all of them as anomalies, but we may also want to differentiate among them. An example is the ECG MIT-BIH data, where there are _V_entricular anomalies (easy to detect) and about 4 more types. This somewhat combines anomaly detection with classification of sequences.
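As a concrete illustration (assuming the `wfdb` Python package and PhysioNet access; the class mapping below is my own simplification, not the full MIT-BIH taxonomy):

```python
import wfdb
from collections import Counter

# Read beat annotations for record 100 directly from PhysioNet's mitdb.
ann = wfdb.rdann('100', 'atr', pn_dir='mitdb')

# 'N' marks normal beats, 'V' marks premature ventricular contractions
# (the "easy" ventricular anomaly); all other symbols are lumped together.
labels = ['normal' if s == 'N'
          else 'ventricular' if s == 'V'
          else 'other'
          for s in ann.symbol]

print(Counter(labels))  # per-type counts; ann.sample holds the beat positions
```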
[ ] Scope!
For example, temperature: measured every morning at 7am, I get a relatively stable, slowly changing pattern; measured every hour, I get a stable pattern with significant changes; measured every 7 hours, it looks like random data.
So the question is: how can HTM "decide" the optimal aggregation, the scale to focus on? Another example: GPS position reported every second, how do you choose the scale?
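A small sketch of the effect, assuming pandas; the temperature model below is synthetic and only meant to show how the choice of sampling scale reshapes the same underlying process:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=24 * 365, freq="h")
hours = np.arange(len(idx))
# Daily cycle + slow seasonal drift + noise as a stand-in for temperature.
temp = (10 * np.sin(2 * np.pi * hours / 24)
        + 8 * np.sin(2 * np.pi * hours / (24 * 365))
        + rng.normal(0, 1, len(idx)))
series = pd.Series(temp, index=idx)

daily_7am = series[series.index.hour == 7]  # slow, stable pattern
hourly = series                             # stable pattern, large swings
every_7h = series.iloc[::7]                 # aliases the 24h cycle; looks random
```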
[ ] Model auto-adaptation
Should all of these be part of one HTM/anomaly model? Or run as an ensemble of specific models?
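One possible shape for the ensemble option, as a design sketch only (the `compute` interface is assumed here, not an existing HTM API):

```python
class AnomalyEnsemble:
    """Runs several specialist detectors side by side on the same stream."""

    def __init__(self, detectors):
        self.detectors = detectors  # e.g. one HTM model per class or scale

    def compute(self, value):
        scores = [d.compute(value) for d in self.detectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        # Max-score fusion: anomalous if any specialist fires; the index of
        # the firing specialist also hints at *which kind* of anomaly it is.
        return scores[best], best
```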
[ ] Anomaly prediction
Yes, it's an oxymoron, but everybody wants it! :icecream: I think this is a core problem; my ideas include running a combination of HTM models at different scales ...
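A rough sketch of that multi-scale idea; everything here is hypothetical glue code around the same assumed `compute` interface. A rising score at a coarse scale can act as an early warning before the fine scale fires, which is as close to "prediction" as this gets:

```python
import numpy as np

class ScaledModel:
    """Feeds a detector only every `scale`-th value, aggregated by mean."""

    def __init__(self, detector, scale):
        self.detector, self.scale = detector, scale
        self.buffer, self.score = [], 0.0

    def feed(self, value):
        self.buffer.append(value)
        if len(self.buffer) == self.scale:
            self.score = self.detector.compute(np.mean(self.buffer))
            self.buffer.clear()
        return self.score  # holds the last score between coarse updates

def combined_score(models, value, weights=None):
    """Weighted fusion of anomaly scores across scales."""
    scores = [m.feed(value) for m in models]
    weights = weights or [1.0 / len(models)] * len(models)
    return sum(w * s for w, s in zip(weights, scores))
```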
Note: not accepted for NAB, so moving it from there: https://github.com/numenta/NAB/issues/217