regel / loudml

Loud ML is the first open-source AI solution for ICT and IoT automation

match_all field applies filters to all features instead of just its own #46

Open daradermody opened 5 years ago

daradermody commented 5 years ago

This issue now tracks a bug: using match_all in one feature applies its filters to all features when querying Elasticsearch. The original comment is preserved below.


Is it (or will it be) possible to create and train a model for a single time series, and apply the prediction and anomaly detection to other time series?

For example, if I have created a model that studied the traffic from a single IP address over the course of a day, can I apply that same model to 1000 other IP addresses without having to individually create and train a new model for each (which would be resource heavy)? This use case would apply if I knew the different IP addresses should produce very similar traffic patterns.

I tried to do this with the following features:

  "features": {
    "io": [
      {
        "field": "traffic",
        "measurement": "log",
        "metric": "avg",
        "name": "traffic_pattern_to_analyse",
        "match_all": [
          {
            "tag": "ip",
            "value": "192.168.0.123"
          }
        ]
      }
    ],
    "o": [
      {
        "field": "traffic",
        "measurement": "log",
        "metric": "avg",
        "name": "traffic_pattern_to_predict",
        "match_all": [
          {
            "tag": "id",
            "value": "192.168.0.10"
          }
        ]
      }
    ]
  }

I was thinking that the traffic_pattern_to_analyse feature would be used as the input to the model, and the prediction result (and anomaly detection) would be applied to both as outputs. This didn't work because the Elasticsearch query applies both match_all filters to the entire query rather than to each feature separately, so no results were returned.
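To illustrate why the combined filters return nothing, here is a minimal in-memory sketch. It is not Loud ML code: the `docs` list, the helper names, and the use of the same `ip` tag for both features are assumptions made for illustration only. It contrasts ANDing every feature's match_all conditions together (the behaviour described above) with applying each feature's conditions only to that feature.

```python
# Hypothetical stand-in for documents in the "log" measurement.
docs = [
    {"ip": "192.168.0.123", "traffic": 10.0},
    {"ip": "192.168.0.123", "traffic": 30.0},
    {"ip": "192.168.0.10", "traffic": 5.0},
]

# Simplified features; both are assumed to filter on the same "ip" tag.
features = [
    {"name": "traffic_pattern_to_analyse",
     "match_all": [{"tag": "ip", "value": "192.168.0.123"}]},
    {"name": "traffic_pattern_to_predict",
     "match_all": [{"tag": "ip", "value": "192.168.0.10"}]},
]


def matches(doc, conditions):
    """True if the document satisfies every tag/value condition."""
    return all(doc.get(c["tag"]) == c["value"] for c in conditions)


def avg(values):
    """Average of the values, or None when there are none."""
    values = list(values)
    return sum(values) / len(values) if values else None


def query_global(docs, features):
    """Reported behaviour: all features' filters are ANDed over the whole query."""
    all_conditions = [c for f in features for c in f["match_all"]]
    hits = [d for d in docs if matches(d, all_conditions)]
    return {f["name"]: avg(d["traffic"] for d in hits) for f in features}


def query_scoped(docs, features):
    """Expected behaviour: each feature only sees documents matching its own filters."""
    return {
        f["name"]: avg(d["traffic"] for d in docs if matches(d, f["match_all"]))
        for f in features
    }


global_result = query_global(docs, features)
scoped_result = query_scoped(docs, features)
```

No document can have two different values for `ip` at once, so the global variant yields no data for either feature, while the scoped variant yields one average per feature.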

Will this be possible in the future? Alternatively, could Loud ML offer something like the multi-metric feature in X-Pack, which splits a single time series into multiple time series based on a categorical field?

regel commented 5 years ago

@daradermody

Hi Dara, it's a good catch thanks! 2 things in your comment:

daradermody commented 5 years ago

Hey @regel, thanks for the clarification! #36 seems to be what I'm looking for, so I'll keep an eye on that. Do you want to keep this ticket open to track the match_all bug, or create a separate ticket?

regel commented 5 years ago

The match_all logic in elastic.py will have to be changed to solve this issue, and we need a new unit test in test_elastic.py to cover this scenario.
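One possible direction, sketched here under stated assumptions rather than taken from elastic.py: wrap each feature's metric in its own Elasticsearch `filter` aggregation, so its match_all conditions no longer leak into other features. The function name `build_feature_aggs`, the `value` sub-aggregation key, and the use of `term` queries on the tag fields are all hypothetical; the real field mappings may call for a different query type.

```python
def build_feature_aggs(features):
    """Build one aggregation per feature, scoping match_all to that feature only.

    Hypothetical helper, not the actual Loud ML implementation.
    """
    aggs = {}
    for feature in features:
        # e.g. {"avg": {"field": "traffic"}}
        metric_agg = {feature["metric"]: {"field": feature["field"]}}
        conditions = feature.get("match_all")
        if conditions:
            # Filter aggregation: only documents matching every condition
            # contribute to this feature's metric.
            aggs[feature["name"]] = {
                "filter": {
                    "bool": {
                        "filter": [
                            {"term": {c["tag"]: c["value"]}} for c in conditions
                        ]
                    }
                },
                "aggs": {"value": metric_agg},
            }
        else:
            # No match_all: the metric is computed over all documents.
            aggs[feature["name"]] = metric_agg
    return aggs


# A check in the spirit of the requested unit test: each feature's
# aggregation must carry only its own filter.
example_features = [
    {"name": "a", "field": "traffic", "metric": "avg",
     "match_all": [{"tag": "ip", "value": "192.168.0.123"}]},
    {"name": "b", "field": "traffic", "metric": "avg",
     "match_all": [{"tag": "ip", "value": "192.168.0.10"}]},
    {"name": "c", "field": "traffic", "metric": "avg"},
]
example_aggs = build_feature_aggs(example_features)
```

The resulting dictionary would go under the `aggs` key of the search body, typically nested inside whatever time-bucketing aggregation the query already uses. Note that the filtered and unfiltered cases return differently shaped results in this sketch, so the code reading the response would need to handle both.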

regel commented 5 years ago

@daradermody feel free to submit a pull request