elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[ML] Anomaly detection: Anomaly explorer map showing when job has no reference to location information #125985

Closed alvarezmelissa87 closed 2 years ago

alvarezmelissa87 commented 2 years ago

Describe the bug: The Anomaly Explorer shows a choropleth map of regions when the ML job has no reference to location information: no geographical influencers or anything similar.

Expected behavior: Map should be shown only when job config contains geo info as partition field or influencer.

Screenshots (if relevant):

elasticmachine commented 2 years ago

Pinging @elastic/ml-ui (:ml)

richcollier commented 2 years ago

Example of ML job that is using categorization:

image

No reference to locations, but results in Anomaly Explorer show a map of France:

image

Danouchka commented 2 years ago

Same thing for a high count of records from Cisco & Juniper routers. There are no geolocation fields in my records, and yet I get locations on a map of France. I have attached a set of documents for testing.

Anomaly detection job

{
  "job_id": "cj_high_low_log_count_job",
  "job_type": "anomaly_detector",
  "job_version": "7.17.0",
  "create_time": 1644422414565,
  "model_snapshot_id": "1645128082",
  "groups": [ "cisco_juniper" ],
  "description": "",
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "count partitionfield=\"device.brand\"",
        "function": "count",
        "partition_field_name": "device.brand",
        "detector_index": 0
      }
    ],
    "influencers": [ "device.brand", "device.name" ]
  },
  "analysis_limits": {
    "model_memory_limit": "20mb",
    "categorization_examples_limit": 4
  },
  "data_description": {
    "time_field": "@timestamp",
    "time_format": "epoch_ms"
  },
  ....
  "datafeed_config": {
    "datafeed_id": "datafeed-cj_high_low_log_count_job",
    "job_id": "cj_high_low_log_count_job",
    "query_delay": "82024ms",
    "chunking_config": { "mode": "auto" },
    "indices_options": {
      "expand_wildcards": [ "open" ],
      "ignore_unavailable": false,
      "allow_no_indices": true,
      "ignore_throttled": true
    },
    "query": {
      "bool": {
        "must": [ { "match_all": {} } ],
        "filter": [ { "match_phrase": { "event.dataset": "cisco_juniper_logs" } } ],
        "must_not": []
      }
    },
    "indices": [ "filebeat-*" ],....

Anomaly Explorer Maps

Capture d’écran 2022-02-18 à 00 19 52

cisco_juniper_logs.csv

thomasneirynck commented 2 years ago

This is because of a false-positive match in the MapsPlugin#suggestEMSTermJoin function.

The function receives the sampleValues of mlcategory (which are numbers within the 0-50 range) and matches them to the INSEE codes of the French departments. These are also numbers within the 0-50 range. See https://maps.elastic.co/#file/france_departments
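To make the false positive concrete, here is a minimal sketch of the failure mode. It is not the actual MapsPlugin#suggestEMSTermJoin implementation: the `RegionLayer` type, the `naiveSuggest` function, and the list of codes are all hypothetical and only illustrate how a rule of the form "every sample value is a known region code" will accept small category ids.

```typescript
// Hypothetical, simplified stand-in for an EMS layer: an id plus the set of
// values one of its join fields can take. Not the real EMS metadata format.
interface RegionLayer {
  layerId: string;
  field: string;
  knownValues: Set<string>;
}

const franceDepartments: RegionLayer = {
  layerId: 'france_departments',
  field: 'insee',
  // Illustrative values only: small integers rendered as strings.
  knownValues: new Set(Array.from({ length: 50 }, (_, i) => String(i + 1))),
};

// Naive rule (an assumption for this sketch): suggest the first layer for
// which every sample value is a known code.
function naiveSuggest(sampleValues: string[], layers: RegionLayer[]): RegionLayer | null {
  for (const layer of layers) {
    if (sampleValues.length > 0 && sampleValues.every((v) => layer.knownValues.has(v))) {
      return layer;
    }
  }
  return null;
}

// mlcategory values are category ids, not places, yet they all "match".
const mlcategorySamples = ['1', '2', '7', '13'];
const suggestion = naiveSuggest(mlcategorySamples, [franceDepartments]);

// Values that are clearly not codes are rejected, as expected.
const brandSuggestion = naiveSuggest(['cisco', 'juniper'], [franceDepartments]);
```

In this sketch `suggestion` is the France departments layer even though the input has nothing to do with geography, while `brandSuggestion` is `null`. Any rule that looks only at the values, and not at what the field means, is open to this kind of collision.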

There may be complementary ways to address this:

Thoughts, @nickpeihl @jsanz @nreese (?)

It'd be nice to strike a better balance: get Maps to display ASAP, but also avoid false positives. This matters not just for this issue, but also because MapsPlugin#suggestEMSTermJoin is expected to support choropleth mapping in Lens as well. (FWIW, since the map will be a "suggested chart", there will already be an explicit "confirm" step from the user, similar to (b).)

Danouchka commented 2 years ago

In my opinion, this suggestion feature should be switched off, because errors will always be possible. In another dataset, I had a French "code postal" (postal code) field, but the map pointed to locations in Austin and Dallas.

elasticmachine commented 2 years ago

Pinging @elastic/kibana-gis (Team:Geo)

alvarezmelissa87 commented 2 years ago

Hiya @thomasneirynck, @nreese - I think option (c) might be a good solution going forward. Is this something that can be addressed for 8.3? For now, I can prevent this from showing up in the ML plugin by just not showing the map when it's a categorization job.
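A rough sketch of what that interim guard could look like, assuming a categorization job can be recognized by the presence of `categorization_field_name` in its `analysis_config`. The `AnalysisConfig` type is trimmed down for illustration, and `isCategorizationJob` and `shouldOfferMap` are hypothetical names, not existing ML plugin functions.

```typescript
// Trimmed-down job config types for illustration only.
interface Detector {
  function: string;
  by_field_name?: string;
  over_field_name?: string;
  partition_field_name?: string;
}

interface AnalysisConfig {
  bucket_span: string;
  categorization_field_name?: string;
  detectors: Detector[];
  influencers?: string[];
}

// Assumption: a job counts as a categorization job when it declares a
// categorization field.
function isCategorizationJob(config: AnalysisConfig): boolean {
  return config.categorization_field_name !== undefined;
}

// Hypothetical guard: never offer the map for categorization jobs.
function shouldOfferMap(config: AnalysisConfig): boolean {
  return !isCategorizationJob(config);
}

const categorizationJob: AnalysisConfig = {
  bucket_span: '15m',
  categorization_field_name: 'message',
  detectors: [{ function: 'count', by_field_name: 'mlcategory' }],
};

// Same shape as the cj_high_low_log_count_job config posted above.
const partitionedCountJob: AnalysisConfig = {
  bucket_span: '15m',
  detectors: [{ function: 'count', partition_field_name: 'device.brand' }],
  influencers: ['device.brand', 'device.name'],
};

const hideForCategorization = !shouldOfferMap(categorizationJob);
const stillOfferedForCountJob = shouldOfferMap(partitionedCountJob);
```

A guard like this only covers the categorization case. The count job partitioned by `device.brand` posted earlier has no categorization field, so it would still be offered a map under this check.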

thomasneirynck commented 2 years ago

@alvarezmelissa87 I believe option (c) is already addressed: EMS will now omit metadata for fields whose values are too generic. See https://github.com/elastic/ems-file-service/pull/243

This change should "automatically" show up in 8.2, without any required changes on your end.

alvarezmelissa87 commented 2 years ago

Ah! Thank you! Is this something I can test locally to confirm we don't see the issue anymore? Then I'll be able to close this issue off.

thomasneirynck commented 2 years ago

You can run Kibana on the 8.2, 8.x, or master branch; they should all have the fix.

alvarezmelissa87 commented 2 years ago

With the maps work in, it looks like this is no longer reproducible 🎉