elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[Inventory] Alerts not matching to K8s entities [stub] #202355

Open roshan-elastic opened 2 days ago

roshan-elastic commented 2 days ago

Description

Alerts grouped by the k8s entity ID are not showing against the k8s entities in the Inventory (or are showing incorrectly):

Steps to Replicate

  1. Incorrect alerts showing

https://github.com/user-attachments/assets/689858b3-d85b-49b5-9dd7-d4fc98450378

  2. Alerts missing

https://github.com/user-attachments/assets/90bce184-e706-4dad-a804-553694fc7c71

elasticmachine commented 2 days ago

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

roshan-elastic commented 2 days ago

cc @kpatticha - You can test this on edge-logs if you want to quickly replicate

jennypavlova commented 26 minutes ago

Hi @roshan-elastic, I am trying to understand the issue better and tried it on the edge-logs cluster.

First issue:

To summarize: in some cases (like k8s.cluster.ecs) we see fewer alerts in the Inventory, and when we click on the alerts link we see more alerts. The cause is the filter: on the rule page you showed, the filter is orchestrator.cluster.name: "edge-log", but in the filter field we have only "edge-log". So when we get the count we use the field filter, and when we navigate to the alerts we have only the value ("edge-log") set in the search bar:

Inventory -> Alerts page after count click. Is that the expected filter?
Image Image Image

I am asking because I saw that we removed the field on purpose as part of https://github.com/elastic/kibana/pull/202188 (last table entry in the PR description). Do we want it back to fix this, or is there something else I am missing here 🤔?
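
To make the mismatch concrete, here is a minimal sketch of the two query shapes described above. The helper names (`fieldScopedKuery`, `bareValueKuery`, `quoteKqlValue`) are hypothetical and are not taken from the Kibana codebase; the only assumption is that a KQL query without a field name is matched as free text, so it can hit more alerts than the field-scoped form.

```typescript
// Hypothetical helpers for illustration only; not Kibana's actual implementation.
interface EntityFilter {
  field: string;
  value: string;
}

// Escape backslashes and double quotes so the value is safe inside a quoted KQL phrase.
function quoteKqlValue(value: string): string {
  return `"${value.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;
}

// Field-scoped query: only alerts whose `field` equals `value` should match.
function fieldScopedKuery({ field, value }: EntityFilter): string {
  return `${field}: ${quoteKqlValue(value)}`;
}

// Bare-value query: no field name, so the value is searched as free text
// and may match alerts that mention it in any other field.
function bareValueKuery({ value }: EntityFilter): string {
  return quoteKqlValue(value);
}

const clusterFilter: EntityFilter = {
  field: 'orchestrator.cluster.name',
  value: 'edge-log',
};

console.log(fieldScopedKuery(clusterFilter)); // orchestrator.cluster.name: "edge-log"
console.log(bareValueKuery(clusterFilter)); // "edge-log"
```

If the count and the Alerts page link were both built from the same one of these two forms, the numbers would be expected to agree; which form is the intended one is the open question here.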


Second issue:

I checked the data and compared the rules, and the only difference is that this rule uses the metrics data view and not the logs one we have for the cluster rule. I am not very familiar with custom threshold rules, but in theory it should work with different data views (we have the correct mapping 'k8s.deployment.ecs' => 'kubernetes.deployment.name' in Inventory), so I assume something is wrong with the filtering. I tried to execute both queries against the alerts indices we use in Inventory to get the alerts count: the host one, for example, returns results, but the deployment one doesn't:

Image

But on the alerts page I see the alerts:

Image

⚠ Not with the field filter (kubernetes.deployment.name : *) tho:

Image

So maybe this is something to check with the @elastic/response-ops team 🤔. This is the alert rule (logs cluster link); it's also shown in the second video in the description.

The queries from the screenshot:

# no results
GET .alerts-*/_search
{"size":0,"track_total_hits":false,"query":{"bool":{"filter":[{"term":{"kibana.alert.status":"active"}}]}},"aggs":{"k8s.deployment.ecs":{"composite":{"size":500,"sources":[{"kubernetes.deployment.name":{"terms":{"field":"kubernetes.deployment.name"}}}]}}}}

# returns results
GET .alerts-*/_search
{"size":0,"track_total_hits":false,"query":{"bool":{"filter":[{"term":{"kibana.alert.status":"active"}}]}},"aggs":{"host":{"composite":{"size":500,"sources":[{"host.name":{"terms":{"field":"host.name"}}}]}}}}
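
The two requests differ only in the aggregation name and the identity field, which a small builder makes easier to see. This is a sketch under assumptions: `buildActiveAlertCountBody` is a hypothetical name, not the function Inventory actually uses, and it only reproduces the request bodies quoted in this comment.

```typescript
// Hypothetical builder for illustration; not the code Inventory actually runs.
type SearchBody = Record<string, unknown>;

// Builds the body of a `GET .alerts-*/_search` request that groups active
// alerts by one identity field using a composite aggregation.
function buildActiveAlertCountBody(entityType: string, identityField: string): SearchBody {
  return {
    size: 0,
    track_total_hits: false,
    query: {
      bool: {
        // Only alerts that are currently active are counted.
        filter: [{ term: { 'kibana.alert.status': 'active' } }],
      },
    },
    aggs: {
      [entityType]: {
        composite: {
          size: 500,
          // One bucket per distinct value of the identity field.
          sources: [{ [identityField]: { terms: { field: identityField } } }],
        },
      },
    },
  };
}

// The deployment request (no results) and the host request (returns results).
const deploymentBody = buildActiveAlertCountBody('k8s.deployment.ecs', 'kubernetes.deployment.name');
const hostBody = buildActiveAlertCountBody('host', 'host.name');

console.log(JSON.stringify(deploymentBody));
console.log(JSON.stringify(hostBody));
```

Since everything except the field name is identical, the empty result for the deployment query points at how `kubernetes.deployment.name` is stored or mapped on the alert documents rather than at the shape of the request.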