elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

[TSDB] Counter metrics not being found in any visualization #152467

Closed constanca-m closed 1 year ago

constanca-m commented 1 year ago

Elastic package version: 8.7.0-SNAPSHOT (latest). Also tried 8.8.0-SNAPSHOT.

Problem: All the counter metric fields are unavailable to use in visualizations.

Example: In this situation, I am using the field kubernetes.controllermanager.process.cpu.sec, which holds data, as can be seen here in Discover:

image

However, if I try to edit the visualization and filter for that metric, no fields are available:

image

If I force its use by specifying the field in the formula, I do get some results:

image

However, the error message persists.

This is going on with all counter metrics fields when using TSDB.

Index template being used:

```json
{
  "template": {
    "settings": {
      "index": {
        "lifecycle": { "name": "metrics" },
        "mode": "time_series",
        "codec": "best_compression",
        "routing": { "allocation": { "include": { "_tier_preference": "data_hot" } } },
        "mapping": { "total_fields": { "limit": "10000" } },
        "time_series": {
          "end_time": "2023-03-01T17:35:08.000Z",
          "start_time": "2023-03-01T13:35:08.000Z"
        },
        "final_pipeline": ".fleet_final_pipeline-1",
        "query": {
          "default_field": [
            "cloud.account.id",
            "cloud.availability_zone",
            "cloud.instance.id",
            "cloud.instance.name",
            "cloud.machine.type",
            "cloud.provider",
            "cloud.region",
            "cloud.project.id",
            "cloud.image.id",
            "container.id",
            "container.image.name",
            "container.name",
            "host.architecture",
            "host.hostname",
            "host.id",
            "host.mac",
            "host.name",
            "host.os.family",
            "host.os.kernel",
            "host.os.name",
            "host.os.platform",
            "host.os.version",
            "host.os.build",
            "host.os.codename",
            "host.type",
            "kubernetes.pod.name",
            "kubernetes.pod.uid",
            "kubernetes.namespace",
            "kubernetes.node.name",
            "kubernetes.node.hostname",
            "kubernetes.replicaset.name",
            "kubernetes.deployment.name",
            "kubernetes.statefulset.name",
            "kubernetes.container.name",
            "kubernetes.container.image",
            "kubernetes.controllermanager.verb",
            "kubernetes.controllermanager.code",
            "kubernetes.controllermanager.method",
            "kubernetes.controllermanager.host",
            "kubernetes.controllermanager.name",
            "kubernetes.controllermanager.name_fingerprint",
            "kubernetes.controllermanager.zone",
            "kubernetes.controllermanager.kubernetes.controllermanager.zone_fingerprint",
            "ecs.version",
            "service.address",
            "service.type",
            "orchestrator.cluster.name",
            "orchestrator.cluster.url"
          ]
        },
        "default_pipeline": "metrics-kubernetes.controllermanager-2.1.24",
        "routing_path": [
          "kubernetes.controllermanager.name_fingerprint",
          "kubernetes.controllermanager.verb",
          "kubernetes.controllermanager.method",
          "kubernetes.controllermanager.code",
          "service.address",
          "orchestrator.cluster.url",
          "kubernetes.controllermanager.kubernetes.controllermanager.zone_fingerprint",
          "kubernetes.controllermanager.host"
        ]
      }
    },
    "mappings": {
      "_meta": { "managed_by": "fleet", "managed": true, "package": { "name": "kubernetes" } },
      "dynamic_templates": [
        { "container.labels": { "path_match": "container.labels.*", "match_mapping_type": "string", "mapping": { "type": "keyword" } } },
        { "kubernetes.labels.*": { "path_match": "kubernetes.labels.*", "mapping": { "type": "keyword" } } },
        { "kubernetes.annotations.*": { "path_match": "kubernetes.annotations.*", "mapping": { "type": "keyword" } } },
        { "kubernetes.selectors.*": { "path_match": "kubernetes.selectors.*", "mapping": { "type": "keyword" } } },
        { "kubernetes.controllermanager.client.request.duration.us.bucket.*": { "path_match": "kubernetes.controllermanager.client.request.duration.us.bucket.*", "match_mapping_type": "long", "mapping": { "type": "long" } } },
        { "kubernetes.controllermanager.client.request.size.bytes.bucket.*": { "path_match": "kubernetes.controllermanager.client.request.size.bytes.bucket.*", "match_mapping_type": "long", "mapping": { "type": "long" } } },
        { "kubernetes.controllermanager.client.response.size.bytes.bucket.*": { "path_match": "kubernetes.controllermanager.client.response.size.bytes.bucket.*", "match_mapping_type": "long", "mapping": { "type": "long" } } },
        { "strings_as_keyword": { "match_mapping_type": "string", "mapping": { "ignore_above": 1024, "type": "keyword" } } }
      ],
      "date_detection": false,
      "properties": {
        "@timestamp": { "type": "date" },
        "cloud": {
          "properties": {
            "account": { "properties": { "id": { "type": "keyword", "ignore_above": 1024 } } },
            "availability_zone": { "type": "keyword", "ignore_above": 1024 },
            "image": { "properties": { "id": { "type": "keyword", "ignore_above": 1024 } } },
            "instance": {
              "properties": {
                "id": { "type": "keyword", "ignore_above": 1024 },
                "name": { "type": "keyword", "ignore_above": 1024 }
              }
            },
            "machine": { "properties": { "type": { "type": "keyword", "ignore_above": 1024 } } },
            "project": { "properties": { "id": { "type": "keyword", "ignore_above": 1024 } } },
            "provider": { "type": "keyword", "ignore_above": 1024 },
            "region": { "type": "keyword", "ignore_above": 1024 }
          }
        },
        "container": {
          "properties": {
            "id": { "type": "keyword", "ignore_above": 1024 },
            "image": { "properties": { "name": { "type": "keyword", "ignore_above": 1024 } } },
            "name": { "type": "keyword", "ignore_above": 1024 }
          }
        },
        "data_stream": {
          "properties": {
            "dataset": { "type": "constant_keyword" },
            "namespace": { "type": "constant_keyword" },
            "type": { "type": "constant_keyword" }
          }
        },
        "ecs": { "properties": { "version": { "type": "keyword", "ignore_above": 1024 } } },
        "event": {
          "properties": {
            "agent_id_status": { "type": "keyword", "ignore_above": 1024 },
            "ingested": { "type": "date", "format": "strict_date_time_no_millis||strict_date_optional_time||epoch_millis" }
          }
        },
        "host": {
          "properties": {
            "architecture": { "type": "keyword", "ignore_above": 1024 },
            "containerized": { "type": "boolean" },
            "domain": { "type": "keyword", "ignore_above": 1024 },
            "hostname": { "type": "keyword", "ignore_above": 1024 },
            "id": { "type": "keyword", "ignore_above": 1024 },
            "ip": { "type": "ip" },
            "mac": { "type": "keyword", "ignore_above": 1024 },
            "name": { "type": "keyword", "ignore_above": 1024 },
            "os": {
              "properties": {
                "build": { "type": "keyword", "ignore_above": 1024 },
                "codename": { "type": "keyword", "ignore_above": 1024 },
                "family": { "type": "keyword", "ignore_above": 1024 },
                "kernel": { "type": "keyword", "ignore_above": 1024 },
                "name": { "type": "keyword", "ignore_above": 1024, "fields": { "text": { "type": "text" } } },
                "platform": { "type": "keyword", "ignore_above": 1024 },
                "version": { "type": "keyword", "ignore_above": 1024 }
              }
            },
            "type": { "type": "keyword", "ignore_above": 1024 }
          }
        },
        "kubernetes": {
          "properties": {
            "container": {
              "properties": {
                "image": { "type": "keyword", "ignore_above": 1024 },
                "name": { "type": "keyword", "ignore_above": 1024 }
              }
            },
            "controllermanager": {
              "properties": {
                "client": {
                  "properties": {
                    "request": {
                      "properties": {
                        "count": { "type": "long", "time_series_metric": "counter" },
                        "duration": {
                          "properties": {
                            "us": {
                              "properties": {
                                "count": { "type": "long", "time_series_metric": "counter" },
                                "sum": { "type": "long", "meta": { "unit": "micros" }, "time_series_metric": "counter" }
                              }
                            }
                          }
                        },
                        "size": {
                          "properties": {
                            "bytes": {
                              "properties": {
                                "count": { "type": "long", "time_series_metric": "counter" },
                                "sum": { "type": "long", "meta": { "unit": "byte" }, "time_series_metric": "counter" }
                              }
                            }
                          }
                        }
                      }
                    },
                    "response": {
                      "properties": {
                        "size": {
                          "properties": {
                            "bytes": {
                              "properties": {
                                "count": { "type": "long", "time_series_metric": "counter" },
                                "sum": { "type": "long", "meta": { "unit": "byte" }, "time_series_metric": "counter" }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                },
                "code": { "type": "keyword", "time_series_dimension": true },
                "host": { "type": "keyword", "time_series_dimension": true },
                "kubernetes": {
                  "properties": {
                    "controllermanager": {
                      "properties": {
                        "zone_fingerprint": { "type": "keyword", "time_series_dimension": true }
                      }
                    }
                  }
                },
                "leader": { "properties": { "is_master": { "type": "boolean" } } },
                "method": { "type": "keyword", "time_series_dimension": true },
                "name": { "type": "keyword", "ignore_above": 1024 },
                "name_fingerprint": { "type": "keyword", "time_series_dimension": true },
                "node": {
                  "properties": {
                    "collector": {
                      "properties": {
                        "count": { "type": "long", "time_series_metric": "gauge" },
                        "eviction": { "properties": { "count": { "type": "long", "time_series_metric": "counter" } } },
                        "health": { "properties": { "pct": { "type": "long", "time_series_metric": "gauge" } } },
                        "unhealthy": { "properties": { "count": { "type": "long", "time_series_metric": "gauge" } } }
                      }
                    }
                  }
                },
                "process": {
                  "properties": {
                    "cpu": { "properties": { "sec": { "type": "double", "time_series_metric": "counter" } } },
                    "fds": {
                      "properties": {
                        "max": { "properties": { "count": { "type": "long", "time_series_metric": "gauge" } } },
                        "open": { "properties": { "count": { "type": "long", "time_series_metric": "gauge" } } }
                      }
                    },
                    "memory": {
                      "properties": {
                        "resident": { "properties": { "bytes": { "type": "long", "meta": { "unit": "byte" }, "time_series_metric": "gauge" } } },
                        "virtual": { "properties": { "bytes": { "type": "long", "meta": { "unit": "byte" }, "time_series_metric": "gauge" } } }
                      }
                    },
                    "started": { "properties": { "sec": { "type": "double", "time_series_metric": "gauge" } } }
                  }
                },
                "verb": { "type": "keyword", "time_series_dimension": true },
                "workqueue": {
                  "properties": {
                    "adds": { "properties": { "count": { "type": "long", "time_series_metric": "counter" } } },
                    "depth": { "properties": { "count": { "type": "long", "time_series_metric": "gauge" } } },
                    "longestrunning": { "properties": { "sec": { "type": "double", "time_series_metric": "gauge" } } },
                    "retries": { "properties": { "count": { "type": "long", "time_series_metric": "counter" } } },
                    "unfinished": { "properties": { "sec": { "type": "double", "time_series_metric": "gauge" } } }
                  }
                },
                "zone": { "type": "keyword", "ignore_above": 1024 }
              }
            },
            "deployment": { "properties": { "name": { "type": "keyword", "ignore_above": 1024 } } },
            "namespace": { "type": "keyword", "ignore_above": 1024 },
            "node": {
              "properties": {
                "hostname": { "type": "keyword", "ignore_above": 1024 },
                "name": { "type": "keyword", "ignore_above": 1024 }
              }
            },
            "pod": {
              "properties": {
                "ip": { "type": "ip" },
                "name": { "type": "keyword", "ignore_above": 1024 },
                "uid": { "type": "keyword", "ignore_above": 1024 }
              }
            },
            "replicaset": { "properties": { "name": { "type": "keyword", "ignore_above": 1024 } } },
            "statefulset": { "properties": { "name": { "type": "keyword", "ignore_above": 1024 } } }
          }
        },
        "orchestrator": {
          "properties": {
            "cluster": {
              "properties": {
                "name": { "type": "keyword", "ignore_above": 1024 },
                "url": { "type": "keyword", "time_series_dimension": true }
              }
            }
          }
        },
        "service": {
          "properties": {
            "address": { "type": "keyword", "time_series_dimension": true },
            "type": { "type": "keyword", "ignore_above": 1024 }
          }
        }
      }
    },
    "aliases": {}
  }
}
```

elasticmachine commented 1 year ago

Pinging @elastic/kibana-visualizations @elastic/kibana-visualizations-external (Team:Visualizations)

dej611 commented 1 year ago

Maybe related to #150954. I think that has been reverted.

stratoula commented 1 year ago

Yes we decided to hide them for 8.7 cc @thomasneirynck

They will be re-enabled on 8.8

thomasneirynck commented 1 year ago

> They will be re-enabled on 8.8

Elasticsearch is working on a new approach, one in which field-caps will start publishing which aggregations are supported for a given field https://github.com/elastic/elasticsearch/issues/93539#issuecomment-1438017792

This will require corresponding work on the Kibana-side to take advantage of. TBD
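
Until field-caps reports supported aggregations, a client can at least tell counters from gauges. The following is a minimal sketch, assuming the field capabilities response carries a `time_series_metric` entry per field type; the helper name `fields_by_metric` and the sample response are hypothetical and hand-written for illustration.

```python
from typing import Any, Dict, List


def fields_by_metric(field_caps: Dict[str, Any], metric: str) -> List[str]:
    """List field names that report `metric` as time_series_metric for any type."""
    names = []
    for name, by_type in field_caps.get("fields", {}).items():
        # Each field maps type names (e.g. "long") to a capabilities object.
        if any(caps.get("time_series_metric") == metric for caps in by_type.values()):
            names.append(name)
    return sorted(names)


# Hand-written fragment for illustration, not captured from a real cluster.
sample_response = {
    "fields": {
        "kubernetes.controllermanager.process.cpu.sec": {
            "double": {"type": "double", "aggregatable": True, "time_series_metric": "counter"}
        },
        "kubernetes.controllermanager.process.fds.open.count": {
            "long": {"type": "long", "aggregatable": True, "time_series_metric": "gauge"}
        },
    }
}

print(fields_by_metric(sample_response, "counter"))
```

This only classifies fields by metric type; it does not say which aggregations a counter accepts, which is what the proposed field-caps change is meant to provide.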

lalit-satapathy commented 1 year ago

Summarising the behaviour of counter metric types in Lens visualisations when TSDB is enabled; this seems like a blocker.

The behaviour can be reproduced using the steps below:


Counter fields don’t show in Visualisation (Only gauge fields do)

Screenshot 2023-04-06 at 10 51 31 AM

Sometimes even the Visualise link is missing.

Screenshot 2023-04-06 at 10 51 11 AM

We need confirmation of this issue from the Lens team, and an ETA for a fix.

This seems to be a duplicate of issue

@agithomas @ruflin

ruflin commented 1 year ago

Trying to get my head around this. Does this mean that even if we as integration developers know which counter fields we can and should use in aggregations, Kibana blocks us from using them? If yes, it basically means we can't use counters for visualisation in the current version?

ruflin commented 1 year ago

@agithomas @lalit-satapathy If we remove the counter property from the field, I assume everything works as expected?

agithomas commented 1 year ago

> @agithomas @lalit-satapathy If we remove the counter property from the field, I assume everything works as expected?

The problems observed so far are limited to counter-type fields. The alternative is to either remove the mapping or change it to the gauge type.
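
To make the workaround concrete, here is a minimal sketch that walks a mapping's `properties`, lists the fields marked as `counter`, and produces a copy in which they are marked as `gauge`. The helper names are hypothetical, and the sample is a small excerpt of the index template posted above.

```python
from copy import deepcopy
from typing import Any, Dict, List


def find_metric_fields(properties: Dict[str, Any], metric: str, prefix: str = "") -> List[str]:
    """Return dotted paths of fields whose time_series_metric equals `metric`."""
    found: List[str] = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("time_series_metric") == metric:
            found.append(path)
        # Object fields nest their children under "properties".
        if "properties" in spec:
            found.extend(find_metric_fields(spec["properties"], metric, f"{path}."))
    return found


def counters_to_gauges(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the mapping with every counter remapped as a gauge."""
    result = deepcopy(properties)

    def rewrite(props: Dict[str, Any]) -> None:
        for spec in props.values():
            if spec.get("time_series_metric") == "counter":
                spec["time_series_metric"] = "gauge"
            if "properties" in spec:
                rewrite(spec["properties"])

    rewrite(result)
    return result


# Excerpt of the mapping from the index template in this issue.
sample = {
    "process": {
        "properties": {
            "cpu": {"properties": {"sec": {"type": "double", "time_series_metric": "counter"}}},
            "started": {"properties": {"sec": {"type": "double", "time_series_metric": "gauge"}}},
        }
    }
}

print(find_metric_fields(sample, "counter"))
print(find_metric_fields(counters_to_gauges(sample), "counter"))
```

The sketch only rewrites the dictionary in memory; how the changed mapping is rolled out to an integration is outside its scope.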

lalit-satapathy commented 1 year ago

> If yes, it basically means we can't use counters in the current version for visualisation?

In the 8.7.0 release, yes. The code base keeps changing in snapshot versions (8.7/8.8), but we saw the issue in snapshots as well. TSDB has to be enabled, following the steps given here.

ppisljar commented 1 year ago

In 8.8 snapshots counter fields should show up correctly. Can you confirm whether this issue is still reproducible on the latest 8.8 snapshot?

stratoula commented 1 year ago

Locally you can install the sample data logs (TSDB) and they are present in Lens

image
lalit-satapathy commented 1 year ago

Will check again on 8.8 snapshot. Assuming this issue is confirmed for 8.7.0? Any possibility to backport a fix to 8.7.*?

stratoula commented 1 year ago

Yes, as I mentioned above, it was a business decision to hide them in 8.7: https://github.com/elastic/kibana/issues/152467#issuecomment-1450441832

I don't think we want to backport at this point. I am pretty sure it depends on ES too, and there are plenty of PRs that enable them in 8.8, so it doesn't sound feasible. cc @thomasneirynck

martijnvg commented 1 year ago

> I am pretty sure it depends on ES too

We merged two changes that are also in 8.7.0: one expands the support for counter fields to more aggregations than just the rate aggregation, and the other allows all aggregations on counter fields for indices that are not defined as time series indices.

There is one open PR that enhances field caps to include which aggregations are supported on which field. It hasn't been merged and is under discussion, but it is viewed as the long-term solution.

So afaik there shouldn't be an Elasticsearch reason why these changes can't be backported to the 8.7 branch. But maybe I'm wrong here, so let us know what we can do to help. From what I understand, not being able to use counter fields at all (versus not auto-completing counter fields, which I thought was the 8.7 workaround) is blocking integrations from even considering TSDB on version 8.7.
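
The rules stated in this thread can be written down as a small lookup. This is only a sketch of what the comments here report, not the Elasticsearch implementation: the function name is hypothetical, and the two sets contain only aggregations explicitly mentioned in this issue (rate, min and max as working; value_count and terms as rejected). For anything else it returns `None`, meaning the thread does not say.

```python
from typing import Optional

# Aggregations this thread reports as working on counters in a time series index.
REPORTED_OK = {"rate", "min", "max"}
# Aggregations this thread reports as rejected for counters in a time series index.
REPORTED_REJECTED = {"value_count", "terms"}


def counter_agg_supported(agg: str, index_mode: Optional[str]) -> Optional[bool]:
    """Return True/False when the thread states the outcome, None when it does not."""
    if index_mode != "time_series":
        # Per the thread: all aggregations are allowed on counter fields
        # for indices that are not defined as time series indices.
        return True
    if agg in REPORTED_OK:
        return True
    if agg in REPORTED_REJECTED:
        return False
    return None


print(counter_agg_supported("terms", "time_series"))
print(counter_agg_supported("terms", None))
```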

stratoula commented 1 year ago

@ppisljar from the kibana side it means that we need to backport:

This is a business decision cc @timductive @thomasneirynck

timductive commented 1 year ago

As @stratoula mentioned, this was an intentional decision for 8.7. The TSDB rate aggregation was released as technical preview for Elasticsearch, and Kibana support for this aggregation has always been planned for 8.8. Because we found the additional issues that stratoula mentioned post-FF, we had to remove these fields temporarily.

At this point the risk assessment for backporting these PRs to the patch release is high, not only because of the known issues but also because the 8.7 environment has not been tested thoroughly for supporting this new aggregation. I would recommend not using TSDB time series mode for this rate aggregation in 8.7.

If there is an additional regression or urgency to raise this to the level of blocker, please let us know, but this currently doesn't fit the definition of blocker to me.

lalit-satapathy commented 1 year ago

> Will check again on 8.8 snapshot. Assuming this issue is confirmed for 8.7.0? Any possibility to backport a fix to 8.7.*?

I am able to see all the fields (including counters) in an 8.8.0-SNAPSHOT run. @constanca-m @ritalwar Can you confirm the behaviour in your runs?

Screenshot 2023-04-11 at 10 58 45 AM
lalit-satapathy commented 1 year ago

> Will check again on 8.8 snapshot. Assuming this issue is confirmed for 8.7.0? Any possibility to backport a fix to 8.7.*?

Hi Kibana Team,

We are testing the 8.8.0 snapshot behaviour with respect to counter fields. One issue seen in Discover: unlike gauge fields, counter fields are shown with "Analysis is not available for this field." However, going through the Visualize link does work for counter fields. Is this the expected behaviour for counter fields in Discover?

Gauge fields in discover:

Screenshot 2023-04-19 at 4 55 23 PM

Counter fields in discover:

Screenshot 2023-04-19 at 4 55 31 PM
stratoula commented 1 year ago

@jughosta maybe you know?

jughosta commented 1 year ago

@lalit-satapathy @stratoula

Currently the popover makes requests with the following params to show the analysis view for number fields (as for nginx.stubstatus.active in the screenshot above, with Top Values and Distribution tabs):

Screenshot 2023-04-19 at 14 47 44

Such a request would fail for a counter field with the messages: Error: Field [bytes_counter] of type [long] is not supported for aggregation [value_count] and Field [bytes_counter] of type [long] is not supported for aggregation [terms].

Depending on what analysis view is expected in the popover for counter fields, we could work on making adjustments. For now it just says "Analysis is not available for this field.".

How would you suggest handling this case in the popover?

stratoula commented 1 year ago

Oh yeah, counter fields do not support all aggs, this is why. I think it is fine for now tbh

lalit-satapathy commented 1 year ago

When TSDB is not enabled, the field was shown as below in Discover:

Screenshot 2023-04-19 at 7 17 07 PM

@ruflin Is the behaviour shown above for counters with TSDB enabled an acceptable user experience?

ruflin commented 1 year ago

My focus for now is that we can use the field in our visualisations. @lalit-satapathy My understanding is, we can use the field the way we need it but the UX might not be ideal?

Overall, I think we can do much better on the error message. We had to look at the exact query and the errors it produces to figure out why it behaves the way it does. Users will ask the same questions but will not be able to get the answer from the "message" today. I would like to pull in the TSDB team here, @martijnvg, to see what the behaviour / messaging should be for counters.

lalit-satapathy commented 1 year ago

> My focus for now is that we can use the field in our visualisations. @lalit-satapathy My understanding is, we can use the field the way we need it but the UX might not be ideal?
>
> Overall, I think we can do much better on the error message. We had to look at the exact query and the errors it produces to figure out why it behaves the way it does. Users will ask the same questions but will not be able to get the answer from the "message" today. I would like to pull in the TSDB team here, @martijnvg, to see what the behaviour / messaging should be for counters.

I agree; I wanted to avoid the confusion users will go through on seeing that message.

timductive commented 1 year ago

Does that previous visualization showing Top Values (388 at 100%) even make sense to begin with? It sounds like this is fine for now, and eventually we will need to come up with a visualization that is more useful for counter fields.

ppisljar commented 1 year ago

@lalit-satapathy I think this is the expected experience for now. To perform analysis on the field in Discover, we need to be able to run certain aggregations that time series counter fields do not support. You can still visualize this field, however. I guess in the future we could update this with either a better message or a different kind of analysis for this field.

lalit-satapathy commented 1 year ago

@timductive, @ppisljar,

Thanks for the comments; we can stay with the current behaviour for now. Going forward, we expect the error message to be somewhat different and specific to counter fields.

martijnvg commented 1 year ago

Could a range be displayed in the case of counter fields? Looking at the query, min and max aggregations are used, and these aggregations do work on counter fields. Could the result of these aggs be used to display a min / max range instead?
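
As a rough illustration of that idea, the sketch below builds a search body that requests only `min` and `max` for a field and reads the range back from the response. The aggregation names `min_value` and `max_value`, the helper names, and the sample response are hypothetical; the actual request made by the popover is only visible in the screenshot above.

```python
from typing import Any, Dict, Optional, Tuple


def build_range_request(field: str) -> Dict[str, Any]:
    """Build a search body that asks only for the min and max of `field`."""
    return {
        "size": 0,  # no hits needed, only aggregation results
        "aggs": {
            "min_value": {"min": {"field": field}},
            "max_value": {"max": {"field": field}},
        },
    }


def extract_range(response: Dict[str, Any]) -> Tuple[Optional[float], Optional[float]]:
    """Read the (min, max) pair out of a search response for the body above."""
    aggs = response["aggregations"]
    return aggs["min_value"]["value"], aggs["max_value"]["value"]


# Hand-written response for illustration, not taken from a real cluster.
fake_response = {"aggregations": {"min_value": {"value": 12.5}, "max_value": {"value": 388.0}}}

print(build_range_request("bytes_counter"))
print(extract_range(fake_response))
```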

stratoula commented 1 year ago

This sounds like a good idea to me. We just need to decide on the UI, but showing the max and min values makes sense. How does the rest of the gang feel about it?

jughosta commented 1 year ago

Sounds good to me! Data Visualizer shows it like this:

Screenshot 2023-04-21 at 11 16 20
felixbarny commented 1 year ago

I think what would be even more useful is to show the rate of the counter across the current time range. But additionally showing the min and max sounds good, too.
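
For readers unfamiliar with counter rates, here is a minimal sketch of one way to compute an average rate over a time range from raw samples, treating any drop in value as a counter reset. This is an illustration under those assumptions, not necessarily how the Elasticsearch rate aggregation or Lens computes it; the function name and the sample values are made up.

```python
from typing import List, Tuple


def counter_rate(samples: List[Tuple[float, float]]) -> float:
    """Average per-second increase of a counter over the sampled range.

    `samples` is a list of (timestamp_seconds, counter_value) sorted by time.
    A drop in value is treated as a reset, after which the counter is
    assumed to have restarted from zero.
    """
    if len(samples) < 2:
        return 0.0
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # After a reset, the new value itself is the increase since the restart.
        total += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return total / elapsed if elapsed > 0 else 0.0


# Made-up samples: the counter resets between t=10 and t=20.
samples = [(0, 10.0), (10, 20.0), (20, 5.0), (30, 15.0)]
print(counter_rate(samples))
```

In the example, the increases are 10, 5 (after the reset) and 10 over 30 seconds, so a plain `max - min` over the range would misreport the activity whenever a reset occurs.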

jughosta commented 1 year ago

This issue is marked as "blocker". Is it because of the field popover at this point or not?

I drafted a PR which adds min/max values for counter fields https://github.com/elastic/kibana/pull/155499

Screenshot 2023-04-21 at 13 38 27

Would it be fine to deliver it (or any other UI for counter fields) in v8.9 or does it have to be in v8.8?

stratoula commented 1 year ago

Thanks for working on this, Julia! From my understanding, it is not a blocker anymore and we can work on that in 8.9. We are very close to the FF, so I am not sure if it will make it.

ruflin commented 1 year ago

> [TSDB] Counter metrics not being found in any visualization

We might mix multiple things into a single issue here but counters not found in Lens IS a blocker.

stratoula commented 1 year ago

@ruflin but they appear in 8.8 right?

stratoula commented 1 year ago

@jughosta I created this issue to track the fields list enhancement https://github.com/elastic/kibana/issues/155510 This issue is used to track the 8.8 support of counter fields in Lens so let's not mix different issues together.

lalit-satapathy commented 1 year ago

Hi,

As far as counter field support goes for 8.8, summarising the key aspects surrounding it:

stratoula commented 1 year ago

Great, so as long as this issue tracks the visualization of counter fields in Lens, and this is possible in Lens, I am closing it.