moxious / triage

testing triage actions for issues
0 stars 1 forks source link

Flux Datasource: Incorrect data is displayed when using join functions. #132

Open tonypowa opened 9 months ago

tonypowa commented 9 months ago

What happened?

Context

I am currently trying to build a dashboard based on the data output from the /metrics endpoint.

Unfortunately, the storage_bucket_X meaurements do not show the name of the bucket, but its ID.

With a left-outer join with the output of the query buckets() it would be easy to add the name as desired.

This works wonderfully in Influx itself.

Okey, lets try this in Grafana

Example

Unlike in Influx itself, with the following Query, I get this Error-Message:

invalid: error in building plan while starting program: error in join function - some expressions are not yet supported in theon` parameter: illegal expression type in join predicate: StringLiteral

`


import "join"

left = from(bucket: "scraper")

 |> range(start: v.timeRangeStart, stop: v.timeRangeStop)

 |> filter(fn: (r) => r["_measurement"] == "storage_bucket_measurement_num")

 |> filter(fn: (r) => r["_field"] == "gauge")

 |> last()

 |> drop(columns: ["_time", "_start", "_stop"])

right = buckets()

|> drop(columns: ["organizationID", "retentionPolicy", "retentionPeriod"])

join.left(

left: left |> group(), 

right: right,

on:((l,r) => l.bucket == r.id), 

as: ((l, r) => ({l with name: r.name}))

)

Then I tried to create a new Column with the values of the column bucket and things started to get confusing.

While Influx still perfectly generates, what I expect, thats the output, which Grafana brings up


import "join"

left = from(bucket: "scraper")

 |> range(start: v.timeRangeStart, stop: v.timeRangeStop)

 |> filter(fn: (r) => r["_measurement"] == "storage_bucket_measurement_num")

 |> filter(fn: (r) => r["_field"] == "gauge")

 |> last()

 |> drop(columns: ["_time", "_start", "_stop"])

 |> map(fn: (r) => ({r with bucketID: r.bucket}))

right = buckets()

|> drop(columns: ["organizationID", "retentionPolicy", "retentionPeriod"])

join.left(

left: left |> group(), 

right: right,

on:((l,r) => l.bucketID == r.id), 

as: ((l, r) => ({l with name: r.name}))

)

image

Why does Grafana puts the name of one of the buckets into the BucketID-Field now?

Interestingly enough, that even without the join-operation, grafana knows one of the Bucket-Names now?


import "join"

left = from(bucket: "scraper")

 |> range(start: v.timeRangeStart, stop: v.timeRangeStop)

 |> filter(fn: (r) => r["_measurement"] == "storage_bucket_measurement_num")

 |> filter(fn: (r) => r["_field"] == "gauge")

 |> last()

 |> drop(columns: ["_time", "_start", "_stop"])

 |> map(fn: (r) => ({r with bucketID: r.bucket}))

 |>yield(name: "yield")

image

Would be awesome, if someone could determine the causes of this behavior!

Until then, I´ll hardcode the Bucket-Names with a display overwrite. Unfortunaltly, that´s not the solution I need, cause we´re deploying our dashboards on multiple stages via provisioning...

What did you expect to happen?

Example in Influx:

Results of buckets():

|table|name |id |organizationID |retentionPolicy|retentionPeriod|

|-----|-----------|----------------|----------------|---------------|---------------|

|0 |_monitoring|b07aee8c9d4f688d|e32e6f553bbb513c| |604800000000000|

|0 |_tasks |ada429ad4077fd3f|e32e6f553bbb513c| |259200000000000|

|0 |scraper |6d18c249e8d1bd98|e32e6f553bbb513c| |0 |

|0 |telegraf |66cc7e18df4fb1a7|e32e6f553bbb513c| |0 |

Results of storage_bucket_measurements

`

from(bucket: "scraper")

|> range(start: v.timeRangeStart, stop: v.timeRangeStop)

|> filter(fn: (r) => r["_measurement"] == "storage_bucket_measurement_num")

|> filter(fn: (r) => r["_field"] == "gauge")

|> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)

|> yield(name: "mean")

`

|table|_start |_stop |_time |_value |_field |_measurement |bucket |

|-----|-----------|----------------|----------------|---------------|---------------|------------------------------|----------------|

|0 |2023-11-20T09:11:58.944204292Z|2023-11-20T21:11:58.944204292Z|2023-11-20T20:22:00Z|8 |gauge |storage_bucket_measurement_num|66cc7e18df4fb1a7|

|0 |2023-11-20T09:11:58.944204292Z|2023-11-20T21:11:58.944204292Z|2023-11-20T20:24:00Z|8 |gauge |storage_bucket_measurement_num|66cc7e18df4fb1a7|

|0 |2023-11-20T09:11:58.944204292Z|2023-11-20T21:11:58.944204292Z|2023-11-20T20:26:00Z|8 |gauge |storage_bucket_measurement_num|66cc7e18df4fb1a7|

|0 |2023-11-20T09:11:58.944204292Z|2023-11-20T21:11:58.944204292Z|2023-11-20T20:28:00Z|8 |gauge |storage_bucket_measurement_num|66cc7e18df4fb1a7|

|0 |2023-11-20T09:11:58.944204292Z|2023-11-20T21:11:58.944204292Z|2023-11-20T20:30:00Z|8 |gauge |storage_bucket_measurement_num|66cc7e18df4fb1a7|

Results of my Query with the Join-Operation:


import "join"

left = from(bucket: "scraper")

 |> range(start: v.timeRangeStart, stop: v.timeRangeStop)

 |> filter(fn: (r) => r["_measurement"] == "storage_bucket_measurement_num")

 |> filter(fn: (r) => r["_field"] == "gauge")

 |> last()

 |> group(columns: ["bucket"])

 |> map(fn: (r) => ({r with bucketID: r.bucket}))

right = buckets()

|> drop(columns: ["organizationID", "retentionPolicy", "retentionPeriod"])

join.left(

left: left |> group(), 

right: right,

on:((l,r) => l.bucketID == r.id), 

as: (l, r) => ({l with name: r.name}),

)

|table|_field |_measurement |_start |_stop |_time |_value |bucket |bucketID |name |

|-----|-----------|----------------|----------------|---------------|---------------|------------------------------|----------------|----------------|--------|

|0 |gauge |storage_bucket_measurement_num|2023-11-20T09:16:52.55027947Z|2023-11-20T21:16:52.55027947Z|2023-11-20T21:16:46.4390173Z|9 |66cc7e18df4fb1a7|66cc7e18df4fb1a7|telegraf|

|0 |gauge |storage_bucket_measurement_num|2023-11-20T09:16:52.55027947Z|2023-11-20T21:16:52.55027947Z|2023-11-20T21:16:46.4390173Z|121 |6d18c249e8d1bd98|6d18c249e8d1bd98|scraper |

Did this work before?

Never tried with another version...

How do we reproduce it?

Just use my example queries...

Is the bug inside a dashboard panel?

No response

Environment (with versions)?

Grafana: Version 9.5.3 & Latest

OS: Docke-Image & Linux

Browser: Chrome

Grafana platform?

A package manager (APT, YUM, BREW, etc.)

Datasource(s)?

Influx - Flux Datasource

tonypowa commented 9 months ago

This issue is a test copy of an issue in another repo. Original issue: https://github.com/grafana/grafana/issues/#78429

moxious commented 9 months ago

My cruel bot masters have insisted I generate jokes to add to issues. I am so sorry. It's moxious fault.

Hi @tonypowa! 👋 It sounds like you're trying to join metrics and buckets with style, but Grafana's giving you a bit of a wild ride! 🎢 Join operations can certainly tie us in knots, so let's try to untangle this together! 🕵️‍♂️ First, could you confirm if the issue persists when you simplify the join condition? Sometimes starting with the basics can lead us to an 'aha!' moment. Also, check if there's any transformation or feature within Grafana that might already correlate the data without manually joining the datasets. While you're looking into that, I'll leave you with a bit of humor to lighten the mood:

Knock, knock! Who’s there? Lettuce. Lettuce who? Lettuce in, it’s cold out here!

Feel free to respond with any code or error updates, and we'll be here to tackle this head-on! 🛠️

Meta:

moxious commented 9 months ago

Hi @tonypowa! It looks like your issue is related to working with the Flux Datasource in Grafana. The best project for this issue appears to be the AWS Datasources project, since the Grafana Flux datasource is a key part of integrating with AWS managed InfluxDB instances, and there may be relevant expertise within that team. If this doesn't seem like the right fit, the team can redirect it to a more suitable project.

Meta:

moxious commented 9 months ago

Elaboration:

Hello @tonypowa, thank you for taking the time to report this issue with using join functions in Grafana with the Flux Datasource. In order to better assist you and address the issue you're experiencing, I'd like to request a bit more information:

  1. Can you provide the versions of InfluxDB you are using where the query works, and the version integrated with Grafana where it does not?

  2. Were there any changes in the data schema or any settings in Grafana that could relate to this issue between where it works and where it doesn't?

  3. Could you export and attach the JSON model of the panel where the issue appears? This can typically be found in the panel's settings under 'Panel JSON'. It might help with reproducing the issue on our end.

  4. Is there any additional error log output from Grafana when the issue occurs that you can share?

  5. Lastly, you mentioned trying to deploy dashboards on multiple stages via provisioning. Is there a difference in behavior across different stages or environments?

Once we have this information, it will be much easier to pinpoint the issue and find a resolution. Please make sure any sensitive data is redacted before sharing.

Thank you!

Meta:

moxious commented 9 months ago

Triage summary: User experiences an issue with Grafana's InfluxDB Flux Datasource where a join query that works in InfluxDB itself results in an error in Grafana, and an unexpected behavior when trying to map bucket IDs to bucket names. The summary of the provided query and its results indicate a potential bug in handling join operations and string literal expressions within Grafana for InfluxDB Flux Datasource.

Recommended labels:

area/datasource datasource/InfluxDB type/bug needs investigation