quickwit-oss / quickwit-datasource

Quickwit data source for Grafana
GNU Affero General Public License v3.0

Exclude query not working #56

Closed damyan90 closed 6 months ago

damyan90 commented 7 months ago

When using the "Filter out value in query" button in the Grafana Explore view, it produces a query that doesn't work.

image

Moreover, if you exclude two fields, the query fails, I believe because of the syntax: image

damyan90 commented 7 months ago

and the button I was talking about:

image

damyan90 commented 7 months ago

I think it might also be caused by the field names, so here is another example that doesn't show any results, although the data is definitely there: app.kubernetes.io/managed-by:"trivy-operator". The same happens even when escaping characters: app\.kubernetes\.io\/managed\-by:"trivy-operator"

fmassot commented 7 months ago

Thanks a lot @damyan90 for all this feedback. @ddelemeny has started working on it.

damyan90 commented 7 months ago

Thanks!

I also noticed that it's impossible to get any results when using the following query: azure.workload.identity/use:"true"

image

Also when trying to escape the dots and slashes: image

Maybe I screwed up the whole index config, especially when it comes to dynamic mapping: image

ddelemeny commented 7 months ago

Thank you @damyan90 for your detailed feedback, much appreciated.

There are a few things to unpack here:

Quickwit returns an error on syntax-correct queries (upstream bug)

Parsing -fieldA:term1 AND -fieldB:term2 should not fail. The plugin itself handles it correctly, but Quickwit fails to parse it.

image

Parsing the equivalent queries NOT fieldA:term1 AND NOT fieldB:term2 or (-fieldA:term1) AND (-fieldB:term2) seems to work as expected.

Plugin-side, I'll wrap each filter in "()" as a quick fix, but the parsing failure needs to be followed up upstream.
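To illustrate the quick fix described above, here is a minimal sketch of wrapping each exclusion filter in parentheses before joining it with AND. The names `addExcludeFilter` and `escapeValue` are hypothetical and are not the plugin's actual code; the only behavior taken from this thread is that Quickwit accepts the `(-fieldA:term1) AND (-fieldB:term2)` form.

```typescript
// Hypothetical helper, not the plugin's real implementation.
// Escapes backslashes and double quotes so the value can sit inside "...".
function escapeValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Appends a negated, parenthesized filter to an existing query string.
// Each clause is wrapped in "()" so several exclusions can be chained with AND.
function addExcludeFilter(query: string, field: string, value: string): string {
  const clause = `(-${field}:"${escapeValue(value)}")`;
  const trimmed = query.trim();
  return trimmed === "" ? clause : `${trimmed} AND ${clause}`;
}

// Excluding two fields in a row yields one parenthesized clause per field.
const once = addExcludeFilter("", "fieldA", "term1");
const twice = addExcludeFilter(once, "fieldB", "term2");
console.log(twice);
```

Keeping every clause self-contained in its own parentheses means the helper never has to inspect or rewrite the query that is already there.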

Quickwit returns unwanted results

That's the case shown by your first screenshot and possibly by your last comment. I haven't been able to reproduce the issue; it looks like the request sent by the plugin to Quickwit is OK. I got the correct behavior from a test case with an "a.b/c"-shaped field name.

There is a seemingly unrelated linting error on the main branch, but that's not affecting releases and it doesn't prevent the request from being sent. @fmassot do you have any insight there? Smells like something out of my reach, an indexing issue maybe?

damyan90 commented 7 months ago

Thanks for the quick check! And yes, you're right with the parsing issue. The "()" or "NOT" instead of "-" resolves it - waiting for it to be implemented in the plugin then ;)

Regarding the "a.b/c"-shaped fields, do you mind sharing the doc mapping that works for you with such fields?

ddelemeny commented 7 months ago

Sure, I tried to reproduce by adding a test value in attributes.test/field. Here are the relevant parts:

{
  "field_mappings": [
    ...
    {
      "name": "attributes",
      "type": "json",
      "expand_dots": true,
      "fast": {
        "normalizer": "raw"
      },
      "indexed": true,
      "record": "basic",
      "stored": true,
      "tokenizer": "raw"
    },
    ...
  ],
  "tag_fields": [],
  "store_source": false,
  "index_field_presence": false,
  "timestamp_field": "timestamp_nanos",
  "mode": "strict",
  "max_num_partitions": 200,
  "tokenizers": []
}

I'm not using a dynamic mapping though; that part could be investigated further.
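To make the "attributes.test/field" name concrete, the sketch below derives dotted field paths from a nested document. It assumes the test value sits under a "test/field" key inside the attributes JSON field; the sample document and the `flattenPaths` helper are hypothetical and are not Quickwit or plugin code.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Hypothetical helper: walks a nested document and returns leaf values keyed
// by their dotted path, which is how the field is addressed in a query.
function flattenPaths(value: Json, prefix = ""): Record<string, Json> {
  // Leaves (scalars, null, arrays) are stored under the path built so far.
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return prefix === "" ? {} : { [prefix]: value };
  }
  const out: Record<string, Json> = {};
  for (const [key, child] of Object.entries(value)) {
    const path = prefix === "" ? key : `${prefix}.${key}`;
    Object.assign(out, flattenPaths(child, path));
  }
  return out;
}

// Assumed shape of the test document; only the field name comes from the thread.
const doc: Json = {
  timestamp_nanos: 1,
  attributes: { "test/field": "value" },
};
console.log(Object.keys(flattenPaths(doc)));
```

Under that assumption, the query for the test value would be written against the path attributes.test/field, with the slash left as part of the last segment.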

damyan90 commented 7 months ago

Hmm, this is weird. I think it only affects these few extracted fields (i.e. "app", "app.kubernetes.io/name"). I have to double-check my fluentbit config because I don't remember them being extracted. However, "annotations.x.x./restartedAt", for instance, works fine.

Green are OK, Red ones are failing.

image

Will let you know if I find something.

damyan90 commented 7 months ago

So as suspected, I'm not doing anything in fluentbit to extract those fields. Also, they're not actually extracted in Quickwit itself. It's only Grafana that shows them additionally, which is causing the whole confusion (if it confuses me now while testing, it will frustrate the devs later when they try to filter by fields that don't exist in the backend).

So here's what I mean, the exact same log in both places. Grafana: image

Quickwit: image

So do you know which part is responsible for such extraction of these fields? Is it Grafana itself or the datasource? What's weird also is that there is a field "Annotations" but not "labels".

I think it would be perfect if the fields matched Quickwit 1:1, because then I could extract these fields myself (in the fluentbit config) if I wanted to, and they would be queryable as well.

ddelemeny commented 7 months ago

I researched a bit more about what could be going on in your last comments.

It appears that Grafana treats the labels field of a dataframe as a collection of transformation hints, specifically Extract Fields.

Not sure yet what to do about that...
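In the meantime, one way to spot the fields that only exist on the Grafana side is to compare the field names shown in Explore with the field names present in the Quickwit document. The sketch below does that comparison; `displayOnlyFields` and the sample field lists are hypothetical and only mirror the kind of names mentioned in this thread.

```typescript
// Hypothetical helper: returns the field names Grafana displays that are not
// present in the indexed document, i.e. fields that cannot be filtered on.
function displayOnlyFields(shownInGrafana: string[], indexedInQuickwit: string[]): string[] {
  const indexed = new Set(indexedInQuickwit);
  return shownInGrafana.filter((name) => !indexed.has(name));
}

// Made-up sample lists for illustration, not taken from a real index.
const shown = ["app", "app.kubernetes.io/name", "annotations.restartedAt"];
const indexed = ["annotations.restartedAt"];
console.log(displayOnlyFields(shown, indexed));
```

Any name returned by the helper is a candidate for a filter that will match nothing in the backend.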