grafana / tempo

Grafana Tempo is a high volume, minimal dependency distributed tracing backend.
https://grafana.com/oss/tempo/
GNU Affero General Public License v3.0

Search query TraceQL with more than 1024 characters fails #4321

Open juliomeinerz-atscale opened 1 week ago

juliomeinerz-atscale commented 1 week ago

Describe the bug

If you pass a query string with more than 1024 characters to the /api/search endpoint, Tempo returns an error:

invalid TraceQL query: parse error at line 1, col 1025: syntax error: unexpected $end, expecting ) or ,

To Reproduce

Steps to reproduce the behavior:

  1. Start Tempo (SHA or version)
  2. Send a request with a query longer than 1024 characters
  3. The error is returned

Expected behavior

The query works without errors.

Environment:

Additional Context

This is probably a duplicate of https://github.com/grafana/tempo/issues/3747, but since it was not resolved there, I am creating this one.

joe-elliott commented 1 week ago

Can you share the query that threw this error?

This is extremely easy to test if anyone wants to try fixing it or just do some investigation. Just add a test like this to pkg/traceql:

func TestLongQuery(t *testing.T) {
    _, err := Parse("<some super long query>")
    require.NoError(t, err)
}

and slap the giant query in there. You can step debug through this and see exactly how it breaks.

juliomeinerz-atscale commented 1 week ago

@joe-elliott this is the query I am using:

{ rootServiceName = "engine" && span.fakeResults = "false" && (span.queryType = "System" || span.queryType = "User") && (span.modelId = "" || span.modelId = "c04e7351-c112-5b43-9e1c-ef65c9a363e4" || span.modelId = "80aeee12-d504-5a3e-835e-4e9eb3e0ee57" || span.modelId = "0c394e85-f252-54fa-8fca-daadbf26bb60" || span.modelId = "a579ba92-ee7b-572c-84d2-a191913d2aa1" || span.modelId = "477d9fb6-e0ae-5a59-a568-3ad6eceb851e" || span.modelId = "8dca2f5e-ad3e-5bae-bba0-33dfbd06c10d" || span.modelId = "3432bcac-3d27-561b-b585-f59671affbc0" || span.modelId = "920f98c4-f867-55ff-8f85-15a31a070037" || span.modelId = "26ee11b5-12ee-5c21-8be9-a6ed8113e156" || span.modelId = "c4fab764-b2e9-5d31-ab13-97e1345fb373" || span.modelId = "74aff45f-34b9-516f-a0ec-63ad61e329b4") && (status = error || status = unset || status = ok) } | select(status, name, span.catalogId, span.modelId, span.queryType, span.userId, event.queryAggregateDatasets, event.queryPartAggregateDatasets, event.usedLocalCache, event.attributes, event.measurements, event.message, event.name)

yzmp commented 1 week ago

Got the same error here. It looks like the presence of the select clause with event attributes is causing the error.

Removing the event attributes from the select clause returns a valid response (even after adding more filters to ensure more than 1024 chars). This aligns with my observation that big queries worked fine on v2.5.

Selecting event attributes is a new thing in v2.6, right?

yzmp commented 1 week ago

Actually, no. If I test with @juliomeinerz-atscale's query I get the same error. If I add at least one more span.modelId filter, it works.

His query has 1070 chars. The updated one has 1127.

Testing with arbitrary values, but keeping his query structure (same attributes filtered and selected), it seems like I get the same error when the query is between 1052 and 1119 characters long.
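A rough sketch of how a length sweep like this could be scripted in Go. Everything here is hypothetical: `padQuery` stretches one string value between a fixed prefix and suffix until the query has the requested length, and `windowParser` is only a stand-in that fails inside the window I observed, so the program runs without Tempo. To test for real, the parse argument would wrap the `Parse` function in pkg/traceql mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// padQuery returns prefix + padding + suffix with a total length of exactly n bytes.
// prefix should end inside a string literal and suffix should close it.
func padQuery(prefix, suffix string, n int) (string, error) {
	pad := n - len(prefix) - len(suffix)
	if pad < 0 {
		return "", fmt.Errorf("length %d is below the minimum %d", n, len(prefix)+len(suffix))
	}
	return prefix + strings.Repeat("a", pad) + suffix, nil
}

// failingLengths returns every length in [from, to] whose query parse rejects.
func failingLengths(prefix, suffix string, from, to int, parse func(string) error) []int {
	var failed []int
	for n := from; n <= to; n++ {
		q, err := padQuery(prefix, suffix, n)
		if err != nil || parse(q) != nil {
			failed = append(failed, n)
		}
	}
	return failed
}

// windowParser is a stand-in parser (an assumption for illustration only):
// it rejects queries whose length falls in the window reported in this thread.
func windowParser(q string) error {
	if len(q) >= 1052 && len(q) <= 1119 {
		return errors.New("simulated parse error")
	}
	return nil
}

func main() {
	failed := failingLengths(`{ span.modelId = "`, `" } | select(status, name)`, 1000, 1200, windowParser)
	if len(failed) > 0 {
		fmt.Printf("failing lengths: %d to %d (%d total)\n", failed[0], failed[len(failed)-1], len(failed))
	}
}
```

Padding a single value changes the query structure, so to stay close to the original query the prefix and suffix should be taken from it, with the padding placed inside one of the span.modelId values.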

joe-elliott commented 1 week ago

> Selecting event attributes is a new thing in v2.6, right?

Yes

> when the query is between 1052 and 1119 characters long.

This is quite strange :)