jaegertracing / jaeger

CNCF Jaeger, a Distributed Tracing Platform
https://www.jaegertracing.io/
Apache License 2.0

search services failed response from /api/v3/traces #3443

Closed esnible closed 5 months ago

esnible commented 2 years ago

Describe the bug

A V3 query fails and returns JSON describing an internal server error:

{
  "error": {
    "grpcCode": 2,
    "httpCode": 500,
    "message": "search services failed: elastic: Error 400 (Bad Request): An HTTP line is larger than 4096 bytes. [type=too_long_frame_exception]",
    "httpStatus": "Internal Server Error"
  }
}

To Reproduce

Yesterday I loaded test data using the Omnition synthetic load generator:

cd ~/src/synthetic-load-generator
gtimeout 5m java -jar ./target/SyntheticLoadGenerator-1.0-SNAPSHOT-jar-with-dependencies.jar --paramsFile ./topologies/

curl "http://localhost:16686/api/traces?service=frontend" succeeds without difficulty, returning about 100 traces.

A query that should be equivalent fails: curl "localhost:16686/api/v3/traces?query.service_name=frontend&query.start_time_min=2001-01-01T00:00:00-05:00&query.start_time_max=2022-12-31T23:59:59-05:00"

Expected behavior

Version (please complete the following information):

What troubleshooting steps did you try?

I tried upgrading to the 1.28.0 image. Didn't help. Didn't try adding more logging yet.

esnible commented 2 years ago

If I use /api/v3/traces to request data spanning more than 16 months, the query fails with a 500.

In the example above, with a start time in 2001 and an end time in 2022, the query asks for 8036 jaegerIndices. Testing with SPAN_STORAGE_TYPE=elasticsearch go run -tags ui ./cmd/all-in-one/main.go --log-level=DEBUG, I learned that a query for query.start_time_min=2022-07-15T00:00:00-05:00&query.start_time_max=2022-12-31T23:59:59-05:00 fails with an ES request for 171 indices, but if I set start_time_min to 2022-07-15T00:00:00-05:00 it succeeds with 161 indices.
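
The index counts quoted above are consistent with one index per UTC calendar day in the requested range, with every index name ending up in the request line that Elasticsearch limits to 4096 bytes. The following is a minimal sketch of that arithmetic, not Jaeger's actual code. The helper names (countDailyIndices, estimateIndexListBytes), the "jaeger-span-" prefix, the YYYY-MM-DD date suffix, and the 3-byte "%2C" separator are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// utcDay returns midnight UTC of the calendar day that contains t.
func utcDay(t time.Time) time.Time {
	y, m, d := t.UTC().Date()
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

// countDailyIndices (hypothetical helper) counts the UTC calendar days
// touched by [start, end], assuming one index per day.
func countDailyIndices(startRFC3339, endRFC3339 string) (int, error) {
	start, err := time.Parse(time.RFC3339, startRFC3339)
	if err != nil {
		return 0, fmt.Errorf("parse start: %w", err)
	}
	end, err := time.Parse(time.RFC3339, endRFC3339)
	if err != nil {
		return 0, fmt.Errorf("parse end: %w", err)
	}
	if end.Before(start) {
		return 0, fmt.Errorf("end %s is before start %s", endRFC3339, startRFC3339)
	}
	days := int(utcDay(end).Sub(utcDay(start)).Hours()/24) + 1
	return days, nil
}

// estimateIndexListBytes (hypothetical helper) estimates the size of the
// separator-joined index list. The prefix and the YYYY-MM-DD suffix are
// assumed; sepBytes is 1 for "," or 3 for a percent-encoded "%2C".
func estimateIndexListBytes(n int, indexPrefix string, sepBytes int) int {
	if n <= 0 {
		return 0
	}
	nameBytes := len(indexPrefix) + len("2006-01-02")
	return n*nameBytes + (n-1)*sepBytes
}

func main() {
	n, err := countDailyIndices("2022-07-15T00:00:00-05:00", "2022-12-31T23:59:59-05:00")
	if err != nil {
		panic(err)
	}
	fmt.Println("indices:", n)
	fmt.Println("estimated bytes:", estimateIndexListBytes(n, "jaeger-span-", 3))
}
```

Under these assumptions the 2022-07-15 range gives 171 indices and the 2001 range gives 8036, matching the numbers reported above. A count of 161 would correspond to a start time ten days later. With the assumed prefix and separator, 171 names come to 4272 bytes and 161 names to 4022 bytes, which would put the 4096-byte limit between the failing and the succeeding request once a small amount of fixed overhead is added.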

I considered adding a check in validateQuery(), but the exact number of indices that fits might vary depending on the index name.
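
Because the limit is on bytes rather than on the number of indices, one option would be to measure the joined index list itself instead of counting entries. The sketch below is hypothetical and is not Jaeger's validateQuery(). The function name checkIndexListFits, the overheadBytes parameter (standing in for the method, path, and query string), and the plain-comma separator are assumptions. The 4096 figure is taken from the error message above.

```go
package main

import (
	"fmt"
	"strings"
)

// maxInitialLineBytes mirrors the 4096-byte limit quoted in the
// too_long_frame_exception message.
const maxInitialLineBytes = 4096

// checkIndexListFits (hypothetical helper) rejects an index list whose
// comma-joined form, plus a caller-supplied overhead estimate, would
// exceed the request-line limit.
func checkIndexListFits(indices []string, overheadBytes int) error {
	total := len(strings.Join(indices, ",")) + overheadBytes
	if total > maxInitialLineBytes {
		return fmt.Errorf(
			"time range expands to %d indices (about %d bytes in the request line, limit %d); narrow the range",
			len(indices), total, maxInitialLineBytes)
	}
	return nil
}

func main() {
	// Build an illustrative list of 171 identically sized index names.
	indices := make([]string, 0, 171)
	for i := 0; i < 171; i++ {
		indices = append(indices, "jaeger-span-2022-12-31")
	}
	if err := checkIndexListFits(indices, 300); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("ok")
}
```

A check of this shape would let the API return a 400 with an actionable message instead of surfacing the storage error as a 500. The overhead value would need to be derived from the real request rather than guessed.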

jkowall commented 5 months ago

That load generator is now deprecated. Can you reopen once you reproduce this with something like https://opentelemetry.io/docs/demo/services/load-generator/ ?