Closed: sherifkayad closed this issue 2 years ago
Just for the sake of completeness, I followed this community post: https://community.grafana.com/t/how-to-query-a-500mb-trace/44052/2 and I am really wondering what I am doing wrong. It seems to me the person who raised that question had the same issue as mine; however, he managed to solve it somehow.
Thanks for filing an issue. I'll see if I can reproduce this based on your config.
@zalegrala amazing! Let me know if you need any further info from my side
I've been able to reproduce all but the third scenario there. In my case, testing with a 75MiB trace, I would get some connection timeouts but not an OOM.
For the first two scenarios, once I reached the "response larger than the max (45940141 vs 33554432)" message, raising the configured limit above the requested size allowed the query to proceed. I also had to raise the query timeout because my workstation was crying.
My current config is as follows.
...
overrides:
  max_bytes_per_trace: 5e+08
  max_search_bytes_per_trace: 500000
  per_tenant_override_config: /overrides/overrides.yaml
querier:
  frontend_worker:
    frontend_address: query-frontend-discovery.default.svc.cluster.local:9095
    grpc_client_config:
      max_recv_msg_size: 9e+07
      max_send_msg_size: 9e+07
  query_timeout: 10m
server:
  grpc_server_max_recv_msg_size: 1.048576e+08
  grpc_server_max_send_msg_size: 1.048576e+08
...
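For reference, the file pointed to by per_tenant_override_config uses Tempo's per-tenant overrides format. A minimal sketch, assuming a single tenant whose ID is "single-tenant" (the org_id that shows up in the querier logs below), might look like:

overrides:
  "single-tenant":
    max_bytes_per_trace: 5e+08
    max_search_bytes_per_trace: 500000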
After chatting with one of my teammates, it sounds like this is a known issue with the way that responses are handed back through the query-frontend, in that we do several rounds of marshal/unmarshal when combining traces to respond to queries.
For your scenario three, I think raising the memory for your query-frontend may be the only answer at this point.
For the project, this is something that we'd like to address. We have a few ideas about how to reduce the number of marshal/unmarshal rounds, which should reduce the memory pressure. To help us address this, can you provide some detail about your traces? What size are the traces that Tempo is struggling with? Are these traces the standard size in your environment, or are they outliers?
@zalegrala thanks for your prompt response; below are some feedback points from my side:
- I used a port-forward to the Query Frontend and directly hit its Search by ID API to get that OOM behavior
- I used 9e+07 as the max_send_msg_size
- How much memory did you allocate to the Query Frontend and the Querier?
Thanks for the information.
In my test, I did query the frontend directly. In my case, port 3200
is mapped to the query-frontend, so just issuing curl -s http://localhost:3200/api/traces/4fe154b38dee96fb19c30b4e2091e559
was enough for me to test.
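For a cluster deployment where the query-frontend isn't exposed directly, an equivalent check through a port-forward might look roughly like the sketch below; the namespace and Service name are assumptions, and the trace ID is the one from this issue:

# assumed names: namespace "tempo", Service "tempo-query-frontend" exposing port 3200
kubectl -n tempo port-forward svc/tempo-query-frontend 3200:3200 &

# fetch the trace by ID through the query-frontend HTTP API
curl -s http://localhost:3200/api/traces/ea233ba0fabfdf58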
For the limits, here is what I have set currently, which I could tweak and certainly get an OOM, since it is just running on my workstation. :)
{
  "limits": {
    "cpu": "4",
    "memory": "8Gi"
  },
  "requests": {
    "cpu": "100m",
    "memory": "100Mi"
  }
}
Checking this against one of our larger clusters, I see we don't even have limits set on the query-frontend, and in another cluster we have the memory limit set to 4Gi and regularly see spikes pretty close to that number. I'm just looking at the container_memory_working_set_bytes metric that we use in the mixin on the operational dashboard to get a sense.
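As a rough sketch of that check (the container label value is an assumption about how the pods are labelled), a PromQL expression along these lines shows the per-pod working set:

max by (pod) (container_memory_working_set_bytes{container="query-frontend"})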
It could certainly be the case that the memory spikes too quickly to detect. Other than exec-ing into the pod and running ps in a loop, I'm not sure how to know the maximum memory used by the process.
Let me try with a higher limit than the 5 Gigs, e.g. 8 Gigs or so, and a 64 MB value for the grpc_client, and get back to you.
This was enough to see the memory spike within the pod: while true; do ps -o rss,vsz; sleep .01; done
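Run from outside the container, a rough equivalent would be the following; the namespace and pod name are assumptions:

# assumed names: namespace "tempo", pod "tempo-query-frontend-0"
kubectl -n tempo exec tempo-query-frontend-0 -- sh -c 'while true; do ps -o rss,vsz; sleep .01; done'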
Out of curiosity, would it help to run more queriers? At the moment we are running 3 Queriers with some beefy resources; would it help to scale these horizontally? Also worth mentioning: we are running 2 Query Frontends.
Just confirming with my team, but I don't think there would be an impact here with the number of queriers. Each querier will require a little more memory on the frontend for tracking purposes, but not enough that I would expect it to impact your situation.
@zalegrala I kept pushing the memory limits step by step up to 16 Gigs. At that point the query-frontend consumption was something like:
RSS VSZ
7.5g 15g
8 1600
1036 1668
8 1592
Unfortunately the query didn't complete; it kept running almost indefinitely and no results were returned. One other odd behavior I noticed: even if you specify a time range where that trace shouldn't exist, the query-frontend and the querier keep working to fetch it (a time-scoped request sketch follows the log excerpt below). I could see logs in the Querier like:
level=info ts=2022-06-17T08:01:28.67472812Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=20a6b62c-0722-4816-9e14-b5ff29b5597a found=true
level=info ts=2022-06-17T08:01:28.675180513Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=839b2610-8862-45d6-b865-6b8edae13213 found=true
level=info ts=2022-06-17T08:01:28.928832925Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=1ec76d5e-dfbc-4ce2-827b-02efd37ec839 found=true
level=info ts=2022-06-17T08:01:28.983337825Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=42be281d-7c84-4b08-960f-ad3114bc88f2 found=true
level=info ts=2022-06-17T08:01:28.995746926Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=88a6ca41-d320-4970-a076-d4f547359dfe found=true
level=info ts=2022-06-17T08:01:29.009640648Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=a75c4d73-502c-4585-92a6-a7e1e1ad6847 found=true
level=info ts=2022-06-17T08:01:29.073388561Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=2d4117f0-a71c-4b93-8dd0-1b9382b672f0 found=true
level=info ts=2022-06-17T08:01:30.1867258Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=a32c7612-b477-4aed-9dcf-642dd247ee7f found=true
level=info ts=2022-06-17T08:01:30.192600175Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=a40cca20-64d2-450b-a308-ab40688af578 found=true
level=info ts=2022-06-17T08:01:30.231101367Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=881fc0ab-2374-4257-b5c3-439da49b00a9 found=true
level=info ts=2022-06-17T08:01:30.32067071Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=86ae1e37-0508-4e30-99e8-e2db2d96b65f found=true
level=info ts=2022-06-17T08:01:29.083943289Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=887e3942-e2ac-4222-be8e-bf2516a8621d found=true
level=info ts=2022-06-17T08:01:29.105556398Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=4b4763c2-a3b3-4b83-946d-f8dc2aee5d1e found=true
level=info ts=2022-06-17T08:01:29.185914154Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=89d373ec-654c-44a7-820a-2e92116cbe96 found=true
level=info ts=2022-06-17T08:01:29.286247542Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=2826aa37-4ebb-452f-a4c6-e07bfacf146f found=true
level=info ts=2022-06-17T08:01:29.288657905Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=9d3d3e0e-8918-4361-a77e-a1c1551238a7 found=true
level=info ts=2022-06-17T08:01:29.306442267Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=84a25621-aa0f-4276-8528-5cce91aff0e3 found=true
level=info ts=2022-06-17T08:01:29.309636864Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=280110cc-05f8-4fbb-8d08-d73ff07036fb found=true
level=info ts=2022-06-17T08:01:29.399623259Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=9f61d693-5557-4cf0-820f-9bac43b2cebc found=true
level=info ts=2022-06-17T08:01:29.426682792Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=1de57273-bfb0-4899-8925-b4a79ecd6e54 found=true
level=info ts=2022-06-17T08:01:29.482513792Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=2bc16b06-5fd0-4b0f-ae8c-5906c758414a found=true
level=info ts=2022-06-17T08:01:29.496012667Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=a69296f6-70fb-42df-b8fd-bff1ad11e31c found=true
level=info ts=2022-06-17T08:01:29.572987836Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=28dfd3d5-b84c-4506-adda-4e57096877e7 found=true
level=info ts=2022-06-17T08:01:29.597361773Z caller=tempodb.go:348 org_id=single-tenant msg="searching for trace in block" findTraceID=0000000000000000ea233ba0fabfdf58 block=a0610d3f-7a55-427d-956b-eb55e09cd200 found=true
These logs went on almost forever. I really do believe something is fishy with the lookup behavior for that exact trace and traces of similar size. I have no idea what's wrong, though.
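For reference, a time-scoped lookup of the kind described above would look roughly like the sketch below, assuming Tempo's optional start/end unix-epoch parameters on the trace-by-ID endpoint; the window values are placeholders:

# fetch the trace by ID while hinting a time window (unix epoch seconds; placeholder values)
curl -s "http://localhost:3200/api/traces/ea233ba0fabfdf58?start=1655424000&end=1655427600"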
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity. Please apply keepalive label to exempt this Issue.
Describe the bug
Adding a huge trace to Tempo isn't a problem thanks to the grpc_server_max_recv_msg_size and grpc_server_max_send_msg_size that we are setting on the server to 100MB (104857600). However, retrieving these traces (by ID, e.g. from Grafana) causes really weird behavior, as follows:
In scenario one, we leave the grpc_client_config of the frontend worker at the default values => we get an error as follows: failed to get trace with id: ea233ba0fabfdf58 Status: 500 Internal Server Error Body: response larger than the max (45940141 vs 16777216)
In scenario two, we modify the grpc_client_config of the frontend worker, raising the limits to 32MB (33554432). In that case we get a similar error message: failed to get trace with id: ea233ba0fabfdf58 Status: 500 Internal Server Error Body: response larger than the max (45940141 vs 33554432)
In scenario three, we go further and attempt to increase the max_recv_msg_size and max_send_msg_size of the grpc_client_config of the frontend worker to 64MB (67108864), and in that case the Query Frontend dies on us with an OOM error. Nevertheless, the memory usage looks fairly fine, and we granted the container roughly 5 Gigs of RAM anyway.
To Reproduce
Steps to reproduce the behavior: try one of the scenarios above with Tempo version 1.4.1 in microservices mode.
Expected behavior
The retrieval of the trace by ID should work fine (especially when the grpc_client_config is set to a high value).
Environment:
Additional Context
My current config for Tempo: