Open Hexcles opened 11 months ago
Some debug logging:
{"timestamp":"[ 1269.373381s]","level":"DEBUG","fields":{"message":"client connection open"},"target":"linkerd_transport_metrics::client","spans":[{"name":"inbound"},{"port":80,"name":"server"},{"name":"backend-web.default.svc.cluster.local:80","name":"http"},{"name":"profile"},{"name":"http1"}],"threadId":"ThreadId(1)"}
{"timestamp":"[ 1269.375560s]","level":"DEBUG","fields":{"state":"Some(State { classify: Grpc(Codes({2, 4, 7, 13, 14, 15})), tx: Sender { chan: Tx { inner: Chan { tx: Tx { block_tail: 0x7f1dd886c700, tail_position: 0 }, semaphore: Semaphore { semaphore: Semaphore { permits: 10000 }, bound: 10000 }, rx_waker: AtomicWaker, tx_count: 2, rx_fields: \"...\" } } } })"},"target":"linkerd_proxy_http::classify::channel","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"},{"addr":"10.100.169.20:80","name":"proxy"},{"name":"http"},{"name":"sessions-web","ns":"default","port":"80","name":"service"},{"addr":"172.17.80.145:8080","name":"endpoint"}],"threadId":"ThreadId(1)"}
{"timestamp":"[ 1269.375597s]","level":"DEBUG","fields":{"method":"POST","uri":"http://sessions-web/com.session.Sessions/WhoisByCookie","version":"HTTP/2.0"},"target":"linkerd_proxy_http::client","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"},{"addr":"10.100.169.20:80","name":"proxy"},{"name":"http"},{"name":"sessions-web","ns":"default","port":"80","name":"service"},{"addr":"172.17.80.145:8080","name":"endpoint"},{"name":"h2"}],"threadId":"ThreadId(1)"}
{"timestamp":"[ 1269.375605s]","level":"DEBUG","fields":{"headers":"{\"te\": \"trailers\", \"grpc-trace-bin\": \"\", \"grpc-accept-encoding\": \"gzip\", \"grpc-encoding\": \"gzip\", \"x-datadog-trace-id\": \"4009838577945735206\", \"x-datadog-parent-id\": \"6986014011649582376\", \"x-datadog-sampling-priority\": \"-1\", \"x-datadog-tags\": \"_dd.p.dm=-3\", \"traceparent\": \"00-000000000000000037a5cef10d1cf026-60f34ebae8466528-00\", \"tracestate\": \"dd=t.dm:-3\", \"rop\": \"803a8303e5668f0e058c2080c10c222d\", \"ropt\": \"http.handler\", \"pop\": \"803a8303e5668f0e058c2080c10c222d\", \"popt\": \"http.handler\", \"grpc-timeout\": \"29999m\", \"x-if-wsat\": \"<1KB of secrets>\", \"user-agent\": \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36\", \"content-type\": \"application/grpc\", \"accept-encoding\": \"gzip\", \"l5d-dst-canonical\": \"sessions-web.default.svc.cluster.local:80\"}"},"target":"linkerd_proxy_http::client","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"},{"addr":"10.100.169.20:80","name":"proxy"},{"name":"http"},{"name":"sessions-web","ns":"default","port":"80","name":"service"},{"addr":"172.17.80.145:8080","name":"endpoint"},{"name":"h2"}],"threadId":"ThreadId(1)"}
{"timestamp":"[ 1269.385088s]","level":"DEBUG","fields":{"message":"Remote proxy error"},"target":"linkerd_app_outbound::http::handle_proxy_error_headers","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"},{"addr":"10.100.169.20:80","name":"proxy"},{"name":"http"},{"name":"sessions-web","ns":"default","port":"80","name":"service"},{"addr":"172.17.80.145:8080","name":"endpoint"}],"threadId":"ThreadId(1)"}
thread 'main' panicked at 'if our `state` was `None`, the shared state must be `Some`', /__w/linkerd2-proxy/linkerd2-proxy/linkerd/http-retry/src/replay.rs:152:22
{"timestamp":"[ 1269.385179s]","level":"DEBUG","fields":{"message":"dropping ResponseBody"},"target":"linkerd_proxy_http::classify::channel","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"},{"addr":"10.100.169.20:80","name":"proxy"},{"name":"http"}],"threadId":"ThreadId(1)"}
{"timestamp":"[ 1269.385191s]","level":"DEBUG","fields":{"message":"sending EOS to classify"},"target":"linkerd_proxy_http::classify::channel","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"},{"addr":"10.100.169.20:80","name":"proxy"},{"name":"http"}],"threadId":"ThreadId(1)"}
{"timestamp":"[ 1269.385631s]","level":"DEBUG","fields":{"message":"The client is shutting down the connection","res":"Ok(())"},"target":"linkerd_proxy_http::server","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"},{"addr":"10.100.169.20:80","name":"proxy"},{"name":"http"}],"threadId":"ThreadId(1)"}
{"timestamp":"[ 1269.385671s]","level":"DEBUG","fields":{"message":"Connection closed"},"target":"linkerd_app_core::serve","spans":[{"name":"outbound"},{"client.addr":"172.17.75.208:58594","server.addr":"10.100.169.20:80","name":"accept"}],"threadId":"ThreadId(1)"}
@hawkw do you know how to enable RUST_BACKTRACE in linkerd-proxy?
@Hexcles were you able to create a repro for this?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
Still happening. The panic site has moved, though:
Working on a repro
OK here's my complete repro:
https://github.com/Hexcles/wire/blob/grpc-sample/samples/wire-grpc-sample/k8s.yaml
kubectl apply -f k8s.yaml
`client` pod: you'll soon see a panic (within a minute).

I notice that your proto is:
```proto
service Whiteboard {
  rpc Whiteboard (stream WhiteboardCommand) returns (stream WhiteboardUpdate) {
  }
  rpc Echo (Point) returns (Point) {
  }
}
```
Are you exercising both RPCs in this scenario?
Nope, only the `Echo`. I didn't actually test the streaming version. I added the unary call for a simpler repro.
So here's the server-side code exercised:
And client-side code:
Note that both sides use wire-grpc, not upstream grpc-java from Google. They are supposedly compatible on the wire, but apparently there's something unique with the frames produced by wire-grpc (otherwise, you'd have a lot of bug reports from grpc users already).
Thanks. This repro will be enough for us to track this down.
We're currently working on some other retry improvements (that will also address #12826). The good news is that I've tried your repro against the branch of new work. We're going to prioritize making the new functionality available on an edge release; but we'll follow up to ensure this underlying issue is eliminated.
> The good news is that I've tried your repro against the branch of new work.
Do you mean you can reproduce the panic on stable, and the WIP feature in edge no longer exhibits the panic? That's great news!
Ah, yeah. The WIP fixes the issue.
I believe it's caused by inconsistent framing emitted by wire-grpc...
A typical stream looks like:
[ http:Connection{peer=Server}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(3) }
[ http:Connection{peer=Server}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(3), flags: (0x1: END_STREAM) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(1) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(1) }
Importantly, there is a data frame with an END_STREAM flag.
On the second request, however, no such END_STREAM is set:
[ http:Connection{peer=Server}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(3), flags: (0x4: END_HEADERS) }
[ http:Connection{peer=Server}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(3) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
[ http: linkerd_proxy_http::classify::channel: dropping ResponseBody
[ http:Connection{peer=Server}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(3), flags: (0x5: END_HEADERS | END_STREAM) }
[ http:Connection{peer=Server}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(5), flags: (0x4: END_HEADERS) }
[ http:Connection{peer=Server}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(5) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}: linkerd_proxy_http::classify::channel: state=Some(State { classify: Grpc(Codes({2, 4, 7, 13, 14, 15})), tx: Sender { chan: Tx { inner: Chan { tx: Tx { block_tail: 0x7f4d96031e00, tail_position: 0 }, semaphore: Semaphore { semaphore: Semaphore { permits: 10000 }, bound: 10000 }, rx_waker: AtomicWaker, tx_count: 2, rx_fields: "..." } } } })
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint: linkerd_proxy_http::client: method=POST uri=http://server/com.squareup.wire.whiteboard.Whiteboard/Echo version=HTTP/2.0
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(3), flags: (0x4: END_HEADERS) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(3) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(3), flags: (0x4: END_HEADERS) }
[ service{ns=default name=server port=80}:pool:endpoint{addr=10.42.0.80:8080}:http.endpoint:h2:Connection{peer=Client}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(3) }
When the server responds before the request stream has completed, it appears to put the retry middleware into a bad state. But this is valid at the protocol level, and in any case we should never crash here.
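To make the panic message concrete: the assertion quoted above describes an invariant between a body handle's local state and a shared slot, where exactly one of them is expected to hold the state at any time. The following is a minimal, hypothetical sketch of that pattern — the names `State`, `ReplayBody`, and `acquire` are illustrative, not the actual `replay.rs` layout:

```rust
use std::sync::{Arc, Mutex};

// Illustrative state; the real replay body tracks buffered request data.
struct State {
    bytes_replayed: usize,
}

// Each body handle either holds the state locally or expects to find it
// in the shared slot. The invariant: exactly one place holds `Some`.
struct ReplayBody {
    local: Option<State>,
    shared: Arc<Mutex<Option<State>>>,
}

impl ReplayBody {
    fn acquire(&mut self) -> &mut State {
        if self.local.is_none() {
            // Take the state out of the shared slot. If something (e.g. an
            // early response completion) already dropped it instead of
            // returning it, this is the condition the proxy panics on.
            self.local = self.shared.lock().unwrap().take();
        }
        self.local
            .as_mut()
            .expect("if our `state` was `None`, the shared state must be `Some`")
    }
}

fn main() {
    let shared = Arc::new(Mutex::new(Some(State { bytes_replayed: 0 })));
    let mut body = ReplayBody { local: None, shared };
    body.acquire().bytes_replayed += 1;
    println!("replayed: {}", body.acquire().bytes_replayed);
}
```

An early server response that causes the state to be dropped rather than handed back to the shared slot would leave both sides `None`, tripping the `expect` above — which matches the "we should never crash here" observation.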
We'll update the issue when something is available to test on edge.
edge-24.7.5 includes support for GRPCRoute resource annotations that enable timeout and retry configurations. We'll be working on more official documentation, but I wanted to share a quick demo of how to use these new configs. I've updated the wire-grpc example manifests with a route configuration like:
```yaml
---
kind: GRPCRoute
apiVersion: gateway.networking.k8s.io/v1alpha2
metadata:
  name: whiteboard-echo
  annotations:
    retry.linkerd.io/grpc: internal
    retry.linkerd.io/limit: "2"
    retry.linkerd.io/timeout: 150ms
    timeout.linkerd.io/request: 1s
spec:
  parentRefs:
    - name: whiteboard
      kind: Service
      group: core
  rules:
    - matches:
        - method:
            type: Exact
            service: com.squareup.wire.whiteboard.Whiteboard
            method: Echo
...
```
The `retry.linkerd.io/grpc` annotation can be used to configure a list of status codes:
```yaml
metadata:
  annotations:
    retry.linkerd.io/grpc: cancelled,deadline-exceeded,internal,resource-exhausted,unavailable
```
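For orientation, these kebab-case names correspond to the standard numeric gRPC status codes, which is also how they show up in the proxy's classify debug logs earlier in this thread (e.g. `Grpc(Codes({2, 4, 7, 13, 14, 15}))`). A quick reference sketch, assuming standard gRPC status numbering:

```rust
use std::collections::BTreeMap;

// Standard gRPC status codes (per the gRPC spec), keyed by the
// kebab-case names used in the annotation values above.
fn grpc_codes() -> BTreeMap<&'static str, i32> {
    BTreeMap::from([
        ("cancelled", 1),
        ("unknown", 2),
        ("deadline-exceeded", 4),
        ("permission-denied", 7),
        ("resource-exhausted", 8),
        ("internal", 13),
        ("unavailable", 14),
        ("data-loss", 15),
    ])
}

fn main() {
    // The classified set seen in the debug logs: Codes({2, 4, 7, 13, 14, 15}).
    let classified = [2, 4, 7, 13, 14, 15];
    let named: Vec<&str> = grpc_codes()
        .into_iter()
        .filter(|(_, code)| classified.contains(code))
        .map(|(name, _)| name)
        .collect();
    println!("{named:?}");
    // → ["data-loss", "deadline-exceeded", "internal", "permission-denied", "unavailable", "unknown"]
}
```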
While the demo app doesn't actually trigger timeouts or retries, we are able to observe gRPC-status aware route metrics:
# HELP outbound_grpc_route_request_duration_seconds The time between request initialization and response completion.
# TYPE outbound_grpc_route_request_duration_seconds histogram
# UNIT outbound_grpc_route_request_duration_seconds seconds
outbound_grpc_route_request_duration_seconds_sum{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 2.269708098
outbound_grpc_route_request_duration_seconds_count{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 197
outbound_grpc_route_request_duration_seconds_bucket{le="0.05",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 197
outbound_grpc_route_request_duration_seconds_bucket{le="0.5",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 197
outbound_grpc_route_request_duration_seconds_bucket{le="1.0",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 197
outbound_grpc_route_request_duration_seconds_bucket{le="10.0",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 197
outbound_grpc_route_request_duration_seconds_bucket{le="+Inf",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 197
# HELP outbound_grpc_route_request_statuses Completed request-response streams.
# TYPE outbound_grpc_route_request_statuses counter
outbound_grpc_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",grpc_status="OK",error=""} 197
# HELP outbound_grpc_route_backend_requests The total number of requests dispatched.
# TYPE outbound_grpc_route_backend_requests counter
outbound_grpc_route_backend_requests_total{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
# HELP outbound_grpc_route_backend_response_duration_seconds The time between request completion and response completion.
# TYPE outbound_grpc_route_backend_response_duration_seconds histogram
# UNIT outbound_grpc_route_backend_response_duration_seconds seconds
outbound_grpc_route_backend_response_duration_seconds_sum{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 0.33726197
outbound_grpc_route_backend_response_duration_seconds_count{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="0.025",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="0.05",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="0.1",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="0.25",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="0.5",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="1.0",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="10.0",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
outbound_grpc_route_backend_response_duration_seconds_bucket{le="+Inf",parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name=""} 197
# HELP outbound_grpc_route_backend_response_statuses Completed responses.
# TYPE outbound_grpc_route_backend_response_statuses counter
outbound_grpc_route_backend_response_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo",backend_group="core",backend_kind="Service",backend_namespace="default",backend_name="whiteboard",backend_port="80",backend_section_name="",grpc_status="OK",error=""} 197
# HELP outbound_grpc_route_retry_limit_exceeded Retryable requests not sent due to retry limits.
# TYPE outbound_grpc_route_retry_limit_exceeded counter
outbound_grpc_route_retry_limit_exceeded_total{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 0
# HELP outbound_grpc_route_retry_overflow Retryable requests not sent due to circuit breakers.
# TYPE outbound_grpc_route_retry_overflow counter
outbound_grpc_route_retry_overflow_total{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 0
# HELP outbound_grpc_route_retry_requests Retry requests emitted.
# TYPE outbound_grpc_route_retry_requests counter
outbound_grpc_route_retry_requests_total{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 0
# HELP outbound_grpc_route_retry_successes Successful responses to retry requests.
# TYPE outbound_grpc_route_retry_successes counter
outbound_grpc_route_retry_successes_total{parent_group="core",parent_kind="Service",parent_namespace="default",parent_name="whiteboard",parent_port="80",parent_section_name="",route_group="gateway.networking.k8s.io",route_kind="GRPCRoute",route_namespace="default",route_name="whiteboard-echo"} 0
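As a usage sketch, once these metrics are scraped by Prometheus, per-route success and retry rates can be derived with queries along these lines (hypothetical queries, assuming the metric names exported above):

```promql
# Fraction of completed request-response streams with grpc_status="OK", per route
sum by (route_name) (rate(outbound_grpc_route_request_statuses_total{grpc_status="OK"}[5m]))
  /
sum by (route_name) (rate(outbound_grpc_route_request_statuses_total[5m]))

# Retry requests emitted, per route
sum by (route_name) (rate(outbound_grpc_route_retry_requests_total[5m]))
```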
I'll leave this issue open until we ensure it is fixed in the ServiceProfile router as well.
IIUC `HttpRoute` doesn't work with `ServiceProfile`. Does `GrpcRoute` not work with `ServiceProfile` as well?
Correct, this is a mutually exclusive routing interface.
Apologies for the nudge, but any plan to fix this in `ServiceProfile` soon-ish? Thanks!
@Hexcles Hey, nothing concrete yet – we're working out how to get this done.
Hello @Hexcles!
Thank you for your patience regarding a fix for this issue in `ServiceProfile`. #3216 recently fixed this bug, and I confirmed that the repro you provided above no longer panics with this patch applied.
That patch will be included in the upcoming weekly edge release. Thank you for filing this issue, and for narrowing the problem down to a concise repro; it was very helpful!
What is the issue?
We saw elevated client errors after enabling retries for some GRPC routes in our service profile. Linkerd metrics show inbound requests are a lot higher than outbound requests for this route. After looking around, we found panics in the logs of linkerd-proxy on the client side.
How can it be reproduced?
(We are trying to produce a minimal, open-source case. FWIW, we use https://square.github.io/wire/wire_grpc/ instead of the standard GRPC.)
Logs, error output, etc
output of `linkerd check -o short`
Environment
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
None