grpc / grpc-go

The Go language implementation of gRPC. HTTP/2 based RPC
https://grpc.io
Apache License 2.0

gRPC Server Sends RST_STREAM without trailers when TCP Reassembly occurs #7623

Open DerekTBrown opened 1 week ago

DerekTBrown commented 1 week ago

What version of gRPC are you using?

v1.59.0

What version of Go are you using (go version)?

1.22.0

What operating system (Linux, Windows, …) and version?

alpine 3.18 AMD64

Error description

I have a grpc-js client that communicates, via an Istio service mesh, with a service implemented using grpc-go. The client observes an error:

error_message_string: 13 INTERNAL: Received RST_STREAM with code 0 (Call ended without gRPC status)

Upon inspection, I see that the gRPC server is sending RST_STREAM mid-response:

[Image: Screenshot 2024-09-11 at 2 05 47 PM]

I would expect to see gRPC server send a HEADERS message containing trailers indicating the cause of the failure.
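To make the expectation above concrete, here is a minimal sketch (not grpc-go code) of how one might classify the end of a stream from captured server-to-client HTTP/2 frames. It assumes the bytes are already decrypted and reassembled, and it treats a HEADERS frame with END_STREAM set as the trailers. The names `parseFrames`, `describeStreamEnd`, and `rawFrame` are hypothetical helpers made up for this illustration; only the frame layout, type codes, and flag values come from the HTTP/2 specification.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// HTTP/2 frame types and flags used below (RFC 9113, section 6).
const (
	frameData      = 0x0
	frameHeaders   = 0x1
	frameRSTStream = 0x3
	flagEndStream  = 0x1
	flagEndHeaders = 0x4
)

// frame is a minimal decoded HTTP/2 frame: header fields plus raw payload.
type frame struct {
	Type, Flags byte
	StreamID    uint32
	Payload     []byte
}

// parseFrames splits raw bytes into frames using the 9-byte frame header:
// 24-bit length, 8-bit type, 8-bit flags, 1 reserved bit, 31-bit stream ID.
func parseFrames(b []byte) ([]frame, error) {
	var out []frame
	for len(b) > 0 {
		if len(b) < 9 {
			return nil, fmt.Errorf("truncated frame header: %d bytes", len(b))
		}
		length := int(b[0])<<16 | int(b[1])<<8 | int(b[2])
		if len(b) < 9+length {
			return nil, fmt.Errorf("truncated payload: want %d bytes, have %d", length, len(b)-9)
		}
		out = append(out, frame{
			Type:     b[3],
			Flags:    b[4],
			StreamID: binary.BigEndian.Uint32(b[5:9]) & 0x7fffffff,
			Payload:  b[9 : 9+length],
		})
		b = b[9+length:]
	}
	return out, nil
}

// describeStreamEnd reports how the given stream ended, assuming frames
// holds only server-to-client frames in wire order.
func describeStreamEnd(frames []frame, streamID uint32) string {
	sawTrailers := false
	for _, f := range frames {
		if f.StreamID != streamID {
			continue
		}
		switch {
		case f.Type == frameHeaders && f.Flags&flagEndStream != 0:
			sawTrailers = true
		case f.Type == frameRSTStream && len(f.Payload) == 4:
			code := binary.BigEndian.Uint32(f.Payload)
			if sawTrailers {
				return fmt.Sprintf("trailers, then RST_STREAM code %d", code)
			}
			return fmt.Sprintf("RST_STREAM code %d without trailers", code)
		}
	}
	if sawTrailers {
		return "trailers (END_STREAM)"
	}
	return "stream still open"
}

// rawFrame encodes one frame; it exists only to build sample input.
func rawFrame(typ, flags byte, streamID uint32, payload []byte) []byte {
	n := len(payload)
	hdr := []byte{byte(n >> 16), byte(n >> 8), byte(n), typ, flags, 0, 0, 0, 0}
	binary.BigEndian.PutUint32(hdr[5:], streamID)
	return append(hdr, payload...)
}

func main() {
	// Synthetic frames resembling the report: response HEADERS, some DATA,
	// then RST_STREAM with code 0 and no trailing HEADERS frame.
	var capture []byte
	capture = append(capture, rawFrame(frameHeaders, flagEndHeaders, 1, []byte{0x88})...)
	capture = append(capture, rawFrame(frameData, 0, 1, []byte("partial"))...)
	capture = append(capture, rawFrame(frameRSTStream, 0, 1, []byte{0, 0, 0, 0})...)

	frames, err := parseFrames(capture)
	if err != nil {
		panic(err)
	}
	fmt.Println(describeStreamEnd(frames, 1))
}
```

For the sequence in the report, the sketch prints `RST_STREAM code 0 without trailers`. The sequence I would expect instead is a HEADERS frame with END_STREAM, optionally followed by RST_STREAM.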

A few additional anecdotes:

arjan-bal commented 1 day ago

Since the RST_STREAM error code is 0 (NO_ERROR), it looks like the server successfully responded to the request. It seems strange that the gRPC Go server didn't write the HEADERS frame with the trailer metadata. This is the code that handles writing the trailers before sending RST_STREAM with code 0: https://github.com/grpc/grpc-go/blob/cf1fb0a6e81e30f7130bfb20fd40d7929c5d3363/internal/transport/http2_server.go#L1076-L1086

I did find some issues related to the http handler transport:

@DerekTBrown are you using the HTTP handler transport (via ServeHTTP) or the regular one (via Server.Serve)?

DerekTBrown commented 1 day ago

@arjan-bal Thanks for the quick reply. This is using Server.Serve.

arjan-bal commented 22 hours ago

There is another code path in which an RST_STREAM with code 0 is sent: when the server receives an RST_STREAM from the client first. This doesn't seem to be the case here.

I can't figure out a reason why gRPC Go would not send the trailer just by analysing the code.

@DerekTBrown can you please provide the gRPC Go server logs from the time this issue happens? See the docs for enabling logs. If you can provide a way to repro this issue, it would be even better.
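For reference, grpc-go's verbose logging is normally switched on with the `GRPC_GO_LOG_SEVERITY_LEVEL` and `GRPC_GO_LOG_VERBOSITY_LEVEL` environment variables, which have to be set before the server process starts. Below is a rough sketch of a launcher that sets them. The `./server` path and the `grpcDebugEnv` helper are placeholders for illustration, not part of grpc-go.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// grpcDebugEnv returns a copy of base with grpc-go's logging variables set
// to verbose values, dropping any existing entries for the same keys.
func grpcDebugEnv(base []string) []string {
	overrides := map[string]string{
		"GRPC_GO_LOG_SEVERITY_LEVEL":  "info",
		"GRPC_GO_LOG_VERBOSITY_LEVEL": "99",
	}
	out := make([]string, 0, len(base)+len(overrides))
	for _, kv := range base {
		key, _, _ := strings.Cut(kv, "=")
		if _, replaced := overrides[key]; !replaced {
			out = append(out, kv)
		}
	}
	// Append in a fixed order so the result is deterministic.
	for _, k := range []string{"GRPC_GO_LOG_SEVERITY_LEVEL", "GRPC_GO_LOG_VERBOSITY_LEVEL"} {
		out = append(out, k+"="+overrides[k])
	}
	return out
}

func main() {
	// "./server" is a placeholder for the grpc-go server binary.
	cmd := exec.Command("./server")
	cmd.Env = grpcDebugEnv(os.Environ())
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr

	// The command is only printed here; calling cmd.Run() would start the
	// server with verbose gRPC logging written to its standard streams.
	fmt.Println("would start:", cmd.String())
}
```

Setting the same two variables in the container spec or shell that starts the server has the same effect; the launcher is only one way to do it.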