mcaruso85 opened 8 months ago
The reason is this rate limiter, 50121. This is silly!
@bbit8 do you know if there is any way to prevent this problem? I'm trying to understand what this limit is and why the client suddenly starts to send a lot of reset streams. I mean, if that limit is increased, it will not solve the problem, it will just allow more reset streams to be sent. Is this correct?
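If I'm reading it right, this is the RST_STREAM rate limit that Node added as part of the HTTP/2 Rapid Reset mitigation. As a rough sketch only (assuming the limit is the one exposed through the streamResetBurst / streamResetRate http2 options, and using what I understand to be their default values), this is what the knob looks like on a plain Node http2 server; grpc-js creates its http2 sessions internally, so it's not obvious these options can even be passed through a channel:

```js
// Sketch only: the http2 options that configure Node's RST_STREAM rate limit.
// The values below are assumed defaults; both must be set to take effect.
const http2 = require('node:http2');

const server = http2.createServer({
  streamResetBurst: 1000, // burst of incoming RST_STREAM frames tolerated
  streamResetRate: 33,    // sustained RST_STREAM frames per second after the burst
});

server.on('stream', (stream) => {
  stream.respond({ ':status': 200 });
  stream.end('ok');
});

server.listen(8081);
```

Raising these values would only move the threshold, which is why I doubt increasing the limit actually solves anything.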
Adding more info that might be helpful: I start to see all the RST_STREAMs after this error, "codec_error:The_user_callback_function_failed", appears in the Envoy sidecar that runs alongside the client container.
2024-01-27T07:18:14.825341714Z stdout F {"bytes_sent":0,"response_flags":"DPE","user_agent":"grpc-node-js/1.9.14","upstream_service_time":null,"protocol":"HTTP/2","authority":"grpc-server.mcaruso-poc.svc.cluster.local:8081","duration":5240,"upstream_host":"10.244.1.35:8081","x_envoy_external_address":null,"method":"POST","downstream_peer_uri_san":null,"bytes_received":5,"x_forwarded_for":null,"dd.trace_id":null,"downstream_local_address":"10.96.13.98:8081","start_time":"2024-01-27T07:18:09.577Z","upstream_local_address":"10.244.1.36:56188","requested_server_name":null,"connection_termination_details":null,"response_code_details":"codec_error:The_user_callback_function_failed","upstream_cluster":"outbound|8081||grpc-server.mcaruso-poc.svc.cluster.local","downstream_remote_address":"10.244.1.36:34518","upstream_transport_failure_reason":null,"path":"/rpc.bookshop.v1.InventoryService/GetBookList","response_code":0,"request_id":"23cf4d24-9caa-4d24-b3ca-a136b81ae46d","route_name":"default","x_forwarded_proto":"http","x_forwarded_client_cert":null}
@mcaruso85 I just downgraded the Node.js version from 18.18.2 to 18.18.1. Problem solved!
@bbit8 that didn't work for me, I'm still facing the same problem.
Version
21.6.0
Platform
Docker Image: node:21.6.0-bullseye
Subsystem
Istio 1.19.4
What steps will reproduce the bug?
I have a basic gRPC JS client and a gRPC Python server. The client runs 60 iterations; in each iteration it fires 300 gRPC requests in parallel at the server. The server processing is pretty simple: it sleeps 5 seconds and responds with success. Both the gRPC client and the gRPC server are in the same namespace of the same EKS cluster (version 1.25). Currently both the client and the server have an Istio sidecar, version 1.19.5.
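A minimal sketch of the client loop, assuming a bookshop.proto that defines rpc.bookshop.v1.InventoryService with a unary GetBookList method (the names are taken from the Envoy access log above; the target address, message shapes and file paths are illustrative):

```js
// Sketch of the repro client: 60 iterations of 300 parallel unary calls.
// Assumes bookshop.proto defines rpc.bookshop.v1.InventoryService/GetBookList
// and that the server sleeps ~5s before answering; names are illustrative.
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

const packageDefinition = protoLoader.loadSync('bookshop.proto');
const proto = grpc.loadPackageDefinition(packageDefinition);

const client = new proto.rpc.bookshop.v1.InventoryService(
  'grpc-server.mcaruso-poc.svc.cluster.local:8081',
  grpc.credentials.createInsecure()
);

// Wrap the callback-style unary call in a promise.
function getBookList() {
  return new Promise((resolve, reject) => {
    client.GetBookList({}, (err, response) => (err ? reject(err) : resolve(response)));
  });
}

async function main() {
  for (let i = 1; i <= 60; i++) {
    // Fire 300 requests in parallel and wait for all of them.
    const calls = Array.from({ length: 300 }, () => getBookList());
    await Promise.all(calls);
    console.log(`iteration ${i} completed`);
  }
}

main().catch((err) => {
  console.error('iteration failed:', err);
  process.exit(1);
});
```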
How often does it reproduce? Is there a required condition?
This always happens when the application runs inside a pod with an Envoy sidecar (Istio 1.19.5).
What is the expected behavior? Why is that the expected behavior?
The application should finish all 60 iterations, fire all the requests, and always get an OK response.
What do you see instead?
The script does not reach 60 iterations. It starts completing iterations with all responses successful, but at approximately iteration 9 it fails with the following error:
Additional information
The timeline is the following. First I see a lot of requests like this on the Envoy client side, where they are OK:
Then, at some point after a lot of successful requests like the one above, I see the GOAWAY sent by the Envoy client side:
And after that I see a lot of requests on the Envoy client side with the codec error, like these:
And after that, the client app receives the GOAWAY and closes the connection (here are the app container logs):
The server logs don't show anything about a GOAWAY; this all happens inside the client pod, between Envoy and the app container.
When I search for this codec_error: The_user_callback_function_failed, I get this:
And I see the client Node app erroring in the callback function: