Siddhesh-Swami opened this issue 2 years ago
Any updates please?
We have a similar issue that seems to appear only when our node.js application is deployed on Kubernetes. Here is our stack:
We are getting this error so frequently that it cannot be due to sporadic connectivity issues.
Any updates? We are having that symptom as well
We have a similar issue as well:
it normally happens after more than 10 hours of idle time.
Something related to keepalive? After adding
keepalive: {
  // Send a keepalive ping every 5 minutes (ms() comes from the 'ms' package).
  keepaliveTimeMs: ms('5m'),
},
I did not get a connection reset for several weeks.
The default keepalive options might be different between grpc and @grpc/grpc-js.
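For context, a minimal sketch of where a fragment like that can sit, assuming the NestJS gRPC transport mentioned later in this thread; the package, protoPath, and url values are purely illustrative, and the exact placement of the keepalive block is an assumption about the transport version in use:

import { Transport, ClientOptions } from '@nestjs/microservices';
import ms from 'ms';

const grpcClientOptions: ClientOptions = {
  transport: Transport.GRPC,
  options: {
    package: 'example',          // hypothetical proto package
    protoPath: 'example.proto',  // hypothetical proto path
    url: 'localhost:50051',      // hypothetical server address
    keepalive: {
      // Send a keepalive ping every 5 minutes.
      keepaliveTimeMs: ms('5m'),
    },
  },
};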
Any updates?
@bangbang93 After changing the code, have you faced the issue again?
Got rid of this for several months.
@bangbang93 We have the same issue; aren't you afraid performance could suffer after setting this option? 5 minutes sounds like a lot.
This is a comment from the source code:
The amount of time to wait for an acknowledgement after sending a ping
Here is a link: https://github.com/grpc/grpc-node/blob/6764dcc79602faee5457243629da520ba08b726f/packages/grpc-js/src/subchannel.ts#L114
Nevertheless, I just applied it to our services; let's see how it plays out.
It's keepaliveTimeMs, not keepaliveTimeoutMs: https://github.com/grpc/grpc-node/blob/6764dcc79602faee5457243629da520ba08b726f/packages/grpc-js/src/subchannel.ts#L109-L112
@bangbang93 sorry I sent a wrong link. I tried both and I still get the message :(, but thx for helping
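To make the distinction in this exchange concrete, here is a small sketch mapping the two names to the @grpc/grpc-js channel arguments they correspond to; the values are illustrative, not recommendations:

import type { ChannelOptions } from '@grpc/grpc-js';

const keepaliveOptions: ChannelOptions = {
  // keepaliveTimeMs: how long to wait between keepalive pings.
  'grpc.keepalive_time_ms': 5 * 60 * 1000,
  // keepaliveTimeoutMs: how long to wait for a ping acknowledgement before
  // the connection is considered dead.
  'grpc.keepalive_timeout_ms': 20 * 1000,
};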
This works for us:
import type { ChannelOptions } from '@grpc/grpc-js';

const channelOptions: ChannelOptions = {
  // ...spread any other channel options you already have here.
  // Send keepalive pings every 10 seconds, default is 2 hours.
  'grpc.keepalive_time_ms': 10 * 1000,
  // Keepalive ping timeout after 5 seconds, default is 20 seconds.
  'grpc.keepalive_timeout_ms': 5 * 1000,
  // Allow keepalive pings when there are no gRPC calls.
  'grpc.keepalive_permit_without_calls': 1,
};
✌️
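For anyone wondering how a channelOptions object like the one above is actually used, here is a minimal sketch assuming a client generated at runtime with @grpc/proto-loader; the proto path, package, and service names are hypothetical:

import * as grpc from '@grpc/grpc-js';
import * as protoLoader from '@grpc/proto-loader';

// Channel options go into the third argument of the generated client constructor.
const packageDefinition = protoLoader.loadSync('example.proto'); // hypothetical proto
const loaded = grpc.loadPackageDefinition(packageDefinition) as any;

const client = new loaded.example.ExampleService(
  'localhost:50051',                  // hypothetical server address
  grpc.credentials.createInsecure(),
  channelOptions,
);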
Thank you @HofmannZ . Is that fix reliable for you or just makes the problem less evident?
Hey @logidelic,
We ended up with the following config for the client:
// See: https://grpc.github.io/grpc/cpp/md_doc_keepalive.html
import type { ChannelOptions } from '@grpc/grpc-js';

const channelOptions: ChannelOptions = {
  // ...spread any other channel options you already have here.
  // Send keepalive pings every 6 minutes, default is none.
  // Must be more than GRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MS on the server (5 minutes).
  'grpc.keepalive_time_ms': 6 * 60 * 1000,
  // Keepalive ping timeout after 5 seconds, default is 20 seconds.
  'grpc.keepalive_timeout_ms': 5 * 1000,
  // Allow keepalive pings when there are no gRPC calls.
  'grpc.keepalive_permit_without_calls': 1,
};
And the following config for the server:
// See: https://grpc.github.io/grpc/cpp/md_doc_keepalive.html
const channelOptions: ChannelOptions = {
  // ...spread any other channel options you already have here.
  // Send keepalive pings every 10 seconds, default is 2 hours.
  'grpc.keepalive_time_ms': 10 * 1000,
  // Keepalive ping timeout after 5 seconds, default is 20 seconds.
  'grpc.keepalive_timeout_ms': 5 * 1000,
  // Allow keepalive pings when there are no gRPC calls.
  'grpc.keepalive_permit_without_calls': 1,
};
We've been running it in production for a couple of months, and it works reliably.
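On the server side, a minimal sketch of how options like these are passed to the @grpc/grpc-js Server constructor; the bind address and credentials are illustrative:

import * as grpc from '@grpc/grpc-js';

// Server-side keepalive settings from the config above.
const server = new grpc.Server({
  'grpc.keepalive_time_ms': 10 * 1000,
  'grpc.keepalive_timeout_ms': 5 * 1000,
  'grpc.keepalive_permit_without_calls': 1,
});

// Register services here, then bind.
server.bindAsync(
  '0.0.0.0:50051',
  grpc.ServerCredentials.createInsecure(),
  (err, port) => {
    if (err) throw err;
    console.log(`gRPC server listening on port ${port}`);
    // On older @grpc/grpc-js versions you would also call server.start() here.
  },
);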
Description: we were using the @grpc/grpc-js package in a Kubernetes cluster with the Alpine image; recently we got the chance to test in production. Sporadically we observe read ECONNRESET on the client side with no logs on the server side. We switched to an older version of @grpc/grpc-js (1.2.4), but the error was still observed.
In one of the microservices we used the grpc package with NestJS, and that service never produced read ECONNRESET, so we migrated all the microservices to the grpc@1.24.6 package and now we no longer see the read ECONNRESET error. The client takes a fairly long time to connect to the server, around 2-3 seconds, but no read ECONNRESET error is observed.
Environment:
Please let me know if any more details would help.