Open Arun-KumarH opened 3 years ago
Sometimes connections get dropped for reasons outside of the library's control. Just retry the call when that happens. The client will reconnect when you do.
In addition, you say "We are creating the Grpc client instance every time when message is processed." We recommend not doing that. You can just use one client for all of your requests. Using multiple clients isn't even accomplishing anything for you right now. They're all using the same underlying connection anyway.
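The single-client recommendation can be illustrated with a minimal sketch in plain Node.js. `makeClientProvider` and `createClient` are hypothetical names standing in for a `new YourServiceClient(address, credentials)` constructor from @grpc/grpc-js; the point is only that the stub is constructed once and reused for every request:

```javascript
// Sketch of the "one client for all requests" pattern.
// `createClient` is a hypothetical factory standing in for
// `new YourServiceClient(address, credentials)` from @grpc/grpc-js.
function makeClientProvider(createClient) {
  let client = null;
  return function getClient() {
    // Create the client once and reuse it; grpc-js clients manage
    // their own connection and reconnect on their own after a drop.
    if (client === null) {
      client = createClient();
    }
    return client;
  };
}

// Usage with a stub factory that counts instantiations:
let created = 0;
const getClient = makeClientProvider(() => {
  created += 1;
  return { id: created };
});

const a = getClient();
const b = getClient();
// a and b are the same instance; created stays at 1
```

The same effect is usually achieved by exporting a single client instance from a module, since Node.js caches module exports.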
Try to downgrade to Node version 16.8.0 and see if that helps. We tested Node version 16.9.0 and 16.9.1 with no luck. Downgrading to Node version 16.8.0 works without any problems.
@gillsoftab what exact issues did you have with recent node versions?
Node version 16.9.0 and 16.9.1 same as described in https://github.com/nodejs/node/issues/39683 we also detected issues with an app that is using template (node add-on in c++).
I am having the same issue. I am in the process of downgrading my Docker containers to use Node 16.8.0 as suggested. However, it is probably a good idea to have retries.
Could anyone let me know if there is a way to do this inside the @grpc/grpc-js config options? Looking at the gRPC .NET library, I can see there is a MaxAttempts = 5 param. Does anyone know if there is something similar in the @grpc/grpc-js library?
All I can see in this library is "grpc.initial_reconnect_backoff_ms" under channelOptions, so my assumption would be that it does retries by default; however, it does not seem like this is the case.
All I can see in this library is "grpc.initial_reconnect_backoff_ms" under channelOptions, so my assumption would be that it does retries by default; however, it does not seem like this is the case.
Connections are separate things from requests. That option controls how the channel re-establishes connections after they are dropped. The @grpc/grpc-js library does not currently have any built-in support for retrying requests automatically, so you will need to do that yourself. For example, you could use an interceptor, as described in the examples section of the interceptor spec.
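As an alternative to a full interceptor, unary calls can also be retried with a plain wrapper function. This is an illustrative sketch, not grpc-js API: `makeCall` is assumed to be any promise-returning function whose rejection carries a numeric `code` property, with 14 being the standard value of UNAVAILABLE:

```javascript
// Sketch: retry a promise-returning unary call on UNAVAILABLE (code 14),
// with exponential backoff. `makeCall` is any function that performs the
// gRPC request and rejects with an error carrying a numeric `code`.
const UNAVAILABLE = 14;

async function callWithRetry(makeCall, maxAttempts = 5, baseDelayMs = 100) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await makeCall();
    } catch (err) {
      lastError = err;
      if (err.code !== UNAVAILABLE) {
        throw err; // only retry transient connection errors
      }
      // Exponential backoff: 100 ms, 200 ms, 400 ms, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```

Note that blindly retrying is only safe for idempotent requests; a real interceptor-based solution can also honor per-call deadlines, which this sketch ignores.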
Thanks for the info and the link. Just an FYI to the rest of the thread: downgrading Node to 16.8.0 did not resolve the ECONNRESET ([{"message":"14 UNAVAILABLE: read ECONNRESET"}]) issue for me. It is still happening sporadically. What I am finding odd is that when developing locally, I never get this error (I develop locally with Node 16.8.0). When I deploy my services with Docker and Docker Swarm is when I get the issue. I have 9 services and I only get this issue with 2 of them; I have yet to see it happen with any of the others.
I am using @grpc/grpc-js: 1.3.7.
I am using Node 16.8.0 and I have downgraded to
"@grpc/grpc-js": "1.2.1",
"@grpc/proto-loader": "0.5.1",
and now the ECONNRESET ([{"message":"14 UNAVAILABLE: read ECONNRESET"}]) seems to have disappeared for me.
One of the 1.3.x releases changed how ECONNRESET errors were reported. In 1.2.x, they had the error code INTERNAL instead of UNAVAILABLE. Are you getting any INTERNAL errors with your new setup?
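For anyone writing their own retry logic around this, the practical consequence is that a retry predicate matching only one of these codes behaves differently across grpc-js versions. A small sketch (the numeric values are the standard gRPC status codes):

```javascript
// gRPC status codes (numeric values from the gRPC status code spec).
const INTERNAL = 13;
const UNAVAILABLE = 14;

// Treat both codes as retryable, since grpc-js reported ECONNRESET
// as INTERNAL in 1.2.x and as UNAVAILABLE from the 1.3.x releases on.
function isRetryableStatus(code) {
  return code === INTERNAL || code === UNAVAILABLE;
}
```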
Ah damn, yes I am; it has just happened sporadically again. Do you think it could be an issue with NestJS (the framework I use)? It uses grpc-js under the hood. https://docs.nestjs.com/microservices/grpc
I managed to implement a retry interceptor as shown HERE. However, it hangs. The if statement condition will resolve to true and retry will be called, but then it will just hang: no response from onReceiveMessage or onReceiveStatus.
if (response.code !== status.OK) {
console.log('CONNECTION FAILED: retry... ' + retries);
retry(savedSendMessage, savedMetadata);
} else {
console.log('ALL GOOD');
savedMessageNext(savedReceiveMessage);
next(response);
}
At this point, there seems to be no quick-fix solution. Not sure if it is a library issue or a much deeper issue.
ECONNRESET is a relatively low-level network error. It's unlikely that it's caused by the framework.
You might want to try this version of the retry interceptor, which was actually tested. Just remove the code that references the registry, because that's there for the test.
That makes sense; I'll give it a try. Thanks for all the help, appreciate it; I've been pulling my hair out.
Hey, so I have been poking around a lot today, and while I do not have an answer, I have some findings.
I use docker-compose to bring up all the microservices inside Docker containers. When I run these containers on my local machine, all the microservices work fine (I never get [{"message":"14 UNAVAILABLE: read ECONNRESET"}]).
However, when running them on my development network (an AWS EC2 instance with all the containers on the same VM), I do sporadically get the [{"message":"14 UNAVAILABLE: read ECONNRESET"}] error. It seems that I get the error randomly; then when I make a call again it will work multiple times. Then if I wait some time, I will get the error again.
I enabled grpc-js debugging on the development network and noticed this error in the debug log: Node error event: message=read ECONNRESET code=ECONNRESET errno=Unknown system error -104 syscall=read
This led to me ssh'ing into the Docker containers on my development network and using this command-line tool to try to connect to one microservice from another: https://github.com/fullstorydev/grpcurl
I found that if I send a message using -plaintext, it successfully connects to the microservice every single time. If I use TLS, I get "first record does not look like a TLS handshake" (which is an error from the command-line tool).
Just out of curiosity, do we know if grpc-js has some sort of fallback for if TLS fails?
That's not a thing, in general. If you instruct the client to connect over TLS, it will connect over TLS, or not at all.
If the problem is that the response you're getting is not a TLS handshake, are you sure that you are connecting to a TLS server? The port that is serving the plaintext service can't also be serving TLS; are you sure you are connecting to the correct TLS server port when trying to make a request using TLS?
I am unsure; the instantiation of the grpc-js client happens under the hood in the nest-js framework. I'll dig into the code and take a look.
On another note, I think this issue may possibly be caused by docker swarm; at this point I am just going through a process of elimination. I have booted a new EC2 instance and installed all of my containers with just docker-compose, and the error has disappeared. I am now in the process of booting up another instance, installing with docker-compose, and then putting a load balancer in front of the 2 instances (this is my current setup, but without Swarm) to see if the issue comes back.
The questions I asked are not even necessarily about grpc-js. You said:
This led to me ssh'ing into the Docker containers on my development network and using this command-line tool to try to connect to one microservice from another: https://github.com/fullstorydev/grpcurl
I found that if I send a message using -plaintext, it successfully connects to the microservice every single time. If I use TLS, I get "first record does not look like a TLS handshake" (which is an error from the command-line tool).
What ports did you connect to for these two tests? On the server you connected to, what port is serving in plaintext and what port is serving in TLS?
Hey @Jaypov, I just want to know if you were able to fix it with that change in docker.
Thanks!
We also face the same issue, sporadically getting the "message":"14 UNAVAILABLE: read ECONNRESET" error.
Some observations. Client env: Node.js version 14.16.1, @grpc/grpc-js: 1.3.8 and @grpc/proto-loader: 0.5.6; in some clients: Node.js version 14.16.1, @grpc/grpc-js: 1.4.4 and @grpc/proto-loader: 0.5.6.
Server env: Node.js version 14.16.1, @grpc/grpc-js: 1.4.4 and @grpc/proto-loader: 0.5.6.
These experience the "message":"14 UNAVAILABLE: read ECONNRESET" issue sporadically.
But in one microservice client (this client uses nestjs): Node.js version 14.16.1, @grpc/proto-loader@0.5.6, grpc@1.24.6.
Server: Node.js version 14.16.1, @grpc/grpc-js: 1.4.4 and @grpc/proto-loader: 0.5.6.
We have never once seen read ECONNRESET in that service's logs.
We migrated all Node.js services to @grpc/grpc-js: 1.2.4 and @grpc/proto-loader: 0.5.6, but still face this issue.
To new commenters, see my first response on this issue:
Sometimes connections get dropped for reasons outside of the library's control. Just retry the call when that happens. The client will reconnect when you do.
We have the same problem.
@murgatroid99 Do you have a plan to implement an interceptor (or something similar) in upcoming grpc-js releases?
The client interceptor API has been available in grpc-js since before the 1.0 release.
Hello, I opened a new issue for read ECONNRESET: #1994
We followed this version of the interceptor.
Still no answer?
Still no answer? We are also having this issue; it's quite frustrating because we can't replicate it.
I'm running into the same issue with streaming requests, but I have additional info that might help. I'm specifically using a retry interceptor which closely matches the implementation of the one from the tests.
I see the following traceback for streaming requests:
Trace: Request path: /object_attributes.object_attributes_grpc/ListAssortmentObjectAttributes; response status code: 0; not eligible for retry.
at Object.onReceiveStatus (/workspace/atlas/common/lib/client/interceptors.ts:135:41)
at InterceptingListenerImpl.onReceiveStatus (/workspace/atlas/node_modules/@grpc/grpc-js/src/call-stream.ts:206:19)
at Object.onReceiveStatus (/workspace/atlas/node_modules/@grpc/grpc-js/src/client-interceptors.ts:424:48)
at /workspace/atlas/node_modules/@grpc/grpc-js/src/call-stream.ts:334:24
at processTicksAndRejections (node:internal/process/task_queues:78:11)
savedMessageNext: undefined
TypeError: savedMessageNext is not a function
at Object.onReceiveStatus (/workspace/atlas/common/lib/client/interceptors.ts:141:33)
at InterceptingListenerImpl.onReceiveStatus (/workspace/atlas/node_modules/@grpc/grpc-js/src/call-stream.ts:206:19)
at Object.onReceiveStatus (/workspace/atlas/node_modules/@grpc/grpc-js/src/client-interceptors.ts:424:48)
at /workspace/atlas/node_modules/@grpc/grpc-js/src/call-stream.ts:334:24
at processTicksAndRejections (node:internal/process/task_queues:78:11)
[ERROR] 10:16:28 TypeError: savedMessageNext is not a function
Looking at InterceptingListenerImpl, it seems that a next method is not passed to nextListener, which would result in savedMessageNext being undefined.
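Until that is addressed upstream, one defensive workaround in a custom interceptor is to guard the saved callback before invoking it. This is just an illustrative sketch of the guard, reusing the variable names from the interceptor snippet earlier in this thread; it is not a verified fix for grpc-js internals:

```javascript
// Sketch of a defensive guard for the situation described above:
// only invoke the saved `next` callback when the interceptor actually
// received one, instead of crashing with "is not a function".
function deliverSavedMessage(savedMessageNext, savedReceiveMessage, fallback) {
  if (typeof savedMessageNext === 'function') {
    savedMessageNext(savedReceiveMessage);
  } else if (typeof fallback === 'function') {
    // e.g. pass the message straight to the next stage of the listener
    fallback(savedReceiveMessage);
  }
}
```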
I had to link to the 1.7 branch above. I was surprised to see call-stream.ts is not on master anymore, and then I noticed the work on the retry implementation by @murgatroid99. I would gladly try this out on our stack once it is ready. Thanks for the great work!
This works for us:
const channelOptions: ChannelOptions = {
  // Send keepalive pings every 10 seconds, default is 2 hours.
  'grpc.keepalive_time_ms': 10 * 1000,
  // Keepalive ping timeout after 5 seconds, default is 20 seconds.
  'grpc.keepalive_timeout_ms': 5 * 1000,
  // Allow keepalive pings when there are no gRPC calls.
  'grpc.keepalive_permit_without_calls': 1,
};
✌️
This is my output when I try to install ESP32 library by Espressif systems. PLEASE HELP!
Downloading packages
esp32:xtensa-esp32-elf-gcc@esp-2021r2-patch5-8.4.0
esp32:xtensa-esp32s2-elf-gcc@esp-2021r2-patch5-8.4.0
esp32:xtensa-esp32s3-elf-gcc@esp-2021r2-patch5-8.4.0
esp32:riscv32-esp-elf-gcc@esp-2021r2-patch5-8.4.0
esp32:openocd-esp32@v0.11.0-esp32-20221026
esp32:esptool_py@4.5.1
esp32:mkspiffs@0.2.3
esp32:mklittlefs@3.0.0-gnu12-dc7f933
esp32:esp32@2.0.7
Installing esp32:xtensa-esp32-elf-gcc@esp-2021r2-patch5-8.4.0
Configuring tool. esp32:xtensa-esp32-elf-gcc@esp-2021r2-patch5-8.4.0 installed
Installing esp32:xtensa-esp32s2-elf-gcc@esp-2021r2-patch5-8.4.0
Configuring tool. esp32:xtensa-esp32s2-elf-gcc@esp-2021r2-patch5-8.4.0 installed
Installing esp32:xtensa-esp32s3-elf-gcc@esp-2021r2-patch5-8.4.0
Configuring tool. esp32:xtensa-esp32s3-elf-gcc@esp-2021r2-patch5-8.4.0 installed
Installing esp32:riscv32-esp-elf-gcc@esp-2021r2-patch5-8.4.0
Configuring tool. esp32:riscv32-esp-elf-gcc@esp-2021r2-patch5-8.4.0 installed
Installing esp32:openocd-esp32@v0.11.0-esp32-20221026
Configuring tool. esp32:openocd-esp32@v0.11.0-esp32-20221026 installed
Installing esp32:esptool_py@4.5.1
Configuring tool. esp32:esptool_py@4.5.1 installed
Installing esp32:mkspiffs@0.2.3
Configuring tool. esp32:mkspiffs@0.2.3 installed
Installing esp32:mklittlefs@3.0.0-gnu12-dc7f933
Configuring tool. esp32:mklittlefs@3.0.0-gnu12-dc7f933 installed
Installing platform esp32:esp32@2.0.7
Failed to install platform: esp32:esp32.
Error: 14 UNAVAILABLE: read ECONNRESET
I set the url parameter to localhost:3000 and it works for me:
export const grpcClientOptions: ClientOptions = {
  transport: Transport.GRPC,
  options: {
    package: 'hero',
    protoPath: join(__dirname, './hero/hero.proto'),
    url: 'localhost:3000',
  },
};
This works for us:
const channelOptions: ChannelOptions = {
  // Send keepalive pings every 10 seconds, default is 2 hours.
  'grpc.keepalive_time_ms': 10 * 1000,
  // Keepalive ping timeout after 5 seconds, default is 20 seconds.
  'grpc.keepalive_timeout_ms': 5 * 1000,
  // Allow keepalive pings when there are no gRPC calls.
  'grpc.keepalive_permit_without_calls': 1,
};
✌️
Applying this configuration changed the error message I receive to Error: 14 UNAVAILABLE: Connection dropped, but didn't resolve it. I'm getting it on a client which is streaming values from the server, after approx. 25 min. The client is implemented with grpc-js; the server is implemented in Go.
Client: "@grpc/grpc-js": "^1.10.1", "@grpc/proto-loader": "^0.7.10", Node.js version: 20.5.0
Server: go 1.21.1, google.golang.org/grpc v1.58.1, google.golang.org/protobuf v1.31.0
UPDATE: after setting the server keepalive policy and parameters, everything seems to work fine:
kaPolicy := grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
    MinTime:             10 * time.Second,
    PermitWithoutStream: true, // Allow ping without stream
})
kaParams := grpc.KeepaliveParams(keepalive.ServerParameters{
    Time: 10 * time.Second, // The default value is 2 hours.
    // After having pinged for keepalive check, the server waits for a duration
    // of Timeout and if no activity is seen even after that, the connection is
    // closed.
    Timeout: 5 * time.Second, // The default value is 20 seconds.
})
UPDATE2: nope, didn't help: Error: 14 UNAVAILABLE: Connection dropped
2024-07-17T14:07:34.294Z 38bfa3f1-4ebf-5d89-b784-fce09cdf9fa5 INFO error Error: 14 UNAVAILABLE: read ECONNRESET
at callErrorFromStatus (/var/task/node_modules/@grpc/grpc-js/build/src/call.js:31:19)
at Object.onReceiveStatus (/var/task/node_modules/@grpc/grpc-js/build/src/client.js:193:76)
at Object.onReceiveStatus (/var/task/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:360:141)
at Object.onReceiveStatus (/var/task/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:323:181)
at /var/task/node_modules/@grpc/grpc-js/build/src/resolving-call.js:129:78
at process.processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
at ServiceClientImpl.makeUnaryRequest (/var/task/node_modules/@grpc/grpc-js/build/src/client.js:161:32)
at ServiceClientImpl.<anonymous> (/var/task/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)
at Observable._subscribe (/var/task/node_modules/@nestjs/microservices/client/client-grpc.js:195:48)
at Observable._trySubscribe (/var/task/node_modules/rxjs/dist/cjs/internal/Observable.js:41:25)
at /var/task/node_modules/rxjs/dist/cjs/internal/Observable.js:35:31
at Object.errorContext (/var/task/node_modules/rxjs/dist/cjs/internal/util/errorContext.js:22:9)
at Observable.subscribe (/var/task/node_modules/rxjs/dist/cjs/internal/Observable.js:26:24)
at /var/task/src/main.js:1:39406
at new Promise (<anonymous>)
at lastValueFrom (/var/task/src/main.js:1:39367) {
code: 14,
details: 'read ECONNRESET',
metadata: Metadata { internalRepr: Map(0) {}, options: {} }
}
Still seeing this issue intermittently.
Hi everyone,
I created a discussion on the google-cloud-node project because I am facing a similar issue with Google Cloud Tasks and App Engine in NestJS.
Can anyone help me resolve it?
Grpc client instance
Hey, I also encountered similar issues when using this library with the Nest framework. Here's my specific version:
{
"grpc": "Version 1.24.11"
}
I also deployed a service on Kubernetes with a single Pod, testing against the same grpc address, and some of the interfaces occasionally produced the following error:
Error: 14 UNAVAILABLE: failed to connect to all addresses
at Object.exports.createStatusError (/app/node_modules/grpc/src/common.js:91:15)
at Object.onReceiveStatus (/app/node_modules/grpc/src/client_interceptors.js:1209:28)
at InterceptingListener._callNext (/app/node_modules/grpc/src/client_interceptors.js:568:42)
at InterceptingListener.onReceiveStatus (/app/node_modules/grpc/src/client_interceptors.js:618:8)
at InterceptingListener._callNext (/app/node_modules/grpc/src/client_interceptors.js:574:12)
at InterceptingListener.onReceiveStatus (/app/node_modules/grpc/src/client_interceptors.js:618:8)
at /app/node_modules/@whale/nacos/lib/interceptor/grpc.interceptors/retry-grpc.interceptor.js:50:33
at runMicrotasks (<anonymous>)
at processTicksAndRejections (internal/process/task_queues.js:95:5)
There is no interaction with the server visible in the Jaeger tracing, so I'm not sure if this issue is caused by the Grpc client instance. If we increase the number of Pods, this situation is effectively alleviated, but this may not be a healthy approach.
The grpc package is deprecated. You should switch to @grpc/grpc-js.
This is a known optimization issue for us, but since we are using @nestjs/microservices@7.6.18, which internally bundles the grpc package, changing all the related dependencies is like trying to turn a battleship around. So we hope to find the root cause of the problem; for now, the technical team's only solution is to increase the number of Pods.
Problem description
We experience intermittent connection reset errors after migrating to grpc-js version 1.3.2. We are creating the Grpc client instance every time a message is processed. On the client side we see the stack trace
ECONNRESET [{"message":"14 UNAVAILABLE: read ECONNRESET"}]
and no log messages on the server side, although GRPC_TRACE=all and GRPC_VERBOSITY=DEBUG are set on the server side as well. We have attached the debug logs seen on the client side. This issue is seen in our production k8s environment and it is intermittent.
Environment
@grpc/grpc-js: 1.3.2 and @grpc/proto-loader: 0.6.2
Additional context
Client Logging: