googleapis / nodejs-firestore

Node.js client for Google Cloud Firestore: a NoSQL document database built for automatic scaling, high performance, and ease of application development.
https://cloud.google.com/firestore/
Apache License 2.0

RST_STREAM error keeps showing up #1023

Open jakeleventhal opened 4 years ago

jakeleventhal commented 4 years ago

Environment details

Steps to reproduce

This error keeps appearing over and over in my logs (not regularly reproducible):

Error: 13 INTERNAL: Received RST_STREAM with code 2
at Object.callErrorFromStatus (/api/node_modules/@grpc/grpc-js/src/call.ts:81)
at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client.ts:324)
at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client-interceptors.ts:439)
at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client-interceptors.ts:402)
at Http2CallStream.outputStatus (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:228)
at Http2CallStream.maybeOutputStatus (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:278)
at Http2CallStream.endCall (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:262)
at ClientHttp2Stream.<anonymous> (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:532)
at ClientHttp2Stream.emit (events.js:315)
at ClientHttp2Stream.EventEmitter.emit (domain.js:485)
at emitErrorCloseNT (internal/streams/destroy.js:76)
at processTicksAndRejections (internal/process/task_queues.js:84)
ajitzero commented 4 years ago

I got this error today with an older Angular project for which I had created a new Firebase project.

I fixed it by adding measurementId to the config, which wasn't a requirement in earlier Firebase projects.

Hope this helps!

ianmubangizi commented 4 years ago

I only seem to get the log "@firebase/firestore: Firestore (7.16.0): Connection GRPC stream error. Code: 13 Message: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)" when I run my CI tests on Node 12.x with --detectOpenHandles.

My tests pass with or without --detectOpenHandles, yet without it my CI reports: "Jest did not exit one second after the test run has completed. This usually means that there are asynchronous operations that weren't stopped in your tests. Consider running Jest with --detectOpenHandles to troubleshoot this issue."

I am sending off multiple deletes after all tests pass.

I am not sure where I might have left async calls unhandled. I have been trying to debug this by applying the flag, but in the end Firestore hangs my whole CI environment, even though the deletes were written.
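
Not a confirmed fix from the maintainers, just a minimal sketch under the assumption that the tests share one Firestore client and collect their cleanup deletes in an array (the test-helpers module and pendingDeletes below are hypothetical): waiting for the deletes and then terminating the client in afterAll releases the underlying gRPC channel so Jest can exit without --detectOpenHandles.

// hypothetical teardown in a test file: `firestore` and `pendingDeletes` come from a
// shared test helper that collects the delete promises issued after the tests
const { firestore, pendingDeletes } = require("./test-helpers");

afterAll(async () => {
  // make sure the cleanup deletes have actually been sent and acknowledged
  await Promise.all(pendingDeletes);
  // terminate() tears down the client's gRPC channel so no open handles keep the process alive
  await firestore.terminate();
});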

merlinnot commented 4 years ago

@schmidt-sebastian Are you aware of any backend changes that might have been deployed yesterday after 16:00 UTC? I haven't had any errors for close to 24h now (across all of my services), which is quite surprising. Maybe I'm just lucky, but previously these errors were much more frequent.

schmidt-sebastian commented 4 years ago

@merlinnot I checked with the backend team and there was no rollout in Europe at the time that you mentioned. A rollout did finish the Friday before.

merlinnot commented 4 years ago

It's back. It seems to happen irregularly: yesterday all of the errors (~50) occurred between 12:00 and 16:00 UTC; outside that window there was not a single occurrence.

@schmidt-sebastian You mentioned previously that you'd need to reproduce it within a fairly short amount of time. I presume you don't store network logs for long, which is understandable. Since I don't know a way to trigger this behavior, but it happens quite regularly in my project, maybe we could set up some alerting so that you receive a notification whenever it happens? I was thinking of automatically sending an email.

(screenshot attached: Screen Shot 2020-07-23 at 08:39)
schmidt-sebastian commented 4 years ago

We have to enable something similar to Cloud Trace just before this error occurs. I will see if I can enable this from a customer environment. I will get back to you tomorrow.

schmidt-sebastian commented 4 years ago

@merlinnot Looking at the tracing some more, it doesn't seem possible to enable the level of tracing externally that we are currently looking for. Cloud Trace may provide some insights, but it might not be worth your time to set it up as we cannot

We are currently asking the team that supports us in the development of the library to come up with a reproduction for this issue. While simply sending a lot of requests worked for me with the old @grpc/grpc-js version, I am no longer reliably able to force this issue to appear. If you have a hunch of a workload that may increase the likelihood of such an error, that might help us. What I am currently trying is to artificially create contention, but my client runs for hours without any problems.

merlinnot commented 4 years ago

I'm not sure if there's anything specific that triggers it. I checked and in my projects, in the last 30 days, 42 separate Cloud Functions, a service on raw VMs (MIG) and a few services on Cloud Run are affected. They run entirely different workflows.

Also, I'm not sure if it's related to contention issues. I've seen the error when, for example, listening to a single document for changes, or when issuing a simple set on one document at most every 15 seconds. See the sketch below.
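
For reference, a hedged sketch of the two low-volume workloads described above that still surface the error; the document paths and interval are hypothetical:

const { Firestore } = require("@google-cloud/firestore");
const firestore = new Firestore();

// listening to a single document for changes
const unsubscribe = firestore.doc("configs/main").onSnapshot(
  (snapshot) => console.log("updated at", snapshot.updateTime),
  (err) => console.error("listener error", err) // RST_STREAM / code 13 shows up here
);

// a simple set on one document at most every 15 seconds
setInterval(() => {
  firestore
    .doc("heartbeats/service-a")
    .set({ lastSeen: new Date() })
    .catch((err) => console.error("set error", err));
}, 15 * 1000);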

leandroz commented 4 years ago

Hi all!

I get this error quite often, but without any specific steps to reproduce. Maybe this can help:

(screenshot attached: RST_STREAM)

dooleyb1 commented 4 years ago

Also experiencing this error when trying to write documents of quite a large size (just under the 1 MB limit imposed by Firestore).

Error: 14 UNAVAILABLE: Stream refused by server
    at Object.callErrorFromStatus (/Users/brandon/github/bounceinsights/firestore-crons/node_modules/@grpc/grpc-js/build/src/call.js:30:26)
    at Http2CallStream.<anonymous> (/Users/brandon/github/bounceinsights/firestore-crons/node_modules/@grpc/grpc-js/build/src/client.js:96:33)
    at Http2CallStream.emit (events.js:215:7)
    at /Users/brandon/github/bounceinsights/firestore-crons/node_modules/@grpc/grpc-js/build/src/call-stream.js:75:22
    at processTicksAndRejections (internal/process/task_queues.js:75:11) {
  code: 14,
  details: 'Stream refused by server',
  metadata: Metadata { internalRepr: Map {}, options: {} }
}
matteobucci commented 4 years ago

Any news on this? I'm getting the same error in the production environment, both when using Cloud Functions and when using Firestore directly.

schmidt-sebastian commented 4 years ago

We are still trying to come up with a faster way to force these errors, which would allow us to trace the TCP packets through our systems. Unfortunately, this is proving to be difficult. These errors are essentially "Internal Server Errors" and could originate in many different parts of our system, all of which handle such a significant amount of traffic that we need a very narrow time interval to make sure that we can look at all available data.

Sorry for this non-update.

ururk commented 3 years ago

I have a cron job that fails with this error every night (it runs once, and 99% of the time it fails). I've been running it manually the next day (via Cloud Scheduler), and it never fails when I run it manually.

schmidt-sebastian commented 3 years ago

@ururk What environment do you run it in? Is there anything that is different in a manual run?

ururk commented 3 years ago

@schmidt-sebastian Google Cloud functions (via Firebase), Node 10. To invoke it manually I open Cloud Scheduler, and click "Run Now". A day or two ago I deleted my package-lock file, updated all dependencies, and uploaded a new version (no code changes), and it still has this error when run on a schedule. I put a ticket in with Google Cloud support - they suggested sticking with GitHub to resolve.

leandroz commented 3 years ago

I will add additional info: "Exception occurred in retry method that was not classified as transient" is new.

For me this error brings inconsistencies all over the place, because I have counters that depend on the hooks working correctly and not failing as often as they do now.

{ Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
    at Object.callErrorFromStatus (/workspace/node_modules/@google-cloud/firestore/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/workspace/node_modules/@google-cloud/firestore/node_modules/@grpc/grpc-js/build/src/client.js:176:52)
    at Object.onReceiveStatus (/workspace/node_modules/@google-cloud/firestore/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342:141)
    at Object.onReceiveStatus (/workspace/node_modules/@google-cloud/firestore/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305:181)
    at process.nextTick (/workspace/node_modules/@google-cloud/firestore/node_modules/@grpc/grpc-js/build/src/call-stream.js:124:78)
    at process._tickCallback (internal/process/next_tick.js:61:11)
Caused by: Error
    at WriteBatch.commit (/workspace/node_modules/@google-cloud/firestore/build/src/write-batch.js:419:23)
    at DocumentReference.update (/workspace/node_modules/@google-cloud/firestore/build/src/reference.js:381:14)
    at onTicketMessageCreateHandler (/workspace/lib/handlers/on-ticket-message-create.js:40:90)
    at cloudFunction (/workspace/node_modules/firebase-functions/lib/cloud-functions.js:134:23)
    at Promise.resolve.then (/layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/invoker.js:198:28)
    at process._tickCallback (internal/process/next_tick.js:68:7)
  code: 13,
  details: 'Received RST_STREAM with code 2 (Internal server error)',
  metadata: Metadata { internalRepr: Map {}, options: {} },
  note:
   'Exception occurred in retry method that was not classified as transient' }
godprogrammer3 commented 3 years ago

I have the same issue. I run firestore.onSnapshot with the Node.js Firebase client library, and the onSnapshot callback stops working after about 2 hours of running. I run on a Raspberry Pi 4 and run the Node app with pm2.

jhcordeiro commented 3 years ago

Using Cloud Functions with Node 10, "firebase-admin": "^9.3.0", "firebase-functions": "^3.11.0".

Error: 2020-10-30 13:09:54.340 EDT <Function Name> 6trbr5cj3lf7 Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
    at Object.callErrorFromStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client.js:327:49)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305:181)
    at process.nextTick (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:124:78)
    at process._tickCallback (internal/process/next_tick.js:61:11)

This error is affecting all Google Cloud Function calls today after 9:10 AM PST.

Please, is there any progress in fixing this issue?

matsupa commented 3 years ago

I still have this issue today.

Environment: Node 10, firebase-admin 9.2.0, firebase-functions 3.11.0.

It happened in an auth onCreate function.

Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
at Object.callErrorFromStatus (/workspace/node_modules/@grpc/grpc-js/src/call.ts:81)
at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/src/client.ts:334)
at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/src/client-interceptors.ts:434)
at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/src/client-interceptors.ts:397)
at Http2CallStream.outputStatus (/workspace/node_modules/@grpc/grpc-js/src/call-stream.ts:230)
at Http2CallStream.maybeOutputStatus (/workspace/node_modules/@grpc/grpc-js/src/call-stream.ts:280)
at Http2CallStream.endCall (/workspace/node_modules/@grpc/grpc-js/src/call-stream.ts:263)
at ClientHttp2Stream.stream.on (call-stream.ts:561)
at ClientHttp2Stream.emit (events.js:198)
at ClientHttp2Stream.EventEmitter.emit (domain.js:466)
at emitCloseNT (internal/streams/destroy.js:68)
at emitErrorAndCloseNT (internal/streams/destroy.js:60)
at process._tickCallback (next_tick.js:63)
cchardeau commented 3 years ago

Hello.

I also have this issue with Firestore when I run a query.

Here are the logs with GRPC_TRACE=all and GRPC_VERBOSITY=DEBUG:

2020-11-20T12:02:41.686Z | connectivity_state | dns:firestore.googleapis.com:443 CONNECTING -> READY
2020-11-20T12:02:41.687Z | subchannel | 172.217.22.138:443 refcount 2 -> 3
2020-11-20T12:02:41.687Z | subchannel | 2a00:1450:4007:815::200a:443 refcount 2 -> 1
2020-11-20T12:02:41.688Z | subchannel | 172.217.22.138:443 refcount 3 -> 2
2020-11-20T12:02:41.689Z | subchannel | Starting stream with headers
        x-goog-api-client: gax/2.9.2 gapic/4.7.1 gl-node/14.15.1 grpc/1.1.8 gccl/4.7.1
        google-cloud-resource-prefix: projects/test/databases/(default)
        x-goog-request-params: parent=projects%2Ftest%2Fdatabases%2F(default)%2Fdocuments
        authorization: Bearer <hidden>
        grpc-timeout: 299834m
        grpc-accept-encoding: identity,deflate,gzip
        accept-encoding: identity,gzip
        :authority: firestore.googleapis.com:443
        user-agent: grpc-node-js/1.1.8
        content-type: application/grpc
        :method: POST
        :path: /google.firestore.v1.Firestore/RunQuery
        te: trailers
2020-11-20T12:02:41.690Z | call_stream | [14] attachHttp2Stream from subchannel 172.217.22.138:443
2020-11-20T12:02:41.690Z | subchannel | 172.217.22.138:443 callRefcount 0 -> 1
2020-11-20T12:02:41.691Z | call_stream | [14] sending data chunk of length 76 (deferred)
2020-11-20T12:02:41.692Z | call_stream | [14] calling end() on HTTP/2 stream
2020-11-20T12:02:41.771Z | call_stream | [14] HTTP/2 stream closed with code 2
2020-11-20T12:02:41.772Z | call_stream | [14] ended with status: code=13 details="Received RST_STREAM with code 2 (Internal server error)"
2020-11-20T12:02:41.772Z | subchannel | 172.217.22.138:443 callRefcount 1 -> 0
[Nest] 40   - 11/20/2020, 12:02:41 PM   [ExceptionsHandler] 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error) +53278ms
Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
    at Object.callErrorFromStatus (/usr/app/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/usr/app/node_modules/@grpc/grpc-js/build/src/client.js:327:49)
    at Object.onReceiveStatus (/usr/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305:181)
    at /usr/app/node_modules/@grpc/grpc-js/build/src/call-stream.js:124:78
    at processTicksAndRejections (internal/process/task_queues.js:75:11)
Caused by: Error
    at Query._get (/usr/app/node_modules/@google-cloud/firestore/build/src/reference.js:1466:23)
    at Query.get (/usr/app/node_modules/@google-cloud/firestore/build/src/reference.js:1455:21)
matthew-petrie commented 3 years ago

We seem to receive 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error) errors sporadically. It seems to happen for an update(), but the vast majority of these updates complete without issue. We run at most 4 concurrent operations against Firestore across all collections/documents, so these are very low request volumes.

Example Code:

const { Firestore } = require("@google-cloud/firestore");
const firestore = new Firestore();

const func_used_many_times = async ({ my_var, date }) => {
  const firestore_doc_ref = firestore.collection(`collection_name`).doc("123");

  const res = await firestore_doc_ref.update({  // error occurs here
    [`key${my_var}`]: date
  });

  return res;
};

Error Stacktrace:

Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
    at Object.callErrorFromStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/client.js:176:52)
    at Object.onReceiveStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342:141)
    at Object.onReceiveStatus (/usr/src/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305:181)
    at /usr/src/app/node_modules/@grpc/grpc-js/build/src/call-stream.js:124:78
    at processTicksAndRejections (internal/process/task_queues.js:75:11)
Caused by: Error
    at WriteBatch.commit (/usr/src/app/node_modules/@google-cloud/firestore/build/src/write-batch.js:426:23)
    at DocumentReference.update (/usr/src/app/node_modules/@google-cloud/firestore/build/src/reference.js:381:14)
    at [REDACTED our internal file structure]
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at async [REDACTED our internal file structure]

Error Note:

note: "Exception occurred in retry method that was not classified as transient"

GRPC Logging: We have full GRPC logging enabled in GCP Logging but it would be best if we could privately share this with the relevant people.

Environment details:
Compute: Google Cloud Run
Node.js: v14 (we also received this error in v12)
@google-cloud/firestore: v4.7.1
@grpc/grpc-js: package-lock file has a mix of v1.1.1 & v1.1.7

We've had this issue in several services; however, this is the only one we have managed to pin down with a decent stacktrace.

Note to self - logging id .........PrA6npdhG82Ht2ha7yq85.C

schmidt-sebastian commented 3 years ago

Hey - sorry for the radio silence. I just wanted to assure everyone that we are still treating this with the priority it deserves.

schmidt-sebastian commented 3 years ago

FYI https://github.com/googleapis/nodejs-firestore/pull/1373 does not fix this issue. It addresses the fact that once the SDK receives RST_STREAM, the next couple of operations will also see this error. With #1373, this will be less likely, as we establish a new GRPC connection to the backend once we see the first RST_STREAM.

We also have a suspicion that there is a lingering problem in @grpc/grpc-js and hope to address it soon.

eli-stewart commented 3 years ago

I am seeing this as well. I am running a TypeScript React app with a Node.js server. The server throws this error fairly regularly, something like once an hour. I also get this similar error, but only about 20% as frequently: @firebase/firestore: Firestore (7.19.1): Connection GRPC stream error. Code: 14 Message: 14 UNAVAILABLE: Stream refused by server

I had a crash on my site on Monday where I was not able to read or write the database, and I am trying to figure out if it's related.

These are the three errors of this type that I am receiving: @firebase/firestore: Firestore (7.19.1): Connection GRPC stream error. Code: 13 Message: 13 INTERNAL: Received RST_STREAM with code 2

@firebase/firestore: Firestore (7.19.1): Connection GRPC stream error. Code: 14 Message: 14 UNAVAILABLE: Stream refused by server

@firebase/firestore: Firestore (7.19.1): Connection GRPC stream error. Code: 13 Message: 13 INTERNAL: Received RST_STREAM with code 0

penguinsource commented 3 years ago

Using Datastore and Node, I am seeing this issue regularly as well, almost on a daily basis.

(screenshot attached)

kaushiksahoo2000 commented 3 years ago

I am seeing this as well with firebase-admin: 9.4.2. I am running a simple Apollo Node.js server locally, and am unable to interact with my database whatsoever due to this error: Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)

No writes or reads are working. Is there any workaround for this at all? It's preventing me from deploying any usable code.

I'm using the following versions: "@google-cloud/functions-framework": "^1.7.1", "firebase-admin": "^9.4.2"

and I'm using Node v12.9.0.

Also, all my reads and writes work perfectly fine with the Firebase emulator.

liorschwimmer commented 3 years ago

I also have this problem: Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error). Node 12, @google-cloud/firestore 4.8.0, code running on Cloud Run.

await ...
  .then(async () => {
    await firestore.collection(<COLLECTION_NAME>)
      .doc(<DOC_ID>)
      .collection(<SUBCOLLECTION_NAME>)
      .add(OBJECT)
      .catch(async err => { ... });
  })
  .catch(async err => {
    // Falls here with Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
  });

(screenshot attached)

schmidt-sebastian commented 3 years ago

We have updated @grpc/grpc-js to include more error details in the latest version. You should be able to get v1.2.3 of grpc-js by simply re-installing your dependencies (without package locks).
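
One way to confirm the upgrade landed, as a small sketch (not an official API) that assumes the package's package.json is still requireable, which it is for the 1.x releases discussed here:

// print the @grpc/grpc-js version that was actually resolved after reinstalling dependencies
const { version } = require("@grpc/grpc-js/package.json");
console.log(`@grpc/grpc-js ${version}`); // expect 1.2.3 or later for the extra error details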

emnnipal commented 3 years ago

Hello, I also experienced this error.

I also tried retrying the operation once I encountered the RST_STREAM error, but it still failed. Please see the screenshot for reference.

Error Message: 13 INTERNAL: Received RST_STREAM with code 2 triggered by internal client error: read ECONNRESET

(screenshot attached)

nodayoshi commented 3 years ago

In my Firestore bulk-insert program I got the same error. It seems that the local and remote environment settings were incorrect. When I commented things out as below, the error disappeared (running in a Docker Node container).

if (args.deploy && args.deploy === "dev") {
  console.log('to Deploy !')
  const serviceAccount = require('/key.json')
  admin.initializeApp({
    credential: admin.credential.cert(serviceAccount),
    // databaseURL: 'https://product_id-0001.firebaseio.com'
  });

  var db = admin.firestore()
  // db.settings({
  //   host: 'product_id.firebaseio.com',
  //   ssl: true
  // });
} else {
  console.log('to local !')
  let targetUrl = "0.0.0.0:8080"
  admin.initializeApp({ databaseURL: targetUrl });
  var db = admin.firestore()
  db.settings({ host: targetUrl, ssl: false });
}

firebase-admin@9.4.2 node@14.5.0 npm@6.14.5

simondotm commented 3 years ago

Adding another voice to this thread. We're seeing this error happen increasingly frequently. Similar to @matthew-petrie above, it is sporadic for us and does not reliably recur in the same place, although some functions seem to experience it far more than others. We cannot yet see a pattern for what (if anything) in the code might trigger the exception. We have fairly low-volume traffic. Anecdotally, a common theme appears to be an exception at a WriteBatch.commit() operation, even if the SDK call was an update or add.

Callstack example from earlier today:

Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
    at Object.callErrorFromStatus (/workspace/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client.js:176:52)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342:141)
    at Object.onReceiveStatus (/workspace/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305:181)
    at process.nextTick (/workspace/node_modules/@grpc/grpc-js/build/src/call-stream.js:124:78)
    at process._tickCallback (internal/process/next_tick.js:61:11)
Caused by: Error
    at WriteBatch.commit (/workspace/node_modules/@google-cloud/firestore/build/src/write-batch.js:426:23)
    at DocumentReference.update (/workspace/node_modules/@google-cloud/firestore/build/src/reference.js:381:14)
    at exports.default.functions.region.firestore.document.onCreate (/workspace/lib/db/metrics/onCreate.fn.js:37:24)
    at cloudFunction (/workspace/node_modules/firebase-functions/lib/cloud-functions.js:134:23)
    at Promise.resolve.then (/layers/google.nodejs.functions-framework/functions-framework/node_modules/@google-cloud/functions-framework/build/src/invoker.js:199:28)
    at process._tickCallback (internal/process/next_tick.js:68:7)

Console screenshot from half an hour ago also shows a slew of recent errors.

(screenshot attached)

This might be a long shot and unrelated, but I noted today that the few functions I have that do seem to see this issue regularly contain code that runs an update call back to the same document they were triggered on via onCreate. It's OK to do this as far as I know, but I'm just throwing that out there.


export default functions
    .region('europe-west2') // London
    .firestore.document(triggerPath)
    .onCreate( async (snapshot: FirebaseFirestore.DocumentSnapshot, context: functions.EventContext) => {

        new FunctionSession();

        // this trigger adds a "createdAt" field to the document upon creation.
        // We can't do this when the document is actually created because we are using "set" rather than "add",
        // so we don't know when the document gets created on the first write
        const metric = snapshot.data() as metricsmodel.IMetricsEvent;
        const updates:Partial<metricsmodel.IMetricsEvent> = {
            createdAt: metric.updatedAt ? metric.updatedAt : getTimeStamp()
        }
        await snapshot.ref.update(updates);

        try {
            await MetricsProcessorService.handleMetricOnWrite(metric);
        }
        catch (err)
        {
            console.error("dbMetricsOnCreate() Exception: " + err);
        }

       await FunctionSession.endSessionAsync();

})

Setup: Firebase, Cloud Functions, Firestore.

    "firebase-admin": "^9.4.2",
    "firebase-functions": "^3.13.0"

Node 10.

willemjanvankranenburg commented 3 years ago

Hi,

We also experience this issue frequently in our customers' environments.

Is anything known about how to fix this? This issue has been open since April 16th, 2020.

(screenshot attached)

fernandolguevara commented 3 years ago

(screenshot attached)

same here

assafey commented 3 years ago

Same here:

Node.js: v12.16, "@google-cloud/firestore": "4.9.4"

Error: 14 UNAVAILABLE: Stream refused by server
    at Object.callErrorFromStatus (/app/node_modules/@grpc/grpc-js/src/call.ts:81:24)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/src/client.ts:334:36)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/src/client-interceptors.ts:434:34)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/src/client-interceptors.ts:397:48)
    at Http2CallStream.outputStatus (/app/node_modules/@grpc/grpc-js/src/call-stream.ts:230:22)
    at Http2CallStream.maybeOutputStatus (/app/node_modules/@grpc/grpc-js/src/call-stream.ts:280:14)
    at Http2CallStream.endCall (/app/node_modules/@grpc/grpc-js/src/call-stream.ts:263:12)
    at ClientHttp2Stream.<anonymous> (/app/node_modules/@grpc/grpc-js/src/call-stream.ts:552:14)
    at ClientHttp2Stream.emit (events.js:310:20)
    at ClientHttp2Stream.EventEmitter.emit (domain.js:482:12)
    at emitCloseNT (internal/streams/destroy.js:69:8)
    at emitErrorAndCloseNT (internal/streams/destroy.js:61:3)
    at processTicksAndRejections (internal/process/task_queues.js:84:21)

schmidt-sebastian commented 3 years ago

UNAVAILABLE is an error we should already retry. If you see UNAVAILABLE it likely means that all retries failed.

Regarding RST_STREAM, we found at least one issue that we are addressing. This requires a backend rollout and release qualification, so this issue will likely persist for a while. There might also be other causes. Please keep in mind that RST_STREAM really means "Internal Server Error" and that it is very unlikely that we will reliably get this number down to 0. That does not stop us from trying though!
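
As a stopgap, a hedged sketch of the kind of application-level retry some commenters fall back on; the retried codes (13 INTERNAL, 14 UNAVAILABLE), attempt count, and backoff values are assumptions, not the SDK's built-in policy (which, as noted above, already retries UNAVAILABLE):

// hypothetical helper: retry an operation a few times when it fails with a gRPC
// status this thread treats as transient (13 INTERNAL, 14 UNAVAILABLE)
async function withRetry(operation, attempts = 3) {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const transient = err.code === 13 || err.code === 14;
      if (!transient || attempt === attempts - 1) throw err;
      // exponential backoff: 200 ms, 400 ms, 800 ms, ...
      await new Promise((resolve) => setTimeout(resolve, 200 * 2 ** attempt));
    }
  }
}

// usage: wrap the write that intermittently fails with RST_STREAM
// await withRetry(() => docRef.update({ status: "done" }));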

jakeleventhal commented 3 years ago

@schmidt-sebastian I am getting a lot of UNAVAILABLE errors lately.

schmidt-sebastian commented 3 years ago

What RPCs are you seeing this for?

jakeleventhal commented 3 years ago

@schmidt-sebastian I'm not sure. I'm just using Firestore with Node.js.

jakeleventhal commented 3 years ago

@schmidt-sebastian

Error: 14 UNAVAILABLE: Connection dropped
  at Object.callErrorFromStatus (/api/node_modules/@grpc/grpc-js/src/call.ts:81)
  at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client.ts:570)
  at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client-interceptors.ts:389)
  at (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:249)
  at processTicksAndRejections (node:internal/process/task_queues:76)
Caused by: Error
  at Firestore.getAll (/api/node_modules/@google-cloud/firestore/build/src/index.js:797)
  at DocumentReference.get (/api/node_modules/@google-cloud/firestore/build/src/reference.js:201)
samborambo305 commented 3 years ago

I'm getting the same connection dropped error as @jakeleventhal

hyst3ric41 commented 3 years ago

Same as @matthew-petrie. It appears once in a while, and when it occurs, the error throws a couple more times and then disappears definitively when I restart the Express server. Something weird: when I resend the batch operation(s) to the Firebase client library, it "fixes" itself automatically.

ppicom commented 3 years ago

I have the same problem, and redeploying does not seem to solve it. How long does it take to disappear for you, @josegpulido?

schmidt-sebastian commented 3 years ago

@jakeleventhal Based on your stacktrace, the error should be retried (https://github.com/googleapis/nodejs-firestore/blob/master/dev/src/v1/firestore_client_config.json#L77). If these errors are a new pattern, do you mind filing a short support ticket so that the backend team can track them?

jakeleventhal commented 3 years ago

@schmidt-sebastian I don't mind filling out a support ticket, but where do I do so? And there is no consistent pattern; it just seems to appear randomly with high-throughput operations and then goes away after several hours. For instance, I was trying to perform an operation that updated several thousand documents and I kept getting this over and over. I tried again a few hours later and it worked.

murgatroid99 commented 3 years ago

I just published @grpc/grpc-js with a possible fix for the errors with the specific message "13 INTERNAL: Received RST_STREAM with code 2 triggered by internal client error: read ECONNRESET". If that fix works, those errors should be reported as UNAVAILABLE instead of INTERNAL by grpc-js, which means that they will generally be retried by upper layers. If you are encountering that error, please update your dependencies to try that fix.

If the fix does not work, setting the environment variables GRPC_TRACE=call_stream and GRPC_VERBOSITY=DEBUG will include logs with more details of the underlying error triggering those messages. If you share those logs, that can help us develop a better fix for this problem.
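
A minimal sketch of wiring that up in a Node process, assuming the variables are set before @grpc/grpc-js is first loaded (exporting them in the shell that launches the process works just as well); the collection name is hypothetical:

// enable the gRPC call_stream tracer before the Firestore client (and @grpc/grpc-js) is loaded
process.env.GRPC_TRACE = "call_stream";
process.env.GRPC_VERBOSITY = "DEBUG";

const { Firestore } = require("@google-cloud/firestore");
const firestore = new Firestore();

// any request now emits call_stream trace lines (like the ones quoted earlier in this thread) on stderr
firestore
  .collection("example") // hypothetical collection
  .limit(1)
  .get()
  .catch((err) => console.error(err));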

schmidt-sebastian commented 3 years ago

@jakeleventhal - I was thinking about filing a ticket here: https://cloud.google.com/support-hub

I am mostly interested in patterns that you see (e.g. an increase of an error rate by 10x over the last couple of weeks). If it is very sporadic, then we would need a specific timestamp. Even then, it might not be possible to directly address the issue. The backend operations follow the guarantees provided in https://cloud.google.com/firestore/sla?hl=es-419, and sometimes that means that requests fail.

hyst3ric41 commented 3 years ago

@ppicom Restarting is definitely not a solution... I discovered something: in my case, I have a service that collects a lot of operations called from different places in my API, puts them all into an array, and then builds a batch with those operations to commit. There are three scenarios where the "batch service" is called:

This error only appears in the last scenario. It's like the other two scenarios are causing this issue in a way that I can't track. I know it's not enough to be useful even if I share my code, but you know, why not 🤷‍♂️. By the way, it's not like it takes 5 minutes to resolve or something; it's a little bit random.

hyst3ric41 commented 3 years ago

@ppicom I've been testing my code all day, and it seems like the error doesn't appear when my recurring event commits the batch after 5 minutes instead of 15. It's like a cold-start effect.

samborambo305 commented 3 years ago

I basically cannot launch my business because of this. Should I just rewrite everything in MongoDB?

hyst3ric41 commented 3 years ago

@bitcoinbullbullbull Probably... This and other issues have been open for several months. In my case, I had to proxy all of these Firebase operations through Cloud Functions...