googleapis / nodejs-datastore

Node.js client for Google Cloud Datastore: a highly-scalable NoSQL database for your web and mobile applications.
https://cloud.google.com/datastore/
Apache License 2.0

Total timeout of API google.datastore.v1.Datastore exceeded 60000 milliseconds #1176

Open jrabelo-colmeia opened 11 months ago

jrabelo-colmeia commented 11 months ago

Hello, we are getting this error from nodejs-datastore-sdk in our clusters in production from time to time:

Error: Total timeout of API google.datastore.v1.Datastore exceeded 60000 milliseconds before any response was received.

We have been seeing this error for more than a year and no one seems to know what is going on. Can you at least enlighten us on what might be happening?

1) Is this a client library issue or a product issue? It looks like a bug in nodejs-datastore-sdk.

2) Has anyone already solved this?

Environment details

Steps to reproduce

  1. Unfortunately it is not reproducible; it happens at completely random times on our VM instances on google-cloud-platform.

sofisl commented 11 months ago

@danieljbruce, not sure if this is related to b/303109029 and b/303728081

jrabelo-colmeia commented 11 months ago

Hello, here is a call stack that may help you troubleshoot the problem: [image]

nlbi21 commented 10 months ago

Hi guys, the same thing is happening to me, it's very random. It started happening today and we haven't made any changes.

  Error: Total timeout of API google.datastore.v1.Datastore exceeded 60000 milliseconds before any response was received.

  at .repeat ( /node_modules/google-gax/build/src/normalCalls/retries.js:66 )
  at .Timeout._onTimeout ( /node_modules/google-gax/build/src/normalCalls/retries.js:101 )
  at .listOnTimeout ( node:internal/timers:569 )
  at process.processTimers ( node:internal/timers:512 )

I updated the library to see whether that would solve the problem, but it keeps happening.

@google-cloud/datastore: 8.2.1 to 8.2.2

Node version: v18.17.0
npm version: 9.6.7
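
To separate this failure from other Datastore errors in application logs, one option is to match on the message text quoted above. The following is a minimal sketch, not part of `@google-cloud/datastore` or `google-gax`: the helper name `parseTotalTimeoutError` is hypothetical, and it assumes only that the error message has the form shown in this thread.

```javascript
// Hypothetical helper, not part of any Google library.
// Assumption: the message has the form quoted in this thread, e.g.
// "Total timeout of API <api> exceeded <n> milliseconds ...".
const TOTAL_TIMEOUT_PATTERN =
  /^Total timeout of API (\S+) exceeded (\d+) milliseconds/;

// Returns {api, timeoutMs} for a total-timeout error, or null otherwise.
function parseTotalTimeoutError(err) {
  const message = err && typeof err.message === 'string' ? err.message : '';
  const match = TOTAL_TIMEOUT_PATTERN.exec(message);
  if (!match) return null;
  return {api: match[1], timeoutMs: Number(match[2])};
}
```

A caller could use the result to count these timeouts separately from other failures, which may help show how often and on which instances they occur.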

jrabelo-colmeia commented 9 months ago

Hello, are there any updates on this issue?

Our production servers simply lose their Datastore connection and EVERYTHING goes down for a couple of minutes. This is a HUGE problem. If anyone has any idea how to solve this issue, we would be very grateful.

rossjs commented 9 months ago

I opened a support ticket and was told that, while the issue isn't resolved, there is a workaround. It was confirmed to be an issue with the Datastore backend and not an issue with this library, but it looks like you can now add a fallback to the options object when creating a new Datastore instance.

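const {Datastore} = require('@google-cloud/datastore');

// Send requests over the REST fallback transport instead of gRPC.
const ds = new Datastore({
  fallback: 'rest',
});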

Reference PR

Unfortunately, I also need to use the @google-cloud/connect-datastore library and that is currently locked to @google-cloud/datastore v7 so I'm unable to upgrade at this time. I'm currently using our Cloud SQL database as a fallback workaround.
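
The idea of falling back to a second data store can be sketched as follows. This is only an illustration under stated assumptions: `withDeadline` and `readWithFallback` are hypothetical helpers, and `primary` and `secondary` stand in for application functions (for example a Datastore read and a Cloud SQL read) that are not shown in this thread.

```javascript
// Hypothetical helpers; not part of @google-cloud/datastore.

// Rejects if `promise` has not settled within `ms` milliseconds.
// Note: the underlying operation is not cancelled, only abandoned.
function withDeadline(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Deadline of ${ms} ms exceeded`)),
      ms
    );
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// `primary` and `secondary` are placeholder functions returning promises.
async function readWithFallback(primary, secondary, deadlineMs) {
  try {
    return await withDeadline(primary(), deadlineMs);
  } catch (err) {
    // The primary read failed or was too slow: use the secondary store.
    return secondary(err);
  }
}
```

Choosing a deadline well below 60000 ms lets the application switch to the secondary store long before the client's total timeout fires. This only suits data that is actually available in both stores.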

looker-colmeia commented 9 months ago

Thanks for your help, @rossjs. We are going to try this fallback parameter.

patriciatrauman commented 9 months ago

Hi,

I tried adding fallback: 'rest' to my Datastore object instantiation and got this error:

FetchError: Invalid response body while trying to fetch https://datastore.googleapis.com/v1/projects/[...PROJECT ID...]:lookup?$alt=json%3Benum-encoding=int: read ECONNRESET
  at Gunzip.<anonymous> ([...PROJECT LOCATION...]/node_modules/google-gax/node_modules/node-fetch/lib/index.js:400:12)
  at Gunzip.emit (node:events:525:35)
  at emitErrorNT (node:internal/streams/destroy:151:8)
  at emitErrorCloseNT (node:internal/streams/destroy:116:3)
  at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  type: 'system',
  errno: 'ECONNRESET',
  code: 'ECONNRESET',
  note: 'Exception occurred in retry method that was not classified as transient'
}
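
Since the note in this error says the exception was not classified as transient, the library did not retry it. An application-level retry for idempotent reads could look roughly like the sketch below. `retryOnConnReset` is a hypothetical helper, not part of `google-gax` or `@google-cloud/datastore`, and the attempt count and delays are arbitrary assumptions.

```javascript
// Hypothetical helper; not part of google-gax or @google-cloud/datastore.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Runs `operation` and retries it only when it fails with ECONNRESET.
// Intended for idempotent calls such as lookups; retrying writes blindly
// may apply them more than once.
async function retryOnConnReset(
  operation,
  {attempts = 3, baseDelayMs = 100} = {}
) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Any other error is rethrown immediately.
      if (err?.code !== 'ECONNRESET') throw err;
      lastError = err;
      if (attempt < attempts - 1) {
        // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

For example, a lookup could be wrapped as `retryOnConnReset(() => datastore.get(key))`, where `datastore` and `key` are the application's own objects.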

looker-colmeia commented 9 months ago

@patriciatrauman are you using version 8.3.0 of the datastore library? @danieljbruce is this PR https://github.com/googleapis/nodejs-datastore/pull/1203/files supposed to solve this issue?

patriciatrauman commented 9 months ago

@looker-colmeia, here are the packages I use:

Screenshot 2023-12-20 at 15 34 54

And I tried to implement it like this:

Screenshot 2023-12-20 at 15 36 38

I also tried the values true, false, and proto, and none of them worked :(

danieljbruce commented 8 months ago

@looker-colmeia The PR you mentioned is the workaround as @rossjs mentioned.

@patriciatrauman The code snippet below using 'rest' works just fine for me. Could you provide us with a reproducible code example?

const {Datastore} = require('@google-cloud/datastore');

async function printResults() {
  const datastore = new Datastore({
    fallback: 'rest'
  });
  const kind = "key";
  const taskKey = datastore.key([kind, 1]);
  const newTask = {
    key: taskKey,
    data: {
      value: 999,
    },
  };
  await datastore.save(newTask, {});
  const [entity] = await datastore.get(taskKey);
  const returnedKey = entity[Datastore.KEY];
  console.log(returnedKey);
}

printResults();

danieljbruce commented 7 months ago

Closing this issue since I have not heard back, but feel free to open this issue again if it persists.

looker-colmeia commented 4 months ago

Hello, any news on this issue? The REST fallback is very slow.

danieljbruce commented 2 months ago

Lowering priority to P3 since the issue is now just limited to REST fallback.

looker-colmeia commented 2 months ago

"Lowering priority to P3 since the issue is now just limited to REST fallback."

I'd like to remind the senior developers of this library, and also the program managers, that THIS IS NOT A P3 PROBLEM.

REST fallback is SLOW, and we are moving away from Datastore to ScyllaDB because of the timeout/reset problems.

We are also thinking about leaving the entire google-cloud-platform, so I hope you guys realize how bad this problem is for your customers

klaa97 commented 1 month ago

"Lowering priority to P3 since the issue is now just limited to REST fallback."

Could you expand on this message? We are seeing this in production, in two completely different environments.

REST fallback is just not an option for high-load production environments, it's extremely slow - so the issue is not "limited" to REST fallback - I think it should not even be relevant in this discussion to talk about REST fallback 🤔 .

The issue is clearly in the Datastore client - (I am already in contact with GCP support for this and hopefully our requests will reach library team). If you want to reproduce it, I am positive about the fact that it would be enough to:

looker-colmeia commented 1 month ago

We are also using google-cloud/spanner and google-cloud/big-table and have no RST problems at all. Maybe reading the spanner and big-table codebases would help to solve this problem.