Closed by xesunny 3 years ago
@xesunny is this in a cloud function? I think you should instead:
await client.createTask({parent, task});
Otherwise the Cloud Functions runtime reclaims resources once the request is served, and you may get odd behavior.
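A minimal sketch of the pattern being suggested (the helper name and the fake client used in testing are illustrative, not part of the library):

```javascript
// Sketch: always await createTask before the function returns.
// The client is passed in so this stays testable without GCP credentials;
// in a real Cloud Function you would construct a CloudTasksClient yourself.
async function enqueueAndWait(client, parent, task) {
  // Without await, the handler may return before the RPC finishes and the
  // runtime can reclaim CPU/network, producing errors like DEADLINE_EXCEEDED.
  const [response] = await client.createTask({parent, task});
  return response.name;
}
```

In an HTTP-triggered function you would `await enqueueAndWait(...)` before sending the response, so the task-creation RPC completes while the function is still alive.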
Here is my exact source code. Here is the error:
I'm using this code in (A) a Cloud Function and (B) a traditional server-side application.
I face this issue on both sides. My temporary solution is to submit my tasks sequentially. For example, if I have 10 tasks to submit, I do them one by one.
The average time to submit one task is 200ms to 500ms, which causes an unnecessary delay of ~5 seconds.
Because of this single issue, I can't use Cloud Tasks much, though I love the product.
I love tasks, but having to create tasks sequentially makes it unusable for me. I need to schedule too many tasks. I've been batching promises and waiting for them to resolve, but I still get the error. Perhaps I can tweak the batch size some, but if it becomes too slow I'll be forced to look for another solution :(
@bcoe - We're getting this error too. Is there a limit on dispatching multiple tasks concurrently?
@yossi-eynav @isaiah-coleman I'm not sure what the hard limit is on concurrently creating tasks, but I wouldn't be shocked if one exists. My recommendation would be, if it's not reasonable to enqueue them sequentially, do something like this:
const WORK_SIZE = 16; // or 32, etc.
while (work.length) {
  // Take the next chunk off the front of the work queue.
  const chunkOfWork = work.splice(0, WORK_SIZE);
  // Wait for the whole chunk to finish before starting the next one.
  await Promise.all(chunkOfWork.map((chunk) => {
    return enqueueTask(chunk);
  }));
}
☝️ this allows you to manage the number of tasks enqueued concurrently, without doing so completely sequentially.
I'm getting this error dispatching a single task.
@ChrisBeeson are you in a similar position as @xesunny describes, in which you enqueue a single task, but there may be multiple users triggering this action concurrently?
@bcoe no there are no other users
@ChrisBeeson to me this sounds like a different issue than @xesunny and others are describing, which seems to be related to concurrency limits. I think it might be worth opening a new issue, so that I can help debug (I believe something else is happening to you).
@bcoe thanks! I'm getting this error even when I changed the code to be sequential. Where can I get more information about this error?
@yossi-eynav could you please share an example of the code you are running, and let me know what environment you're running in, i.e., Cloud Functions?
@xesunny reading your original message, the error is happening on your local OSX machine?
@bcoe my code is similar to this sample - https://github.com/googleapis/nodejs-tasks/blob/master/samples/createHttpTask.js.
I'm running on GKE
I had the same problem, I think it's because httpRequest.body is expecting bytes, not a string. Try changing:
task.httpRequest.body = Buffer.from(payload).toString('base64');
to
task.httpRequest.body = Buffer.from(payload);
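For reference, a task whose body is built directly as bytes might look like the sketch below. Whether the client also accepts a base64 string depends on the serialization path, so treat this as an untested illustration; the helper name is not from the library.

```javascript
// Sketch: build an HTTP task whose body is raw bytes (a Buffer)
// rather than a base64-encoded string.
function buildHttpTask(url, payload) {
  return {
    httpRequest: {
      httpMethod: 'POST',
      headers: {'Content-Type': 'application/json'},
      url,
      // Raw bytes, not Buffer.from(...).toString('base64')
      body: Buffer.from(JSON.stringify(payload)),
    },
  };
}
```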
@bcoe
it's related to @grpc/grpc-js
{ Error: 4 DEADLINE_EXCEEDED: Deadline exceeded
at Object.callErrorFromStatus (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/call.js:30:26)
at Object.onReceiveStatus (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/client.js:174:52)
at Object.onReceiveStatus (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:340:141)
at Object.onReceiveStatus (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:303:181)
at Http2CallStream.outputStatus (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/call-stream.js:114:27)
at Http2CallStream.maybeOutputStatus (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/call-stream.js:153:22)
at Http2CallStream.endCall (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/call-stream.js:140:18)
at Http2CallStream.cancelWithStatus (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/call-stream.js:441:14)
at Timeout.DeadlineFilter.timer.setTimeout (/Volumes/dev/playground/cloud-task-test/node_modules/@grpc/grpc-js/build/src/deadline-filter.js:59:28)
at ontimeout (timers.js:436:11)
@yossi-eynav if you look, a lot of @google-cloud libraries with a @grpc/grpc-js dependency have an open issue related to concurrency. The only thing I've read that helps people seems to be upgrading your Node version to v13. However, that hasn't fixed the issue for me.
see: https://github.com/grpc/grpc-node/issues/1158 it might help
@bcoe @isaiah-coleman I've set the fallback flag to true and now I have 0 errors creating 1000 tasks concurrently. It seems that the problem is with the gax-nodejs library -> can you share the changes between the regular flow and the fallback flow, since I'm not familiar with this library at all?
@yossi-eynav I'm not sure I can help you. I'm not a contributor on the library, merely sharing what seems to have helped others. Can you explain how you used the fallback flag to fix the errors? It might be a solution I can use.
@yossi-eynav Setting fallback to true enables a different transport (the one that was initially intended for browsers) - instead of using gRPC, it serializes your requests and sends them over a regular HTTP/1 connection with node-fetch to a different endpoint. When you enable fallback, you don't make any gRPC requests at all - it uses a totally different stack. That might serve as a good workaround, but we'd like to investigate what's going on.
Would it be possible for you to measure the time - how long does it take for it to fail with DEADLINE_EXCEEDED?
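A simple way to collect the timing being asked for might look like this sketch; `timeCall` is an illustrative helper, and the call it wraps stands in for something like `() => client.createTask(request)`:

```javascript
// Sketch: time how long a createTask call takes to resolve or fail,
// so the latency at DEADLINE_EXCEEDED can be reported back.
async function timeCall(clientCall) {
  const start = Date.now();
  try {
    await clientCall();
    return {ok: true, elapsedMs: Date.now() - start};
  } catch (err) {
    // grpc-js errors carry a numeric status code (4 = DEADLINE_EXCEEDED).
    return {ok: false, elapsedMs: Date.now() - start, code: err.code};
  }
}
```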
@xesunny reading your original message, the error is happening on your local OSX machine?
Yes, it's happening on my Mac OSX machine, but I have gotten this error on a "Google Cloud instance" and "Google Cloud Run" as well.
Can you give an example of how to use "fallback: true"? I don't have much experience with gRPC.
@alexander-fenster ~120s (created 1000 tasks in parallel)
@xesunny
const {CloudTasksClient} = require('@google-cloud/tasks');
const client = new CloudTasksClient({ fallback: true })
Are the requests over HTTP/1 regional? Opting for { fallback: true } gives me the following error:
{"code":3,"message":"Location 'us-east1' is not a valid location. Use ListLocations to list valid locations.","details":[]}
I know this is the correct region; when I switch out the region I receive the following:
{"code":3,"message":"Location must equal us-east1 because the App Engine app that is associated with this project is located in us-east1","details":[]}
For due diligence, this is me checking ListLocations:
$ gcloud tasks locations list
NAME      FULL_NAME
us-east1  projects/[project-name]/locations/us-east1
I'm grateful for any help that can be provided.
@isaiah-coleman I'm on us-central1
I am also getting the same error: Error: {"code":3,"message":"Location 'asia-south1' is not a valid location. Use ListLocations to list valid locations.","details":[]}
@alexander-fenster @bcoe any update on this?
@yossi-eynav Not yet, it's on my list, please expect an update soon.
Thanks team. Really looking forward to it.
I encountered this problem also and the workaround mentioned above worked.
const client = new CloudTasksClient({ fallback: true })
Though still interested in what is originally causing this error. I only noticed it in the last few days.
app:///../node_modules/@google-cloud/tasks/node_modules/@grpc/grpc-js/build/src/call.js in Object.callErrorFromStatus at line 30:26
const message = `${status.code} ${constants_1.Status[status.code]}: ${status.details}`;
return Object.assign(new Error(message), status);
}
exports.callErrorFromStatus = callErrorFromStatus;
class ClientUnaryCallImpl extends events_1.EventEmitter {
    constructor() {
        super();
    }
    cancel() {
Hi there, I too am curious if there are any workarounds for us-east1, as the fallback approach seemingly cannot be used there. Thanks!
Hi there, just wanted to circle back and see if anyone has made any progress on the above. In particular, how to use the fallback option in us-east1. Thanks!
Hey @alexander-fenster are there any updates you can provide for non us-central users? Google support punted us back here :( this is really biting us in prod. Thanks
@shayne-lp Yes, I'm working with support folks as well, investigating what's going on here. Please expect an update later today or tomorrow. I'm sorry that it affects you folks. There are a lot of moving parts between the client library and the backend; I'm trying to understand what's going on.
I almost had the same error. In my case, I was not awaiting the call. Adding await or Promise.all() fixed the issue for me. Here is the detailed SO question: DEADLINE_EXCEEDED while adding tasks to Google Cloud Tasks
Wow thank you @gijo-varghese! Can you folks please check if this applies to your code?
Hi @alexander-fenster, thanks for getting back to us. And thank you for your reply too, @gijo-varghese. We are indeed already using Promise.all(). After much trial and error, the issue we are seeing is that the number of promises you are attempting to resolve dictates whether you get the DEADLINE_EXCEEDED error or not. For example, if we wanted to resolve 10,000 promises, or even 1,000 promises in bulk, we would get the error. If we do batches of 100, we do not. The other problem is we are unable to catch this error and retry any that failed (it just blows us up).
Any guidance on the upper bounds there (whether dictated by this library, or Google Tasks itself) would be appreciated! Thanks all
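One way to both cap the batch size and make individual failures catchable is Promise.allSettled, which never rejects the whole batch. A sketch under stated assumptions: `createOne` is a placeholder for the real per-task createTask call, and the batch size and retry count are arbitrary:

```javascript
// Sketch: batched task creation where rejected promises are collected
// and retried, instead of one DEADLINE_EXCEEDED blowing up Promise.all.
async function createWithRetry(tasks, createOne, batchSize = 100, maxAttempts = 3) {
  let pending = tasks.slice();
  for (let attempt = 0; attempt < maxAttempts && pending.length; attempt++) {
    const failed = [];
    while (pending.length) {
      const batch = pending.splice(0, batchSize);
      // allSettled resolves with per-promise status, so failures are visible.
      const results = await Promise.allSettled(batch.map((t) => createOne(t)));
      results.forEach((r, i) => {
        if (r.status === 'rejected') failed.push(batch[i]);
      });
    }
    pending = failed; // retry only the tasks that failed this round
  }
  return pending; // tasks still failing after all attempts
}
```

Promise.allSettled requires Node 12.9+, which matches the runtimes discussed in this thread.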
Still facing the issue.
I am seeing this issue as well, even when I try to queue a single request with a Google Cloud Run service. This is what I am doing:
const request = {parent, task};
try {
  const [response] = await cloudTasksClient.createTask(request);
  console.log('Queued the task:', response.name);
} catch (e) {
  console.error('Error queuing the task', e);
}
I am using "@google-cloud/tasks": "^2.0.0".
Hi, I just wanted to circle back with an update. We tuned the number of promises in a batch down to 100 and used Promise.all() to wait for them to resolve. This quieted the error for a time, but we experienced the issue again today. There have been no changes to code or infrastructure.
We can continue to fine-tune batch sizes, but without any guidance it's hard to know if it will fix the issue long term. Thanks
@isaiah-coleman @xesunny sorry that it's been a little while since an update; we've been discussing this issue internally a bit.
Here's where I end up being a bit confused:
I've created a task queue as follows: gcloud tasks queues create echo-function. I can then run the following code, and have yet to see a DEADLINE_EXCEEDED issue:
const {CloudTasksClient} = require('@google-cloud/tasks');
let client;
async function performTask () {
  if (!client) {
    client = new CloudTasksClient();
  }
  const queuePath = client.queuePath('long-door-651', 'us-central1', 'echo-function');
  const [resp] = await client.createTask({
    parent: queuePath,
    task: {
      httpRequest: {
        httpMethod: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        url: 'https://.../echo-function',
        body: Buffer.from(JSON.stringify({hello: 'world'})),
      },
    },
  });
  console.info('we finished creating task', resp);
}
for (let i = 0; i < 2048; i++) {
  performTask();
}
☝️ note that this is creating thousands of tasks concurrently.
Here are the exact versions of the dependencies I have installed:
bencoe@bencoe:~/google/tasks-perf$ npm ls
tasks-perf@1.0.0 /usr/local/google/home/bencoe/google/tasks-perf
└─┬ @google-cloud/tasks@2.1.0
└─┬ google-gax@2.6.3
├─┬ @grpc/grpc-js@1.1.2
│ └── semver@6.3.0 deduped
├─┬ @grpc/proto-loader@0.5.5
│ ├── lodash.camelcase@4.3.0
│ └── protobufjs@6.10.1 deduped
├── @types/long@4.0.1
├─┬ abort-controller@3.0.0
│ └── event-target-shim@5.0.1
├─┬ duplexify@3.7.1
│ ├─┬ end-of-stream@1.4.4
│ │ └─┬ once@1.4.0
│ │ └── wrappy@1.0.2
│ ├── inherits@2.0.4
│ ├─┬ readable-stream@2.3.7
│ │ ├── core-util-is@1.0.2
│ │ ├── inherits@2.0.4 deduped
│ │ ├── isarray@1.0.0
│ │ ├── process-nextick-args@2.0.1
│ │ ├── safe-buffer@5.1.2
│ │ ├─┬ string_decoder@1.1.1
│ │ │ └── safe-buffer@5.1.2 deduped
│ │ └── util-deprecate@1.0.2
│ └── stream-shift@1.0.1
├─┬ google-auth-library@6.0.5
│ ├── arrify@2.0.1
│ ├── base64-js@1.3.1
│ ├─┬ ecdsa-sig-formatter@1.0.11
│ │ └── safe-buffer@5.1.2 deduped
│ ├── fast-text-encoding@1.0.3
│ ├─┬ gaxios@3.0.4
│ │ ├── abort-controller@3.0.0 deduped
│ │ ├── extend@3.0.2
│ │ ├─┬ https-proxy-agent@5.0.0
│ │ │ ├─┬ agent-base@6.0.1
│ │ │ │ └── debug@4.1.1 deduped
│ │ │ └── debug@4.1.1 deduped
│ │ ├── is-stream@2.0.0
│ │ └── node-fetch@2.6.0 deduped
│ ├─┬ gcp-metadata@4.1.4
│ │ ├── gaxios@3.0.4 deduped
│ │ └─┬ json-bigint@1.0.0
│ │ └── bignumber.js@9.0.0
│ ├─┬ gtoken@5.0.2
│ │ ├── gaxios@3.0.4 deduped
│ │ ├─┬ google-p12-pem@3.0.2
│ │ │ └── node-forge@0.9.1
│ │ ├── jws@4.0.0 deduped
│ │ └── mime@2.4.6
│ ├─┬ jws@4.0.0
│ │ ├─┬ jwa@2.0.0
│ │ │ ├── buffer-equal-constant-time@1.0.1
│ │ │ ├── ecdsa-sig-formatter@1.0.11 deduped
│ │ │ └── safe-buffer@5.1.2 deduped
│ │ └── safe-buffer@5.1.2 deduped
│ └─┬ lru-cache@6.0.0
│ └── yallist@4.0.0
├── is-stream-ended@0.1.4
├── lodash.at@4.6.0
├── lodash.has@4.5.2
├── node-fetch@2.6.0
├─┬ protobufjs@6.10.1
│ ├── @protobufjs/aspromise@1.1.2
│ ├── @protobufjs/base64@1.1.2
│ ├── @protobufjs/codegen@2.0.4
│ ├── @protobufjs/eventemitter@1.1.0
│ ├─┬ @protobufjs/fetch@1.1.0
│ │ ├── @protobufjs/aspromise@1.1.2 deduped
│ │ └── @protobufjs/inquire@1.1.0 deduped
│ ├── @protobufjs/float@1.0.2
│ ├── @protobufjs/inquire@1.1.0
│ ├── @protobufjs/path@1.1.2
│ ├── @protobufjs/pool@1.1.0
│ ├── @protobufjs/utf8@1.1.0
│ ├── @types/long@4.0.1 deduped
│ ├── @types/node@13.13.14
│ └── long@4.0.0
├─┬ retry-request@4.1.1
│ ├─┬ debug@4.1.1
│ │ └── ms@2.1.2
│ └─┬ through2@3.0.2
│ ├── inherits@2.0.4 deduped
│ └── readable-stream@2.3.7 deduped
├── semver@6.3.0
└── walkdir@0.4.1
And my Node.js runtime is v14.5.0.
@isaiah-coleman @gbhojak @xesunny could you provide me a gist with this same information: npm ls output (if you're running locally)?
Thanks @bcoe, I understand your confusion. The error behavior seems to be non-deterministic. You would have to run the snippet every day for weeks to have the error manifest. We run the gist below daily in sets a couple of times a day.
Thanks @bcoe, can you change the location of your queue from us-central1 to asia-northeast1 and then compare the results?
@isaiah-coleman, @xesunny, thank you for continuing to help me dig into this 👍
@xesunny I will try asia-northeast1; is your Cloud Function and/or server also located near this data center?
@isaiah-coleman what location are you in?
@bcoe we're using us-east1
@isaiah-coleman thank you for the reproduction 👍 (and I'm sorry it took me 8 days to respond).
One more point of clarification, in what environment are you running the script? (it sounds like you're not in a GCP environment, as in @xesunny's original report which was on Mac OS).
Once I've confirmed this, I think I have a plan for some folks to bring this issue up with internally.
Thanks for the response @bcoe, we're actually using Cloud Run.
@bcoe I also got this same error today and found this topic interesting.
I'm running my Node.js app on a single GAE (flexible) instance and creating tasks from there. The code is really simple, and I'm not creating tasks concurrently but rather sequentially within a while () { ... } loop.
while (some_conditions) {
  [ ... ]
  const [response] = await cloudtask.createTask({parent: parent, task: task});
  const name = response.name;
  console.log(`Created task ${name}`);
}
I don't think I can easily reproduce this at all as this is the first time I see this error. I've never seen this before in the last few weeks (since I started using Tasks).
GAE location: us-central1
GAE runtime: nodejs12
@google-cloud/tasks version: 2.0.1
Also, in my case above, there was roughly ~10 seconds of stuck time before it raised the DEADLINE_EXCEEDED error. I could confirm this from the GAE logging timestamps:
2020-07-31 07:30:33.000 JST Queue id: 5690167314612224 created successfully.
2020-07-31 07:30:43.000 JST (node:25) UnhandledPromiseRejectionWarning: Error: 4 DEADLINE_EXCEEDED: Deadline exceeded at...
Note, the log message at 2020-07-31 07:30:33.000 JST indicates we successfully created a new Datastore entity. Since we use Express, we immediately return next() after this console.log() line and move to the function where we create the tasks. So it is accurate to say ~10 seconds, I think (no tasks were created at that time == we likely hit the error in the first iteration).
Not sure if closing the client (i.e., client.close(), src) would make any difference, but I thought it worth trying.
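Closing the client once all tasks are created might look like this sketch; the helper name is illustrative, and the client is injected so the snippet works with any object exposing createTask/close:

```javascript
// Sketch: create tasks sequentially, then close the client in a finally
// block so open gRPC channels are released even if a call throws.
async function createAllThenClose(client, parent, tasks) {
  try {
    for (const task of tasks) {
      await client.createTask({parent, task});
    }
  } finally {
    await client.close();
  }
}
```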
@bcoe @alexander-fenster
@smaeda-ks, @isaiah-coleman, @xesunny, (and other folks in this thread bumping into this issue).
We've opened an internal tracking thread; it would be helpful if people could send me two pieces of information, including the timestamp at which the DEADLINE_EXCEEDED error was observed. If you don't feel comfortable sharing in this thread, you can email bencoe @ google.com.
Environment details
@google-cloud/tasks version: 1.7.2
Steps to reproduce
Thanks!