naugtur / blocked-at

Detects node eventloop block and reports where it started
MIT License

hey, you. Post stacktrace examples #5

Open naugtur opened 7 years ago

naugtur commented 7 years ago

Hi,

If you're using this library, post links to stacktraces here. The more examples we have, the better!
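For context, traces like the ones below typically come from hooking the library roughly like this (a minimal sketch based on the README; adjust the threshold to taste):

const blocked = require('blocked-at');

blocked((time, stack) => {
  // time is in milliseconds, stack shows where the blocking operation was started
  console.log(`Blocked for ${time}ms, operation started here:`, stack);
}, { threshold: 100 }); // only report pauses longer than 100ms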

weeco commented 7 years ago

Since you've asked for it :-). Just tried this library and got lots of blocked messages in bluebird apparently? I am a bit confused about this.

22:29:00 - warn: Blocked for 265.76712100219726ms at [
  ' at new Immediate (timers.js:808:5)',
  ' at createImmediate (timers.js:848:19)',
  ' at setImmediate (timers.js:835:10)',
  ' at Async._schedule (C:\Users\weeco\Documents\projectname-express\node_modules\bluebird\js\release\schedule.js:12:53)',
  ' at Async._queueTick (C:\Users\weeco\Documents\projectname-express\node_modules\bluebird\js\release\async.js:152:14)',
  ' at AsyncSettlePromises [as settlePromises] (C:\Users\weeco\Documents\projectname-express\node_modules\bluebird\js\release\async.js:88:10)',
  ' at Promise._fulfill (C:\Users\weeco\Documents\projectname-express\node_modules\bluebird\js\release\promise.js:640:19)',
  ' at Promise._resolveCallback (C:\Users\weeco\Documents\projectname-express\node_modules\bluebird\js\release\promise.js:432:57)',
  ' at C:\Users\weeco\Documents\projectname-express\node_modules\bluebird\js\release\promise.js:484:17',
  ' at Command.resolve (C:\Users\weeco\Documents\projectname-express\node_modules\ioredis\lib\command.js:193:7)',
  ' at Redis.exports.returnReply (C:\Users\weeco\Documents\projectname-express\node_modules\ioredis\lib\redis\parser.js:169:20)',
  ' at JavascriptRedisParser.returnReply (C:\Users\weeco\Documents\projectname-express\node_modules\ioredis\lib\redis\parser.js:27:13)',
  ' at JavascriptRedisParser.execute (C:\Users\weeco\Documents\projectname-express\node_modules\redis-parser\lib\parser.js:574:12)',
  ' at Socket. (C:\Users\weeco\Documents\projectname-express\node_modules\ioredis\lib\redis\event_handler.js:107:22)',
  ' at emitOne (events.js:115:13)',
  ' at Socket.emit (events.js:210:7)',
  ' at addChunk (_stream_readable.js:252:12)',
  ' at readableAddChunk (_stream_readable.js:239:11)',
  ' at Socket.Readable.push (_stream_readable.js:197:10)',
  ' at TCP.onread (net.js:589:20)' ]

Another one warning about blocked-at itself:

22:33:09 - warn: Blocked for 3150.1368659973145ms at [
  ' at new Immediate (timers.js:808:5)',
  ' at createImmediate (timers.js:848:19)',
  ' at setImmediate (timers.js:835:10)',
  ' at dispatchCallback (C:\Users\h9pe\Documents\clashcrown-express\node_modules\blocked-at\index.js:21:43)',
  ' at AsyncHook.after (C:\Users\h9pe\Documents\clashcrown-express\node_modules\blocked-at\index.js:48:7)',
  ' at emitAfterNative (async_hooks.js:342:40)' ]

naugtur commented 7 years ago

The second one may be caused by the I/O from printing the first one. With the first one it's hard to tell; it looks like it could be a false positive.

Could you please post these in a gist and link in here? We could discuss in the gist and leave this place for a list of references to examples. I'll remove this comment after you do.

weeco commented 7 years ago

Hmm, I use winston for logging. 3150ms seems pretty high for printing a log message too, but it might have been due to high CPU load.

Here are two gists:

  1. Bluebird: https://gist.github.com/weeco/cf83b0beafb7395be6457c45fb3e0576
  2. Ioredis: https://gist.github.com/weeco/31304ffd7485a77053cab2d12b4c3e16

By the way, I get tons of bluebird stack traces in my application.

naugtur commented 7 years ago

You kinda hijacked this thread to talk about false positives, but that's ok. Please check #1 and let me know if v1.1.2 with trimFalsePositives works for you.

Hyperblaster commented 6 years ago

Hmm, I have some interesting ones. It often looks like async_hooks is the cause of the blockage, is that correct? Or am I reading it wrong?

BLOCKED 439ms - STACK: [ '    at emitInitNative (internal/async_hooks.js:131:43)',
  '    at emitInitScript (internal/async_hooks.js:310:3)',
  '    at new TickObject (internal/process/next_tick.js:136:9)',
  '    at process.nextTick (internal/process/next_tick.js:165:10)',
  '    at /node_modules/kareem/index.js:499:20',
  '    at callMiddlewareFunction (/node_modules/kareem/index.js:474:23)',
  '    at next (/node_modules/kareem/index.js:193:9)',
  '    at /node_modules/kareem/index.js:182:9',
  '    at process.nextTick (/node_modules/kareem/index.js:499:38)',
  '    at process._tickCallback (internal/process/next_tick.js:112:11)' ]

naugtur commented 6 years ago

Yup, that makes a lot of sense, as whatever you're doing in async hooks has to be totally synchronous, so even writing a log has to be synchronous. I can't guarantee the blocked time measurements are valid for this case, because blocked-at uses async hooks to do the measurements, so it's an unusual situation.

Some of the async hooks performance aspects are still being discussed eg. https://github.com/nodejs/diagnostics/issues/248

Measuring async hooks with async hooks doesn't sound reliable, so this one could be ignored, but you might need to use other measurement methods. Check whether tracing helps.

Sebs030 commented 5 years ago

Hi, here would be another example:

Blocked for 4464.071400024414ms, operation started here: [
  ' at uploadImage.<anonymous> (<anonymous>)',
  ' at uploadImage (D:\\home\\site\\wwwroot\\src\\fileUpload.js:85:15)',
  ' at <anonymous>',
  ' at process._tickCallback (internal/process/next_tick.js:188:7)' ]

In fileUpload.js I am uploading an image to an Azure Blob Storage with the following function:

const createImageBlob = async (containerName, blobName, stream, streamLength, res) => {
  return new Promise((resolve, reject) => {
    blobService.createBlockBlobFromStream(containerName, blobName, stream, streamLength, (err) => {
      if (err) {
        reject(handleError(err));
      } else if (res) {
        resolve(res.json('success'));
      } else {
        resolve();
      }
    });
  });
};

Image size is on average ~3MB, so yes, this takes some time. The fileUpload.js function gets called from within a request handler for, let's say, https://mysite.com/fileupload, which gets handled by Express as such:

function fileUpload() {
  async function uploadImage(req, res) {
    const image = req.file.buffer;
    const blobPath = `${req.body.blobFolder}/`;
    const blobName = blobPath + req.file.originalname;
    const stream = getStream(image);
    const streamLength = req.file.buffer.length;
    const imageType = req.body.type;
    const containerName = 'containerName';
    try {
      if (imageType === 'image/jpeg' || imageType === 'image/jpg' || imageType === 'image/png') {
        await createContainer(containerName);
        await createImageBlob(containerName, blobName, stream, streamLength, res);
        // Do something after Blob has been created
      } else {
        // do something
      }
    } catch (error) {
      // do something
    }
    res.status(500);
  }
  return {
    // ...and so on
  };
}

Since the page is hosted on Azure, the duration of handling the request sometimes leads to the following issue: https://github.com/tjanczuk/iisnode/issues/405#issuecomment-290255031 or as described here: https://www.bountysource.com/issues/54202786-random-http-500-1013-errors-from-iisnode - which then led me to blocked-at ;)

I was able to resolve this by await setImmediate(() => createImageBlob(containerName, blobName, stream, streamLength, res)); ?

BrandonCopley commented 5 years ago

Is this really blocking?

Blocked for 2849.4091750030516ms, operation started here: [
  ' at process.nextTick (internal/process/next_tick.js:270:7)',
  ' at handleOperationCallback (/Users/brandoncopley/github/project/node_modules/mongodb-core/lib/connection/pool.js:531:24)',
  ' at /Users/brandoncopley/github/project/node_modules/mongodb-core/lib/connection/pool.js:597:9',
  ' at authenticateStragglers (/Users/brandoncopley/github/project/node_modules/mongodb-core/lib/connection/pool.js:504:16)',
  ' at Connection.messageHandler (/Users/brandoncopley/github/project/node_modules/mongodb-core/lib/connection/pool.js:540:5)',
  ' at emitMessageHandler (/Users/brandoncopley/github/project/node_modules/mongodb-core/lib/connection/connection.js:310:10)',
  ' at TLSSocket. (/Users/brandoncopley/github/project/node_modules/mongodb-core/lib/connection/connection.js:355:13)',
  ' at emitOne (events.js:116:13)',
  ' at TLSSocket.emit (events.js:211:7)',
  ' at addChunk (_stream_readable.js:263:12)',
  ' at readableAddChunk (_stream_readable.js:250:11)',
  ' at TLSSocket.Readable.push (_stream_readable.js:208:10)',
  ' at TLSWrap.onread (net.js:597:20)' ]

naugtur commented 5 years ago

@Sebs030 I'm surprised it's blocking on a file upload, and even more surprised by your suggested workaround. Awaiting setImmediate shouldn't do anything, afaik.

What you really need to do is get rid of the body-parser middleware, skip all the blob magic, and pass req (which is already a stream) to whatever stream-accepting upload function you use. Post to Stack Overflow and edit your comment with a link if you need more help.
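A rough sketch of that idea, assuming the legacy azure-storage blobService used above (createWriteStreamToBlockBlob, the route, and the blob name here are illustrative, not tested):

app.post('/fileupload', (req, res) => {
  // req is itself a readable stream - pipe it straight into blob storage
  // instead of buffering the whole file in memory first
  const blobStream = blobService.createWriteStreamToBlockBlob('containerName', 'blobName', (err) => {
    if (err) return res.status(500).json('upload failed');
    res.json('success');
  });
  req.pipe(blobStream);
});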

naugtur commented 5 years ago

@BrandonCopley Looks like it shouldn't be. A possible explanation is that async hooks are not yet implemented for the asynchronous primitive used here. Try it on the newest Node possible and see if it's still there.

raine commented 5 years ago

I'm getting these locally and on Heroku with node.js v12.8.0. It's an app using express.

2019-09-07T07:27:07.061749+00:00 app[web.1]: level=info message="Blocked for 155.15935900878907ms, operation started here:
2019-09-07T07:27:07.061762+00:00 app[web.1]: at Proxy.stat (fs.js:829:15)
2019-09-07T07:27:07.061763+00:00 app[web.1]: at SendStream.sendFile (/app/node_modules/send/index.js:721:6)
2019-09-07T07:27:07.061764+00:00 app[web.1]: at SendStream.pipe (/app/node_modules/send/index.js:595:8)
2019-09-07T07:27:07.061766+00:00 app[web.1]: at serveStatic (/app/node_modules/serve-static/index.js:125:12)
2019-09-07T07:27:07.061767+00:00 app[web.1]: at Layer.handle [as handle_request] (/app/node_modules/express/lib/router/layer.js:95:5)
2019-09-07T07:27:07.061768+00:00 app[web.1]: at trim_prefix (/app/node_modules/express/lib/router/index.js:317:13)
2019-09-07T07:27:07.061769+00:00 app[web.1]: at /app/node_modules/express/lib/router/index.js:284:7
2019-09-07T07:27:07.061771+00:00 app[web.1]: at Function.process_params (/app/node_modules/express/lib/router/index.js:335:12)
2019-09-07T07:27:07.061772+00:00 app[web.1]: at next (/app/node_modules/express/lib/router/index.js:275:10)
2019-09-07T07:27:07.061773+00:00 app[web.1]: at SendStream.error (/app/node_modules/serve-static/index.js:121:7)
2019-09-07T07:27:07.061774+00:00 app[web.1]: at SendStream.emit (events.js:203:13)
2019-09-07T07:27:07.061775+00:00 app[web.1]: at SendStream.EventEmitter.emit (domain.js:476:20)
2019-09-07T07:27:07.061777+00:00 app[web.1]: at SendStream.error (/app/node_modules/send/index.js:270:17)
2019-09-07T07:27:07.061778+00:00 app[web.1]: at SendStream.onStatError (/app/node_modules/send/index.js:421:12)
2019-09-07T07:27:07.061779+00:00 app[web.1]: at next (/app/node_modules/send/index.js:735:16)
2019-09-07T07:27:07.061782+00:00 app[web.1]: at onstat (/app/node_modules/send/index.js:724:14)
2019-09-07T07:27:07.061783+00:00 app[web.1]: at FSReqCallback.oncomplete (fs.js:165:21)"
2019-09-07T07:27:07.204924+00:00 app[web.1]: level=info message="Blocked for 138.65725999450683ms, operation started here:
2019-09-07T07:27:07.204930+00:00 app[web.1]: at Proxy.stat (fs.js:829:15)
2019-09-07T07:27:07.204931+00:00 app[web.1]: at SendStream.sendFile (/app/node_modules/send/index.js:721:6)
2019-09-07T07:27:07.204933+00:00 app[web.1]: at SendStream.pipe (/app/node_modules/send/index.js:595:8)
2019-09-07T07:27:07.204934+00:00 app[web.1]: at serveStatic (/app/node_modules/serve-static/index.js:125:12)
2019-09-07T07:27:07.204936+00:00 app[web.1]: at Layer.handle [as handle_request] (/app/node_modules/express/lib/router/layer.js:95:5)
2019-09-07T07:27:07.204937+00:00 app[web.1]: at trim_prefix (/app/node_modules/express/lib/router/index.js:317:13)
2019-09-07T07:27:07.204938+00:00 app[web.1]: at /app/node_modules/express/lib/router/index.js:284:7
2019-09-07T07:27:07.204940+00:00 app[web.1]: at Function.process_params (/app/node_modules/express/lib/router/index.js:335:12)
2019-09-07T07:27:07.204941+00:00 app[web.1]: at next (/app/node_modules/express/lib/router/index.js:275:10)
2019-09-07T07:27:07.204942+00:00 app[web.1]: at SendStream.error (/app/node_modules/serve-static/index.js:121:7)
2019-09-07T07:27:07.204943+00:00 app[web.1]: at SendStream.emit (events.js:203:13)
2019-09-07T07:27:07.204944+00:00 app[web.1]: at SendStream.EventEmitter.emit (domain.js:476:20)
2019-09-07T07:27:07.204946+00:00 app[web.1]: at SendStream.error (/app/node_modules/send/index.js:270:17)
2019-09-07T07:27:07.204947+00:00 app[web.1]: at SendStream.onStatError (/app/node_modules/send/index.js:421:12)
2019-09-07T07:27:07.204948+00:00 app[web.1]: at next (/app/node_modules/send/index.js:735:16)
2019-09-07T07:27:07.204949+00:00 app[web.1]: at onstat (/app/node_modules/send/index.js:724:14)
2019-09-07T07:27:07.204950+00:00 app[web.1]: at FSReqCallback.oncomplete (fs.js:165:21)"

Any ideas what to make of it?

edit:

It looks like this means that the event loop was blocked while handling an incoming request.

naugtur commented 5 years ago

Yes, it looks like it detected a delay in reading the static file content, which is weird, as that code is generally well implemented. Might be a false positive. Have you tried with trimFalsePositives on? Are you running the latest Node?
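Enabling that option looks roughly like this (a sketch based on the README; double-check the exact name against your installed version):

const blocked = require('blocked-at');

blocked((time, stack) => {
  console.log(`Blocked for ${time}ms, operation started here:`, stack);
}, { trimFalsePositives: true }); // attempts to drop detections the tool considers unreliable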

DullReferenceException commented 5 years ago

Here's a weird one that took 630ms. The resource type is TCPCONNECTWRAP:

at GetAddrInfoReqWrap.emitLookup [as callback] (net.js:1019:9),
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:61:10)

Maybe DNS is getting back an enormous number of records!?

(Amaaaazing tool by the way. This is extremely helpful)

Update:

I think I must have misinterpreted this. Would that mean that whatever handled the completed lookup took that 630ms, not the DNS code itself?

naugtur commented 5 years ago

Most precisely, it means the last function you see in the stack trace contained code that took those 630ms to do something that wasn't visible to async hooks as a new asynchronous action. One of two things is possible: 1. the event loop did not continue for ~600ms because emitLookup caused a blocking operation, OR 2. something in emitLookup is an async operation that is not yet instrumented. If you have a way to reproduce it predictably, it'd be very interesting to explore.

nmaves commented 4 years ago

Here is one that I can't seem to decipher

Blocked for 87.84227299499511ms, operation started here: [
  ' at Promise.then ()',
  ' at completeValueCatchingError (/var/app/current/node_modules/graphql/execution/execute.js:491:26)',
  ' at resolveField (/var/app/current/node_modules/graphql/execution/execute.js:435:10)',
  ' at executeFields (/var/app/current/node_modules/graphql/execution/execute.js:275:18)',
  ' at collectAndExecuteSubfields (/var/app/current/node_modules/graphql/execution/execute.js:713:10)',
  ' at completeObjectValue (/var/app/current/node_modules/graphql/execution/execute.js:703:10)',
  ' at completeValue (/var/app/current/node_modules/graphql/execution/execute.js:591:12)',
  ' at completeValue (/var/app/current/node_modules/graphql/execution/execute.js:557:21)',
  ' at /var/app/current/node_modules/graphql/execution/execute.js:492:16',
  ' at runMicrotasks ()',
  ' at processTicksAndRejections (internal/process/task_queues.js:94:5)',
  ' at async Promise.all (index 0)' ]

naugtur commented 4 years ago

@nmaves this looks fairly normal. You'd get a better hint if you wrapped your code in named functions that you pass to .then(). But at ~80ms this might be a false positive from graphql using some form of async that's not tracked by hooks. Best results are on the newest Node. Try 13 (and let me know if something breaks :D)
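As a quick illustration of the named-function hint (fetchPosts, formatPosts and expensiveTransform are made-up names), a named frame is much easier to act on than an anonymous one:

fetchPosts(args)
  .then(function formatPosts(posts) {
    // if this callback blocks, the blocked-at trace will show "at formatPosts"
    // instead of an anonymous frame inside graphql's execute.js
    return posts.map(expensiveTransform);
  });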

logidelic commented 4 years ago

Getting a bunch of these:

blocked - Blocked for 68ms, operation started here:

Server.connectionListener (_http_server.js:396:3),   
Server.emit (events.js:310:20),   
Server.EventEmitter.emit (domain.js:482:12),   
TLSSocket.onServerSocketSecure (_tls_wrap.js:1018:29),   
TLSSocket.emit (events.js:310:20),   
TLSSocket.EventEmitter.emit (domain.js:482:12),   
TLSSocket._finishInit (_tls_wrap.js:917:8),   
TLSWrap.onhandshakedone (_tls_wrap.js:136:9)

resource info: {"type":"HTTPINCOMINGMESSAGE"}

Any idea about what this might imply?

naugtur commented 4 years ago

68ms is not a lot. The slowdown was pinned down to an HTTP request handler. If it's not happening on every request and you're not seeing an impact on real-life performance, this is not an issue. To get a more precise stack trace, you can try taking the handler code you suspect causes the blockage and wrapping it in an asynchronous function call.
Again - only if you see potential impact, because 68ms every now and then is not that bad.
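A sketch of that wrapping trick (the middleware and function names are made up for illustration): moving the suspect code behind setImmediate creates a fresh async resource, so the reported stack starts right at it.

app.use((req, res, next) => {
  setImmediate(function checkSuspectWork() {
    doSuspectedHeavyWork(req); // if this blocks, blocked-at now points at this middleware
    next();
  });
});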

logidelic commented 4 years ago

Hey @naugtur . Really appreciate the response! It's true that 68ms "every now and then" is not that bad. However, my server is processing 1000s of requests per second. Throw a few of these in the mix and we have a problem. :)

I did a bit of Google searching and found your own article with this particular example (Server.connectionListener), so thanks doubly. :) That said, I'm still unsure where to put my setImmediate in the hopes of getting more details... In my own Node http(s) server, I explicitly handle the request and upgrade events... Does the above imply that it's starting in one of those?

Thanks again!

naugtur commented 4 years ago

This thread is not the most comfortable place to dig in further. Mail me :)

DanielVip3 commented 4 years ago

It looks like express-hbs's use of readFile produces a false positive?

2020-05-26T01:41:03: Blocked for 212.16068994140625ms, operation started here: [
'    at ReadFileContext.close (internal/fs/read_file_context.js:103:17)',
'    at FSReqCallback.readFileAfterRead [as oncomplete] (internal/fs/read_file_context.js:29:13)'
]

After inspecting this stack trace together with the resource info, it looks really weird, because it points at an fs.readFile call (as we can see from express-hbs's source code). Shouldn't async callbacks like that be non-blocking?

fs.readFile(filename, 'utf8', function(err, source) {
    if (err) return cb(err);

    var template = self.compile(source, filename);
    if (options.cache) {
            self.cache[filename] = {
                source: source,
                template: template
            };
    }
    return cb(null, source, template);
});

That's the offending function. It looks like a pretty common readFile call. I'm sure this is the function, as the resource.context.callback property returns the identical callback source code. Is it the inner content of the readFile callback that blocks, or the readFile call itself? Or is it, perhaps, a false positive? I'm not sure what to do, also because express-hbs claims to use file caching and it doesn't seem to work - but that's another question, which I should ask them; before I do, I have to be sure this isn't a false positive, because using trimFalsePositives doesn't seem to trim it. Thanks ^^
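For reference, the resource details mentioned above come from the third argument of the blocked-at callback; a rough sketch of how they can be logged (based on the README signature, worth double-checking against your installed version):

blocked((time, stack, { type, resource }) => {
  console.log(`Blocked for ${time}ms, operation started here:`, stack);
  if (resource) {
    // for fs operations this can include context such as the callback source
    console.log('resource type:', type, 'resource:', resource);
  }
});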

naugtur commented 4 years ago

@DanielVip3 Looks like a false positive at first glance, but it might not be the readFile itself. I'd be happy to figure it out if you can provide a minimal case to reproduce - like a repo with only the express-hbs usage in an app that doesn't do anything meaningful.

sandinmyjoints commented 4 years ago

Here's one. This is in a web application using Sequelize (which uses Bluebird as its Promise implementation). It comes from a request that led to a db query that returned a sizable result, which got stringified and sent out. I was wondering if there'd be blocking, and it looks like there is, but from the stack trace I'm not sure where. JSON.stringify definitely blocks, so that's one possibility, but nothing in this stack trace suggests it's the culprit to my eyes. Based on what I've seen, node-mysql2, which Sequelize uses, shouldn't be blocking on parsing the query result from mysql, but I'm not 100% sure it's not. I also wondered if GCs could be happening that might look like blocking.

Blocked for 1086.421099975586ms, operation started here: [
  '    at initAsyncResource (internal/timers.js:153:5)',
  '    at new Immediate (timers.js:220:5)',
  '    at setImmediate (timers.js:276:10)',
  '    at Async.schedule [as _schedule] (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/schedule.js:12:53)',
  '    at Async._queueTick (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/async.js:152:14)',
  '    at AsyncSettlePromises [as settlePromises] (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/async.js:88:10)',
  '    at Promise._fulfill (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/promise.js:640:19)',
  '    at Promise._resolveCallback (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/promise.js:432:57)',
  '    at resolve (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/thenables.js:73:17)',
  '    at processTicksAndRejections (internal/process/task_queues.js:97:5)'
]

naugtur commented 4 years ago

Yeah, this stack is not helpful.

I expect bluebird is the reason here. Async hooks sometimes don't work with custom promise libs. There's a way to make them work and bluebird introduced it not long ago. Check your version, maybe it can be updated to one that works with async hooks.

If not, I'd like to get more details from this if it reproduces, maybe there's something I can do with stack traces parsing.

If you're interested in working on this a bit, get in touch over email and we'll see what I can fix.
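If it helps, bluebird 3.7+ exposes an async_hooks opt-in; a sketch, assuming that's the mechanism meant here (verify the exact option against the bluebird docs for your version):

const Promise = require('bluebird');

// registers bluebird's promises with async_hooks so tools like blocked-at
// can see where the work was scheduled
Promise.config({ asyncHooks: true });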

sandinmyjoints commented 4 years ago

Thanks for the response! I upgraded bluebird to the latest; now I'm seeing this:

Blocked for 1471.5884779999851ms, operation started here: [
  '    at initAsyncResource (internal/timers.js:153:5)',
  '    at new Immediate (timers.js:220:5)',
  '    at setImmediate (timers.js:276:10)',
  '    at Async.schedule [as _schedule] (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/schedule.js:12:53)',
  '    at Async._queueTick (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/async.js:111:14)',
  '    at AsyncSettlePromises [as settlePromises] (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/async.js:76:10)',
  '    at Promise._fulfill (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/promise.js:675:19)',
  '    at Promise._resolveCallback (/usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/promise.js:466:57)',
  '    at /usr/src/app/node_modules/sequelize/node_modules/bluebird/js/release/promise.js:519:17',
  '    at Query.handler [as onResult] (/usr/src/app/node_modules/sequelize/lib/dialects/mysql/query.js:53:11)',
  '    at /usr/src/app/node_modules/mysql2/lib/commands/query.js:61:14',
  '    at processTicksAndRejections (internal/process/task_queues.js:79:11)'
]

The lines from query.js files in mysql2 and sequelize stand out.

Type is Immediate and resource is undefined.

naugtur commented 4 years ago

To me the top lines stand out, because I was supposed to parse them out. Looks like something changed and my code is no longer catching those. Which Node version are you on?

Also, try with asynchronous stack traces on (it's a command line flag)

sandinmyjoints commented 4 years ago

I'm on node 12.16.1. I'll give async stack traces a shot. Thanks!

teoring commented 5 months ago

[2024-06-07T15:13:41.679] [ERROR] crtp - Event loop was blockled for 417380.51623339846 ms, operation started here:
[2024-06-07T15:13:41.681] [ERROR] [
  ' at AsyncHook.init (/home/Development/cr/node_modules/blocked-at/index.js:31:11)',
  ' at emitInitNative (node:internal/async_hooks:200:43)',
  ' at emitInitScript (node:internal/async_hooks:503:3)',
  ' at promiseInitHook (node:internal/async_hooks:322:3)',
  ' at promiseInitHookWithDestroyTracking (node:internal/async_hooks:326:3)',
  ' at parsePage (/home/Development/cr/src/parsing/parser.js:584:11)',
  ' at async Object.parse (/home/Development/cr/src/parsing/parser.js:457:41)',
  ' at async handler (/home/Development/cr/cr.js:321:30)' ]

For me the stack trace wasn't exactly correct. The code was hanging on a call on the next line, but the stack trace was pointing to the previous line.