clinicjs / node-clinic

Clinic.js diagnoses your Node.js performance issues
https://clinicjs.org
MIT License
5.68k stars 125 forks source link

Flamegraphs is not working properly in v16+ #293

Closed RafaelGSS closed 3 years ago

RafaelGSS commented 3 years ago

Current Behavior

@mcollina has an insight about the flame graphs is not working properly in the latest versions of Node.js. I'll be looking at it and update in this issue.

RafaelGSS commented 3 years ago

I was checking all the tools before going to the flamegraphs and I've found a bug inside bubbleprof:

./bin.js bubbleprof --open=false --autocannon [ / --method GET ] -- node server.js

On node v14 it works perfectly, however, at v16 the spawned process never exits for two main reasons:

1.

2.

So I would say that we are dealing with 2 different issues in my point of view. I'm still debugging and trying to create a reproducible example. If you have any tips, let me know!

What I have tried so far:

UPDATE: The SIGINT issue was related to my OS, just fixed and working fine

mcollina commented 3 years ago

Keep going! I won't have time to dig deep into the problem anytime soon unfortunately :(

RafaelGSS commented 3 years ago

Well, I was debugging the:

The stream never emit close event properly. This is never called (at v16) for some unknown reason.

And I've discovered that on v15.8.0 it's working, but in v15.9.0 it starts to fail. Well, after git bisect looks like https://github.com/nodejs/node/pull/37300 has added this behavior.

I also did:

git revert 521c08dee15ce8ada35571c94a4b6883ae00d41b over v15.9.0 and it's working fine. However, since it only affects the auto close application, we've decided to no go through it and keep the focus on the flamegraph differences between those versions.

simoneb commented 3 years ago

Interesting finding @RafaelGSS, though somewhat surprising as that change (at least in my eyes) doesn't seem to carry any functional differences that would explain the change in behavior we're seeing. Do you have any clue as to why this is happening? How likely do you think it may be that the actual reason is something related to that commit, but not specifically that commit?

RafaelGSS commented 3 years ago

@simoneb I'm not sure, but looks like was specifically this commit since if I revert only that one, it works. So, I would say that the write stream was affected, I mean, it's hard to affirm it without an easily reproducible code.

RafaelGSS commented 3 years ago

I did a comparison between flame on v14.15 and v16.6 and they look pretty similar, the biggest difference looks like in the context filters.

Look at the following screenshot:

image

Node v14 and v16 at respectively left and right sides. If you check the filter context we have the following:

image

image


So, I feel that there are no issues so far, only filters adjustment. By the way, I'm assuming that internal/* should be placed as node context instead of v8, right?

Additional considerations

I've also created the profiling through linux_perf just to make sure if there are missed translations through node versions, but they look pretty similar.

mcollina commented 3 years ago

So, I feel that there are no issues so far, only filters adjustment. By the way, I'm assuming that internal/* should be placed as node context instead of v8, right?

Yes exactly!