nodejs / llnode

An lldb plugin for Node.js and V8, which enables inspection of JavaScript states for insights into Node.js processes and their core dumps.
Other
1.15k stars 99 forks source link

changes to support Node.js v16 & v18 #404

Closed kvakil closed 1 year ago

kvakil commented 1 year ago

With this diff, our tests are much better. Previously they were entirely failing on Node 16. Now:

I marked the failing tests as skipped / optional.

There are a couple of big commits in this diff. It is best reviewed commit-by-commit. They are in order:

  1. Postmortem data fixes

    • class_Map__constructor_or_backpointer__Object is now class_Map__constructor_or_back_pointer__Object (note the extra _).
    • class_Script__source__Object is weirdly not present in Node 16 but is present in both Node 14 and Node 18. I added a default to the correct value of 8.
  2. ScopeInfo is no longer a FixedArray as of v8/v8@ecaac329. The Length field was also removed in v8/v8@f731e13f.

    • The length field removal shifted everything over by a slot, meaning that a bunch of offsets changed.
    • Since ScopeInfo is no longer a FixedArray, I changed callsites to use HeapObject::LoadFieldValue rather than FixedArray::Get.
  3. Changes to V8 calling conventions

    • The arguments adapter frame no longer exists. Modified frame-test.js to match.
    • Arguments are no longer pushed "in reverse". Modified JSFrame::LeaParamSlot to match.
  4. A minor fix to nodejsVersion, which was returning [NaN] when run on a release build.

  5. Remove support for unboxed doubles, which fixes printing doubles since V8 broke our previous detection mechanism.

  6. Remove a couple of tests which seemed unnecessary (NativeModule and Promise).

cclauss commented 1 year ago

Like:

codecov-commenter commented 1 year ago

Codecov Report

Merging #404 (84a48bb) into main (ff75da7) will decrease coverage by 4.37%. The diff coverage is 86.20%.

@@            Coverage Diff             @@
##             main     #404      +/-   ##
==========================================
- Coverage   77.52%   73.14%   -4.38%     
==========================================
  Files          34       34              
  Lines        5027     5072      +45     
==========================================
- Hits         3897     3710     -187     
- Misses       1130     1362     +232     
Impacted Files Coverage Δ
src/llv8-constants.h 97.14% <ø> (+1.42%) :arrow_up:
src/llv8-inl.h 74.96% <78.18%> (-7.97%) :arrow_down:
src/llv8-constants.cc 82.75% <100.00%> (+1.91%) :arrow_up:
src/llv8.cc 71.23% <100.00%> (-0.71%) :arrow_down:
src/llv8.h 74.13% <100.00%> (ø)
test/addon/jsapi-test.js 97.14% <100.00%> (+0.08%) :arrow_up:
test/common.js 83.33% <100.00%> (-3.51%) :arrow_down:
test/plugin/frame-test.js 84.93% <100.00%> (+1.59%) :arrow_up:
test/plugin/stack-test.js 100.00% <100.00%> (ø)
test/plugin/usage-test.js 100.00% <100.00%> (ø)
... and 17 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

kvakil commented 1 year ago

@No9 this is now ready for review! I split it into four commits, but I can also split it into four separate PRs if you'd prefer that.

No9 commented 1 year ago

Hey @kvakil Stoked you've got so far on this. Now that #389 has landed can you pull main into this PR and edit the push.yml to enable the versions of node that this can support. There was a couple of output format nits with the later versions of LLDB and so it would be good to make sure they don't impact this work.

kvakil commented 1 year ago

I added the later versions (16/18). I am not really sure if this diff means they should be added in CI, since they are still failing(?) But you can see the progress made since they are not completely failing anymore.

I excluded some combinations which looked like they were failing for unrelated reasons (see the comments).

cclauss commented 1 year ago

@No9 Given https://nodejs.org/en/about/releases should unsupported Node.js releases < v14 continue to be tested?

No9 commented 1 year ago

@cclauss We started a discussion on what we can support at the last collab meeting. I've opened an issue #405 to discuss at the next DWG Full Core Dump DWG Agenda - https://github.com/nodejs/diagnostics/issues/576 My initial thought is that we should keep them on the build until they get in the way. Then we can be explicit to the community on why support was removed. Happy to discuss this though.

@kvakil Agree to disable 19 nightly that should be a soft fail anyway - We can pick that up outside of this PR #406 Agree to disable the Ubuntu 18 Node 18 due to glibc compat

It seems like 16 is a lot closer than 18 right now. Given that this PR is to better support 16 I am happy to just enable 16 and look at 18 as a separate PR. If the failing tests are not possible to deliver we can discuss dropping support as part of #405 discussion. [edit] or discuss here if it's easy to communicate Probably by adding and testing an optional property like this Side note: I had to bump the timeouts for later versions of lldb - if you get failing tests because of timeouts feel free to bump again.

kvakil commented 1 year ago

I pushed this along a little further. Some notes:

  1. I commented out the Promise test, because it was not working. Promise support does not seem too useful as implemented. I think we can add the optional field like you say and revisit it.
  2. Aside from that, everything now works for 18.x.
  3. Like you mentioned, later versions of lldb really do seem to be hitting timeouts a lot. I'm not sure why. This seems to account for a lot of the flakiness on the 18.x jobs. So I will probably bump those timeouts more.
  4. There is a lot of cruft in the codebase around unboxed doubles, which don't exist in Node.js v14+. The hack we had to try to detect unboxed/boxed doubles no longer works. We could continue to try to support them, but IMO it is easier just to remove support for Node.js v10/v12. Those are EOL versions anyway; I don't think we need to support them.

Looking through the remaining features which are still broken, I really don't think we need to support them; they all look more like nice-to-haves. I'd much rather just skip them & get this merged in. Thoughts?

No9 commented 1 year ago
  1. OK to take the Promises and NativeModule out with an skip test. Can you capture more notes in the code comments as to why it won't work and I'll open an issue as part of the release. For the tests that are failing in 16 but pass in 18 can we use the skip on version. I haven't started the local checkout review yet but I think this is the list of removed features that i've noted. Please correct/add more as comments in this issue. I'll fill out the User Impact as part of the local PR review when we are green on the build.
Object Status Node Version Comment Expected User Impact
stringifiedError skipped 16
workqueue skipped 14,16,18
Promises skipped 16,18 Look at reinstating
NativeModule skipped 16,18 Look at reinstating
arrow Removed 14,16,18 Removed by No9 Look at reinstating
JSArrayBufferView Removed 14,16,18 Deprecated in V8 None
  1. Great to hear we can land 18 as well as 16 🎉
  2. Please bump the time outs.
    There is a to be a global value for timeouts that isn't respected in all the tests. I've opened an issue to capture this #407
  3. The removal of boxing seems a strong enough case to finish support for less than 14. This can be captured in the release notes.
cclauss commented 1 year ago

Awesome work here! Thanks. Please also copy over the GitHub Action version changes from #403 and then we can close #403 when this PR is merged. That way all these GHA improvements get tested and merged as a unit.

kvakil commented 1 year ago

I pushed this along a little further, skipping basically any failures which I felt weren't important and taking the changes & suggestions from #403. I'll try to capture some more details about the tests I skipped and why I did so later.

I think all tests should be green now, although there is still some flakiness even after bumping the timeouts.

Meta note: I have been pretty sick this past week and am still not feeling great, so may be a bit slow to respond. Feel free to directly push changes to my branch if that's easier.

No9 commented 1 year ago

Re-ran the test and it's now all green :partying_face: I'll start reviewing this but I've got a packed schedule all week so it might slip into the following week.

Meta notes:

  1. Really appreciate the effort to get this PR green but look after yourself first @kvakil It's a great milestone to hit 16/18 building so don't feel obliged to work on this until you're fully ready.
  2. One of the reasons for the slow review is that I am speaking at an SRE conf this week about core dumps. I'll be sure to shout out everyone at this and try and get some new folks to come along to the collab summit session.
RogerKang commented 1 year ago

This works for me on node16 after compiling from source code. Hope it will be merged into master branch asap.

RafaelGSS commented 1 year ago

What are the challenges of using llnode using a compiled version of nodejs? From time to time I see myself trying to use llnode to debug a very weird behaviour on a Pull Request (aka v19.0.0)

No9 commented 1 year ago

@RogerKang thanks for testing - I haven't been able to get the impact assessment done for the release announcement So I might just merge and publish this as 4.0.0 and do the notes as a follow up issue.

kvakil commented 1 year ago

(Meta note: I'm mostly feeling better. :)


As promised, here's a summary of changes in this PR & the user impact:


What are the challenges of using llnode using a compiled version of nodejs? From time to time I see myself trying to use llnode to debug a very weird behaviour on a Pull Request (aka v19.0.0)

I think it should work fine: you need to compile from source & point --nodedir to your nodejs repo:

$ npm install --nodedir=~/node-source .

The Node.js 19 builds are just disabled here because of some CI issue, not because of any intrinsic difficulty.

No9 commented 1 year ago

Excellent - I released 3.3.0 last night to have a release that works with older nodes but newer lldb and verify the signing process. I'm at OSSummit today but will pick this up tonight.