jlevon closed this 4 years ago
I'd prefer not to add license headers as part of this, at least. It seems common in our node.js repos not to have the license in every file (e.g. node-manta is like this).
I made a couple of additional fixes on top of code review comments. On my test system I hit a core dump with a stack too long for rethinkdb to index the JSON, which uncovered a couple of issues with the existing error handling.
I think this is ready for another code review pass. Thanks.
For some reason I can't reply to your exact comment, but I'm not planning to remove the --abort flag right now. AIUI it doesn't break Mac, and I see no reason to try to support Linux explicitly.
Is that clearer, @bahamat?
Removed the last two FIXMEs. Don't run "make publish", as that will now overwrite the latest mainline thoth (it's not ideal...)
I can't test jobs atm as they seem to be down (no servers available)
Re the MDB version thing: you seem to be running on a PI pre-dating d70f65dfb86dedc271c6eacf5767889026db880c (April 2019). In general that's not going to work out too well, especially for crash dumps. Just a particular downside of local debugging. Another reason for a shared debugging host.
Having said that, MDB module API version changes are rare.
Thanks Trent, I took your patches and fixed up the other thing.
you seem to be running on a PI pre-dating d70f65dfb86dedc271c6eacf5767889026db880c (April 2019).
[root@5d4f7599-a991-6b35-dd44-d91936957a6b ~]# uname -v
joyent_20181206T011455Z
I guess this is my fault for hacking my builder zone onto the staging-1 headnode where there was capacity. That headnode happens to be the Triton server being used to test "min_platform" compatibility of components. Boo.
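As a rough illustration of the platform check discussed above, here is a minimal sketch that parses the build stamp printed by `uname -v` and compares it to a cutoff. The function names and the 2019-04-01 cutoff are assumptions for illustration only: the cutoff is simply the earliest possible date for an "April 2019" commit, not the commit's actual date.

```javascript
// Hypothetical helper: parse a stamp such as "joyent_20181206T011455Z"
// (the format shown in the uname -v output above) into a Date, or null.
function platformBuildDate(unameV) {
  const m = /^joyent_(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(unameV.trim());
  if (!m) {
    return null;
  }
  const [y, mo, d, h, mi, s] = m.slice(1).map(Number);
  return new Date(Date.UTC(y, mo - 1, d, h, mi, s));
}

// True only if the stamp parses and is strictly earlier than the cutoff.
function predates(unameV, cutoff) {
  const built = platformBuildDate(unameV);
  return built !== null && built.getTime() < cutoff.getTime();
}

// Assumed cutoff: start of April 2019 (UTC). Anything built before this
// necessarily predates a commit made in April 2019.
const cutoff = new Date(Date.UTC(2019, 3, 1));
console.log(predates('joyent_20181206T011455Z', cutoff));
```

A stamp that does not match the expected format yields `false` rather than a guess, so an unrecognised platform string is never reported as too old.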
I don't think this needs to hold up review, but I get a "spawn failed with return code ..." error when Ctrl+D-ing to exit thoth debug .... E.g.:
[root@trent-builder-1940-x86_64-20200221T185911 ~/joy/manta-thoth]# THOTH_NO_JOBS=1 thoth debug 6d0476cee1215862
thoth: downloading core.node.270503 to local cache
thoth: core.node.270503 [=====================================================================================>] 100% 1.96GB 13.82MB/s 2m25s
mdb_v8 version: 1.4.1 (release, from 0cd139c)
V8 version: 4.5.103.53
Autoconfigured V8 support from target
C++ symbol demangling enabled
> ::jsstack
native: libc.so.1`_lwp_kill+0x15
native: libc.so.1`raise+0x2b
native: libc.so.1`abort+0x10e
native: libstdc++.so.6`__gnu_cxx::__verbose_terminate_handler+0x185
native: libstdc++.so.6`__cxxabiv1::__terminate+0x17
native: libstdc++.so.6`__cxxabiv1::__unexpected
(1 internal frame elided)
(1 internal frame elided)
native: libstdc++.so.6`operator new[]+0x1a
native: int node::StreamBase::WriteString<+0x18f
native: void node::StreamBase::JSMethod<node::StreamWrap, &+0xaa
(1 internal frame elided)
js: createWriteReq
js: <anonymous> (as Socket._writeGeneric)
js: writeOrBuffer
js: <anonymous> (as OutgoingMessage._writeRaw)
js: <anonymous> (as OutgoingMessage._send)
js: <anonymous> (as OutgoingMessage.write)
(1 internal frame elided)
js: _cb
js: formatText
js: format
js: send
(1 internal frame elided)
js: _sendMetrics
js: _afterGetMetrics
(1 internal frame elided)
js: <anonymous> (as next)
(1 internal frame elided)
js: _onGetInfo
js: getZoneInfo
js: _confirmSanity
js: <anonymous> (as <anon>)
js: processImmediate
(1 internal frame elided)
(1 internal frame elided)
native: v8::internal::Execution::Call+0xff
native: v8::Function::Call+0xd7
native: v8::Function::Call+0x3c
native: node::MakeCallback+0xfa
native: node::CheckImmediate+0xa2
native: uv__run_check+0x74
native: uv_run+0x12f
native: node::Start+0x59d
native: main+0x42
native: _start+0x83
>
dump file kept: /var/tmp/thoth/cache/6d0476cee1215862048b12dec9eb3636/core.node.270503
thoth: Error: spawn failed with return code 134
at ChildProcess.<anonymous> (/root/joy/manta-thoth/bin/thoth:2004:9)
at ChildProcess.emit (events.js:311:20)
at maybeClose (internal/child_process.js:1021:16)
at Process.ChildProcess._handle.onexit (internal/child_process.js:286:5)
I don't get that error exit code if I just Ctrl+D without running any dcmds. If I exit the mdb shell after ::jsstack or ::stack (or, I imagine, other commands), then I get that error exit.
Looks like you found an mdb or v8 bug:
$ mdb ~/core.node.270503
Loading modules: [ libumem.so.1 libc.so.1 libnvpair.so.1 ld.so.1 ]
> ::load /home/gk/bad.v8.so
mdb_v8 version: 1.4.1 (release, from 0cd139c)
V8 version: 4.5.103.53
Autoconfigured V8 support from target
C++ symbol demangling enabled
> ::stack
libc.so.1`_lwp_kill+0x15(1, 6, 2df, fe9c5000, fe9c5000, 1)
libc.so.1`raise+0x2b(6)
libc.so.1`abort+0x10e()
libstdc++.so.6`__gnu_cxx::__verbose_terminate_handler+0x185(feea273b, feed5a9c, feea355b, feed5a9c, 9460d70, 94d9ac8)
libstdc++.so.6`__cxxabiv1::__terminate+0x17(feea6680, 1, 8043318, feea35f7, feea35e9, feed5a9c)
libstdc++.so.6`__cxxabiv1::__unexpected(9460d70, fef00600, 8043338, feea387f, feed5a9c, 34a4a1)
0xfeea38ae(9460d90, feef2620, feea1170, feea4e75, feed5a9c, 9727cf0)
0xfeea4ebc(34a4a1, 0, 0, 8047438)
libstdc++.so.6`operator new[]+0x1a(34a4a1, 8047480, 1, b9df4745, b, b9df4675)
int node::StreamBase::WriteString<+0x18f(9727cf0, 8047438, 947f000, 0, 8047488, 92fd55f2)
void node::StreamBase::JSMethod<node::StreamWrap, &+0xaa(8047438, 8f808255, 8047464, 8047484, 2, 0)
0x92f60a34(fd4e2049, 943c008, 8f808099, 8f808099, 8f86c7dd, 95dd5acd)
0x92fcd5e3(8f8651d5, b9df5701, fd4e2049, b9df573d, 8f808099, fce960f1)
0x92f806ef(fd4ad12d, 8f8651d5, b9df5701, 8f808231, fd4e2c3d, fd4ad12d)
0x92fcd977(ac1a8549, 8f8651d5, b9df5701, fd4eaa75, fd4e2c3d, 8f808099)
0x92f1d34d(8f808099, 8f808099, b9df5701, fd4ed42d, b9df5701, b9df56ed)
0xa7641a82(8f808099, 8f808099, 82208081, fd4ed42d, 95dedcb1, b656bd05)
0xa76478ba(8f808099, 8f808099, 82208081, fd4ed42d, 2, 95dedcb1)
0x8061a143(82208081, fd4ed42d, 95dedcb1, b9df4dfd, b9df4dd1, 804764c)
0x92f79652(82208081, 8f808089, 8f808099, 82208081, 8f824719, 0)
0xa7630743(b9df4dfd, 82208081, fd4ed42d, fd4e2cf5, fd4ed42d, a4e4110d)
0xa763132b(b9df4dfd, 82208081, fd4ed42d, 82208081, b65b1d4d, b9df4dfd)
0xa762ff65(8f808099, 8f808099, 82208081, fd4ed42d, 2, b65b1d4d)
0x8061a143(82208081, fd4ed42d, b65b1d4d, fd4f4a71, a4e87bb1, fd4f4a71)
0x92f78251(82208081, 8f808089, 8f808099, fd4f4a71, fd4f4d95, fd4f4d95)
0x92f77f24(8f808089, 8f808099, 4, fd4f4d95, 14, 8047768)
0x8061a143(fd4f4de5, 8f808089, 8f808099, fd4f4d95, fd4f67e5, fd4f49d9)
0x92f0fc5c(8f808099, 8f808089, 8f808099, 2, fd4f49d9, 14)
0x8061a143(8f808089, 8f808099, fd4f49d9, b9df488d, b9df488d, b9df486d)
0x92f77e64(b9df48b1, 8f808089, 8f808099, b9df488d, a4ea282d, b9df4921)
0x92f470d4(b9df488d, fcefaa21, abe0eee9, a4ea282d, fd4f66a5, fd4f66a5)
0x92f77bf8(fd4f49d9, fd4f4dc9, fd4f67e5, fd4f66a5, b9df47dd, b9df47ad)
0x92f10f8a(b9df4831, b9df47dd, ac1673b9, 8f808211, b9df4831, 8f808099)
0x92f10adf(8f86c735, ac1a7385, 8061a921, 10, 0, 8047898)
0x8061a9e1(0, 0, 2, 0, 947e010, 943c008)
0x8061999f(92f10660, ac1a7385, 8f86c735, 0, 0, 943c008)
v8::internal::Execution::Call+0xff(804794c, 943c008, 947e010, 9496cd8, 0, 0)
v8::Function::Call+0xd7(80479cc, 947e010, 947e028, 9496cd8, 0, 0)
v8::Function::Call+0x3c(8047a38, 947e010, 9496cd8, 0, 0, 943c008)
node::MakeCallback+0xfa(8047abc, 94d9ac8, 9496cd8, 947e010, 0, 0)
node::CheckImmediate+0xa2(94d9ad0, 0, 838c6bf, 17c5, 9434f98, 1dc859d6)
uv__run_check+0x74(9366e40, 0, 4, 0, 9366f3c, 9366e50)
uv_run+0x12f(9366e40, 1, fe956890, 8335a3e)
node::Start+0x59d(2, 8047cfc, 4, 400, 9355ff4, 8047ca8)
main+0x42(8047cbc, fe9d2388, 8047cf0, 8314543, 3, 8047cfc)
_start+0x83(3, 8047df0, 8047e4c, 8047e4c, 0, 8047e8b)
> $q
Abort (core dumped)
$ pfexec pstack $(ls -rt /cores/core.mdb* | tail -1 )
core '/cores/core.mdb.880473' of 880473: mdb /home/gk/core.node.270503
fee5d1b3 syscall (3, fed52bcc, 0, fedec75a, fee2fc79, feffca40) + 13
fee49db8 thr_sigsetmask (2, 8045310, 0) + 1f2
fee49e33 sigprocmask (2, 8045310, 0) + 40
fee2fce1 sigrelse (6) + 68
fea743bc umem_do_abort () + 38
fea7448e __umem_assert_failed (fea80e7d, fea810f7, 816d688)
fea76a2c process_free (816d688, 1, 0) + 74
fea76d93 umem_malloc_free (816d688) + 1a
0809ecdf mdb_free (816d688, 38) + 20
080973d8 strfree (816d688) + 1d
08084cf3 mdb_module_remove_walker (8164b70, 88200bd) + 65
08084e45 mdb_module_unload_common (81ea440) + 125
080853d1 mdb_module_unload (81ea440, 2) + e
08084f5e mdb_module_unload_all (2) + 20
0806802a mdb_destroy () + 35
08081260 terminate (0) + 10
080827a6 main (80461fc, feed7528, 8046238) + 1177
080646f7 _start_crt (2, 8046268, fefd0094, 0, 0, 0) + 96
080645ca _start (2, 8046538, 804653c, 0, 8046556, 8046562) + 1a
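The return code 134 that thoth reported earlier is consistent with the common shell convention of 128 plus the signal number, i.e. signal 6 (SIGABRT), which lines up with mdb aborting in umem_do_abort on exit. A minimal sketch of decoding such a status in Node follows; `describeExit` is a hypothetical helper, not part of thoth, and it assumes the status was produced under that convention.

```javascript
const os = require('os');

// Hypothetical helper: interpret a numeric exit status using the shell
// convention that statuses above 128 mean "terminated by signal
// (status - 128)". Returns nulls when the status does not fit that pattern.
function describeExit(code) {
  if (!Number.isInteger(code) || code <= 128) {
    return { signo: null, name: null };
  }
  const signo = code - 128;
  const signals = os.constants.signals;
  // Several names can share a number; take the first match, if any.
  const name = Object.keys(signals).find((n) => signals[n] === signo) || null;
  return { signo, name };
}

console.log(describeExit(134));
```

Note that when Node spawns a process directly and that process is killed by a signal, the `exit` event reports a `null` code plus the signal name instead, so this decoding only applies when a numeric status like 134 is what comes back.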
Hi all, please take a look at the last change too, which means we can now update directly from an older thoth. I tested this on my rig.
Hi all, I think this is ready for review. I haven't added testing notes yet, so please let me know if/when you'd like to review those. I'd also appreciate it if people could actually try this out on the client side:
npm install joyent/manta-thoth#TOOLS-2440
Note the README changes for the user-visible parts (mainly that crash-dump upload now needs a post-process step).
I have also used sdc-thoth-install on my lab rig to set that up, and checked that both the HN and my CN are uploading OK on the regular cron.