Closed RaisinTen closed 1 year ago
The implementation should be something like https://github.com/iovisor/bpftrace/blob/1326f040a0f88287ccbc8c18fe8956bca4cc225d/src/utils.cpp#L1017-L1050. I'll see if I can find any obvious differences. Meanwhile maybe @dsanders11 and @robertgzr could help?
Also, cc @nodejs/single-executable if anyone else also has any clue
That implementation is a bit different: it's looking for SHT_NOTE using the section header table (SHT), while Postject's implementation uses PT_NOTE, which is a note segment. Sections are contained within segments, but they're a link-time concept. The SHT is not used at run time and can be stripped from the executable. From the Wikipedia article on ELF:
The segments contain information that is needed for run time execution of the file, while sections contain important data for linking and relocation.
While SHT_NOTE sections will exist inside of a PT_NOTE segment, you can't rely on the SHT to find them at run time since that information may be stripped, so Postject walks the segments rather than the sections.
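To make the segment walk concrete, here is a minimal sketch of scanning the contents of a note segment for a note with a given owner name. It is not Postject's code: `find_note_in_buffer` and `note_align4` are hypothetical names, and the sketch assumes a 64-bit Linux target where the name and descriptor fields are padded to 4 bytes (some notes use 8-byte padding). Every size is checked against the end of the buffer before it is used, so a malformed or unexpected note stops the walk instead of running past the segment.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round n up to the next multiple of 4, the padding commonly used for note
 * name and descriptor fields (an assumption; some notes use 8-byte padding). */
static size_t note_align4(size_t n) {
  return (n + 3u) & ~(size_t)3u;
}

/* Search a buffer holding the contents of a note segment for a note whose
 * owner name equals `name`. Returns a pointer to the descriptor and stores
 * its size in *desc_size, or returns NULL if no such note is found or if a
 * note claims more bytes than the buffer holds. */
static const void *find_note_in_buffer(const unsigned char *buf, size_t len,
                                       const char *name, size_t *desc_size) {
  size_t want = strlen(name) + 1; /* n_namesz counts the trailing NUL */
  size_t pos = 0;

  while (len - pos >= sizeof(Elf64_Nhdr)) {
    Elf64_Nhdr hdr;
    memcpy(&hdr, buf + pos, sizeof(hdr)); /* copy to avoid unaligned reads */
    pos += sizeof(hdr);

    size_t name_len = note_align4(hdr.n_namesz);
    size_t desc_len = note_align4(hdr.n_descsz);
    /* Stop if the padded name or descriptor would run past the buffer. */
    if (name_len > len - pos || desc_len > len - pos - name_len) {
      return NULL;
    }

    const unsigned char *note_name = buf + pos;
    const unsigned char *note_desc = note_name + name_len;
    if (hdr.n_namesz == want && memcmp(note_name, name, want) == 0) {
      *desc_size = hdr.n_descsz;
      return note_desc;
    }
    pos += name_len + desc_len; /* advance to the next note header */
  }
  return NULL;
}
```

In a real lookup, `buf` and `len` would come from a PT_NOTE program header (the mapped address and `p_memsz`), not from the section header table.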
I don't see anything obvious, so I'll try to dig into this later and see what I can find. There might be some slight difference on ppc64le that's not being accounted for in the current implementation, which leads to using the wrong offset for the pointer values.
Hmm, weird find: I'm able to reproduce this on x64 Ubuntu Linux when I compile test.cc using clang, but it works fine with gcc. 🤔
$ g++ test.cc
$ ./a.out
Hello world
$ clang++ test.cc
$ ./a.out
Segmentation fault (core dumped)
$ clang++ -g test.cc
$ gdb -q a.out
Reading symbols from a.out...
(gdb) run
Starting program: /home/parallels/Desktop/temp/project/trash/a.out
Program received signal SIGSEGV, Segmentation fault.
0x00000000004015f0 in postject_find_resource (name=0x402004 "foobar", size=0x7fffffffdf00, options=0x0) at ./postject-api.h:141
141 if (note->n_namesz != 0 && note->n_descsz != 0 &&
(gdb) bt
#0 0x00000000004015f0 in postject_find_resource (name=0x402004 "foobar", size=0x7fffffffdf00, options=0x0) at ./postject-api.h:141
#1 0x0000000000401401 in main () at test.cc:21
(gdb) quit
A debugging session is active.
Inferior 1 [process 187727] will be killed.
Quit anyway? (y or n) y
System info:
$ uname -a
Linux parallels-Parallels-Virtual-Platform 5.13.0-40-generic #45~20.04.1-Ubuntu SMP Mon Apr 4 09:38:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ g++ --version
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ clang++ --version
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
FWIW, I tried using dl_iterate_phdr to implement the runtime API for Linux, using https://github.com/percona/percona-server/blob/5486efdbebd4e9a6fd94af5410853137a73d551b/mysys/build_id.cc as the base, and it doesn't segfault when I compile with clang++.
@dsanders11 I'll send a PR for this soon if you're not aware of anything obviously wrong with this function that I haven't considered.
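For reference, a rough sketch of what a dl_iterate_phdr-based lookup could start from. This is not the code from the eventual PR: the `note_scan` struct and the function names are made up for illustration, and it only records where the PT_NOTE segments of the main program are mapped. It relies on the documented behaviour that the first object passed to the callback is the main program, and returns nonzero from the callback to stop there.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* dl_iterate_phdr is a GNU/Linux extension */
#endif
#include <link.h>
#include <stddef.h>

#define MAX_NOTE_RANGES 8 /* arbitrary cap for this sketch */

/* Where one PT_NOTE segment is mapped in memory. */
struct note_range {
  const unsigned char *start;
  size_t size;
};

/* Hypothetical result type: the note segments found in the main program. */
struct note_scan {
  struct note_range ranges[MAX_NOTE_RANGES];
  int count;
};

/* Callback for dl_iterate_phdr: record every PT_NOTE program header of the
 * object being visited. */
static int collect_note_segments(struct dl_phdr_info *info, size_t size,
                                 void *data) {
  struct note_scan *scan = data;
  (void)size;

  for (int i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
    if (ph->p_type != PT_NOTE || scan->count >= MAX_NOTE_RANGES) {
      continue;
    }
    /* The run-time address is the object's load base plus p_vaddr. */
    scan->ranges[scan->count].start =
        (const unsigned char *)(info->dlpi_addr + ph->p_vaddr);
    scan->ranges[scan->count].size = ph->p_memsz;
    scan->count++;
  }
  return 1; /* nonzero stops the iteration after the main program */
}

/* Fill `scan` with the main program's note segments and return how many
 * were found. */
static int scan_main_program_notes(struct note_scan *scan) {
  scan->count = 0;
  dl_iterate_phdr(collect_note_segments, scan);
  return scan->count;
}
```

Each recorded range could then be walked note by note, with every read checked against `start + size`.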
The crash originates from here: https://github.com/nodejs/postject/blob/35343439cac8c488f2596d7c4c1dddfec1fddcae/postject-api.h#L141 while dereferencing the note pointer. Note that note is not a null pointer here. This is happening for the case where the resource hasn't been injected into the node binary.
~This is one of the blockers for the single-executable PR in core - https://github.com/nodejs/node/pull/45038.~ I think calling postject_has_resource() first would also unblock that PR.
Refs: https://github.com/nodejs/build/issues/3168