Closed bdemick closed 2 months ago
Can you try the following in GDB or LLDB:
(gdb) p mx::EntityId(0x84a000078e000001).Unpack()
I'm expecting something like is_definition = true or is_definition = false in the following output:
mx::DeclId {
fragment_id = <fragment_id>,
kind = mx::DeclKind::RECORD,
offset = <offset>,
is_definition = true
}
Suppose is_definition is true. In the same debugger instance, can you try inverting it as follows:
(gdb) p/d mx::EntityId(mx::DeclId(<fragment_id>, mx::DeclKind::RECORD, <offset>, false))
This should print out some big decimal integer. Then:
$ mx-highlight-entity --db linux.db --entity_id <printed_id>
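For readers following along, here is a minimal sketch of the idea behind the steps above: a 64-bit opaque ID is unpacked into fields, one field is flipped, and the result is packed back into an integer. The field widths and the names SketchDeclId, Pack, and Unpack are assumptions made up for this illustration. Multiplier's real EntityId layout is not described in this thread and will differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout, for illustration only (NOT Multiplier's real encoding):
// bits 63..24 fragment_id, bits 23..14 kind, bits 13..1 offset, bit 0 flag.
struct SketchDeclId {
  std::uint64_t fragment_id;  // 40 bits in this sketch
  std::uint16_t kind;         // 10 bits in this sketch
  std::uint16_t offset;       // 13 bits in this sketch
  bool is_definition;         // 1 bit
};

// Pack the fields into one opaque 64-bit integer.
inline std::uint64_t Pack(const SketchDeclId &id) {
  // Each field must fit in its assumed width.
  assert(id.fragment_id < (1ull << 40));
  assert(id.kind < (1u << 10));
  assert(id.offset < (1u << 13));
  return (id.fragment_id << 24) |
         (static_cast<std::uint64_t>(id.kind) << 14) |
         (static_cast<std::uint64_t>(id.offset) << 1) |
         static_cast<std::uint64_t>(id.is_definition);
}

// Recover the fields from the opaque integer.
inline SketchDeclId Unpack(std::uint64_t opaque) {
  return SketchDeclId{
      opaque >> 24,
      static_cast<std::uint16_t>((opaque >> 14) & 0x3FF),
      static_cast<std::uint16_t>((opaque >> 1) & 0x1FFF),
      (opaque & 1) != 0};
}
```

With a layout like this, "inverting" is_definition means calling Unpack, flipping the boolean, and calling Pack again, which yields a different opaque ID that refers to the other declaration.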
Please forgive in advance my lack of experience debugging C++ in gdb :)
InitExample(true)
p mx::EntityId(0x84a000078e000001).Unpack()
This results in gdb reporting: A syntax error in expression, near `0x84a000078e000001).Unpack()'.
A colleague has suggested it might be an issue with calling constructors in gdb, or possibly some deeper stuff going on.
Can you try using LLDB?
$ sudo apt-get install lldb
$ lldb ./bin/mx-find-linked-structures
(lldb) r --db linux.db
<wait for it to abort>
(lldb) p mx::EntityId(0x84a000078e000001ull).Unpack()
lldb FTW. However, that call returns an mx::VariantId, and that's pretty gnarly (see Details). I do see a (fragment_id = 1079520, kind = RECORD, offset = 1, is_definition = false) buried in there, though. Throwing that into the suggested command (and inverting the boolean):
(lldb) p/d mx::EntityId(mx::DeclID(1079520, mx::DeclKind::RECORD, 1, true))
error: expression failed to parse:
error: <user expression 1>:1:18: no member named 'DeclID' in namespace 'mx'
mx::EntityId(mx::DeclID(1079520, mx::DeclKind::RECORD, 1, true))
~~~~^
warning: <user expression 1>:1:38: use of enumeration in a nested name specifier is a C++11 extension
mx::EntityId(mx::DeclID(1079520, mx::DeclKind::RECORD, 1, true))
^
(lldb) p/d mx::EntityId(mx::DeclId(1079520, mx::DeclKind::RECORD, 1, true))
error: expression failed to parse:
warning: <user expression 2>:1:38: use of enumeration in a nested name specifier is a C++11 extension
mx::EntityId(mx::DeclId(1079520, mx::DeclKind::RECORD, 1, true))
^
error: <user expression 2>:1:14: no matching constructor for initialization of 'mx::DeclId'
mx::EntityId(mx::DeclId(1079520, mx::DeclKind::RECORD, 1, true))
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 4 were provided
note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 4 were provided
Okay, let's try something else:
(lldb) p mx::EntityId(mx::FragmentId(1079520))
Then take the number it prints and pass it to:
$ ./bin/mx-print-fragment --db linux.db --fragment_id <outputted_number>
Okay, I have done it in my LLDB ;-) Sorry, it looks like I got some of the syntax slightly wrong!
(lldb) p mx::EntityId(mx::FragmentId(1079520))
(mx::EntityId) (opaque = 2305843009214773472)
(lldb) p mx::EntityId(mx::DeclId{1079520, mx::DeclKind::RECORD, 1, true})
(mx::EntityId) (opaque = 9362983657750396929)
Can you run:
$ ./bin/mx-print-fragment --db linux.db --fragment_id 2305843009214773472
$ ./bin/mx-highlight-entity --db linux.db --entity_id 9362983657750396929
p mx::EntityId(mx::FragmentId(1079520))
(mx::EntityId) $1 = (opaque = 2305843009214773472)
Still seeing an error trying to execute this call:
p mx::EntityId(mx::DeclId{1079520, mx::DeclKind::RECORD, 1, true})
error: expression failed to parse:
error: <user expression 5>:1:24: expected '(' for function-style cast or type construction
mx::EntityId(mx::DeclId{1079520, mx::DeclKind::RECORD, 1, true})
~~~~~~~~~~^
Subsequently from our fragment_id above:
Assuming we're interested in that first identifier for struct netns_mib, 9362983657750396928:
./mx-highlight-entity --db /home/user0/linux.db --entity_id 9362983657750396928
/home/user0/sources/linux/include/net/netns/mib.h
+---------------------------------------------
7 | struct netns_mib {
8 | DEFINE_SNMP_STAT(struct ipstats_mib, ip_statistics);
9 | #if IS_ENABLED(CONFIG_IPV6)
10 | DEFINE_SNMP_STAT(struct ipstats_mib, ipv6_statistics);
11 | #endif
12 |
13 | DEFINE_SNMP_STAT(struct tcp_mib, tcp_statistics);
14 | DEFINE_SNMP_STAT(struct linux_mib, net_statistics);
15 |
16 | DEFINE_SNMP_STAT(struct udp_mib, udp_statistics);
17 | #if IS_ENABLED(CONFIG_IPV6)
18 | DEFINE_SNMP_STAT(struct udp_mib, udp_stats_in6);
19 | #endif
20 |
21 | #ifdef CONFIG_XFRM_STATISTICS
22 | DEFINE_SNMP_STAT(struct linux_xfrm_mib, xfrm_statistics);
23 | #endif
24 | #if IS_ENABLED(CONFIG_TLS)
25 | DEFINE_SNMP_STAT(struct linux_tls_mib, tls_statistics);
26 | #endif
27 | #ifdef CONFIG_MPTCP
28 | DEFINE_SNMP_STAT(struct mptcp_mib, mptcp_statistics);
29 | #endif
30 |
31 | DEFINE_SNMP_STAT(struct udp_mib, udplite_statistics);
32 | #if IS_ENABLED(CONFIG_IPV6)
33 | DEFINE_SNMP_STAT(struct udp_mib, udplite_stats_in6);
34 | #endif
35 |
36 | DEFINE_SNMP_STAT(struct icmp_mib, icmp_statistics);
37 | DEFINE_SNMP_STAT_ATOMIC(struct icmpmsg_mib, icmpmsg_statistics);
38 | #if IS_ENABLED(CONFIG_IPV6)
39 | DEFINE_SNMP_STAT(struct icmpv6_mib, icmpv6_statistics);
40 | DEFINE_SNMP_STAT_ATOMIC(struct icmpv6msg_mib, icmpv6msg_statistics);
41 | struct proc_dir_entry *proc_net_devsnmp6;
42 | #endif
43 | };
Alright, if you're willing to share, can you run this script? It was added in the most recent commit to the main branch, and it extracts all files relevant to compiling this specific fragment. I can then try to reproduce things locally and see if that helps.
$ python3.12 -m venv /path/to/install
$ source /path/to/install/bin/activate
(install) $ python /path/t
If you haven't built the Python bindings, then you should be able to download the latest pre-built linux release and then run the following:
$ python3.12 -m venv /path/to/extracted/release
$ source /path/to/extracted/release/bin/activate
(release) $ python /path/to/multiplier/source/CompressCompilation.py --db linux.db --entity_id 2305843009214773472 --working_dir /tmp/issue_565
That will create a directory /tmp/issue_565 full of copies of the relevant files from your build, and it will also create /tmp/issue_565.tar.gz. If you can attach that tarball to this issue, that would be great!
Much appreciated - thanks! Here you go: issue_565.tar.gz
Alright I tried reproducing it on macOS to no avail. I was able to make the database and find the relevant structures. I'll be trying next on Linux. It's possible that the issue cannot be isolated to this one translation unit, and manifests as a function of cross-translation unit deduplication.
There was also a curiosity in your compile_commands.json output that I'll need to try to reproduce and chase down; this was listed as one of the command-line arguments:
"-I\"-resource-dir /usr/lib/llvm-14/lib/clang/14.0.0",
This is most likely a PASTA issue, however. For context, PASTA is the Clang compiler wrapper that Multiplier relies on.
Yeah, I think that was part of the steps prescribed to generate the index here: https://github.com/trailofbits/multiplier/blob/main/docs/INDEXING.md. It just happened to be the system clang that was installed.
Another curiosity (and what started me down this path) is that the binary release I installed segfaults, and the debug build I have been testing with aborts. It looks like it's happening with the same translation units. I think the binary release I was testing with on Linux is multiplier-c18052b.tar.xz.
It would make sense for it to sigsegv where it does, unfortunately. In general, the multiplier codebase follows the practice of opportunistic assertions, i.e. ones that can have a fail-safe fallback in a release build. However, there are some places where fail-safes are impossible, and you're hitting one. For example, if a method returned std::optional<Decl>, then the fail-safe is to:
assert(false);
return std::nullopt;
In your case, the return value is meant to be a complete Decl object, so there is no value that we can substitute, and when it goes on to do follow-up invocations on the Decl, it uses a null pointer. The assertion failure is just a slightly earlier manifestation of the same underlying problem.
Ah of course - had completely slipped my mind that assert gets nop'd in release builds.
Just for sanity's sake, I built a release multiplier on my local Mac and am still seeing a segfault in the same spot.
Alright, I'm unable to reproduce things on mac/linux with just the tarball provided. If you're willing to, can you share your Linux kbuild config, and the git hash of the version of the kernel you cloned, and if it wasn't the primary torvalds one, then the repo URL? You can optionally share over email to peter ampersand trailofbits period com.
Finally, out of curiosity, have you been satisfied with the time it takes to index code? Were there unexpected hiccups along the way? General feedback on the usability of the tools would be helpful.
Kernel commit: c44d83ae4d9d5d4c150fd845af3ef14633767c2e
Ran make defconfig on my Ubuntu 22.04.4 system to configure the build
Generated compile_commands.json using https://github.com/amezin/vscode-linux-kernel/blob/master/generate_compdb.py (prior to coming across multiplier) - now that I think about it, maybe this could be part of the culprit?
If I recall correctly, it took me roughly 2 hours to generate the index for this Linux kernel, which I think is definitely well within an acceptable and usable timeframe. I used Sourcetrail (RIP) for this a few years ago and it took quite a long time (at least a day, IIRC). I've so far been very impressed with both the baseline examples/utilities. I think a brief API primer would be super useful and help to grok the codebase (especially for those of us with severely out-of-date C++ experience).
All in all, this is pretty awesome. It sure beats grep + Intellisense in terms of flexibility!
I think the Python API is probably where you want to live. When installed, the Python API installs some stub files, which can be used with PyCharm to support auto-completion. A nice thing about using the Python API, which is just as powerful as the C++ API, is that it auto-downcasts things, unlike in C++ where you need to do things like FunctionDecl::from.
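As a rough illustration of what an explicit checked downcast of the FunctionDecl::from kind looks like on the C++ side, here is a self-contained sketch. The Kind enum, the Decl and FunctionDecl structs, and the exact signature of from are simplified assumptions for this example. They are not Multiplier's real class definitions.

```cpp
#include <optional>
#include <string>

// Hypothetical, simplified stand-ins; not Multiplier's real classes.
enum class Kind { Function, Record };

struct Decl {
  Kind kind;
  std::string name;
};

struct FunctionDecl {
  Decl base;

  // Checked downcast: yields a FunctionDecl only when the generic Decl really
  // is a function, and an empty optional otherwise.
  static std::optional<FunctionDecl> from(const Decl &decl) {
    if (decl.kind != Kind::Function) {
      return std::nullopt;
    }
    return FunctionDecl{decl};
  }

  const std::string &name() const { return base.name; }
};
```

In C++ the caller has to request the narrower type and check the result, for example with if (auto fn = FunctionDecl::from(decl)) { ... }. The Python bindings are described above as doing that narrowing automatically.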
Alright, I am working to reproduce things. Here's my current approach:
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.9.10.tar.xz
tar xf linux-6.9.10.tar.xz
cd linux-6.9.10
CCC_OVERRIDE_OPTIONS="# x-Werror" CC=`which clang-14` LD=`which ld.lld-14` make LLVM=1 defconfig
CCC_OVERRIDE_OPTIONS="# x-Werror" CC=`which clang-14` LD=`which ld.lld-14` make LLVM=1 -j48
env > env_vars.txt
git clone git@github.com:amezin/vscode-linux-kernel.git .vscode
python .vscode/generate_compdb.py
mx-index --db linux.db --workspace linux.ws --target compile_commands.json --env env_vars.txt
I'm using a debug build of mx-index just in case something interesting pops up there.
Reproduced :-D
I think the issue is this. There is a declaration roughly like the following:
struct Foo {
__typeof__(struct Bar) *bar;
};
And suppose there is not yet a forward declaration of struct Bar. Then the struct Bar within the __typeof__ will act as the forward declaration and canonical declaration for Bar. This seems a significant enough difference from the normal case we face, which is more akin to:
struct Foo {
struct Bar *bar;
};
In this latter "normal" case, Multiplier has special logic to "hoist" that struct Bar forward declaration into its own floating fragment (the unit of code deduplication in multiplier). It does this because deduplication is partially AST structure aware. Later, there's a fixpoint step to go and add stuff to the fragment. This is a simplified explanation, but I think the bug lives somewhere in the hoisting logic not triggering, combined with there being different translation units where struct Bar is declared before struct Foo. I'll need to investigate further.
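For anyone who wants to poke at the two shapes side by side, here is a small self-contained sketch. It is compiled as C++ and relies on the GNU __typeof__ extension (supported by GCC and Clang). The names ViaTypeof, Plain, First, and Second are made up for this illustration, and the sketch only shows the language-level behavior, not how Multiplier hoists or deduplicates fragments.

```cpp
#include <cassert>
#include <type_traits>

// Case 1: the first mention of struct First is inside __typeof__, so that
// mention is what introduces (forward-declares) the type.
struct ViaTypeof {
  __typeof__(struct First) *ptr;
};

// Case 2: the "normal" shape, where the first mention of struct Second is a
// plain elaborated type specifier in the field declaration.
struct Plain {
  struct Second *ptr;
};

// The definitions arrive later and complete both forward declarations.
struct First {
  int value;
};
struct Second {
  int value;
};

// Both fields end up as ordinary pointers to the completed types.
static_assert(std::is_same<decltype(ViaTypeof::ptr), First *>::value,
              "__typeof__ form names the same type");
static_assert(std::is_same<decltype(Plain::ptr), Second *>::value,
              "plain form names the same type");
```

From the compiler's point of view the two fields are equivalent. The difference is only in where the first declaration of the struct appears in the AST, which is the part that matters for the hoisting logic described above.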
Interesting. Glad you’re able to reproduce! The Linux kernel is great for edge cases 😂
I think I have this fixed in my branch. Along the way I noticed some things and fixed them too, and then scope creeped into adding some features lol. Hopefully by tomorrow there'll be an updated main for you to test. One important note: you will need to re-generate your databases.
Right on! I'll give that a shot as soon as it drops. You're the 💣!
You might overall want to do a fresh clone & rebuild, because I've changed a submodule (VAST) to point to a branch of the main repo, and I've bumped the PASTA submodule. It's also easy to go in and pull the latest there. If you go that route, then to rebuild, do the following within the build directory.
rm -rf vendor/install/vast/include
pushd vendor/vast/build ; ninja install ; popd
pushd vendor/pasta/build ; ninja install ; popd
ninja install
Let me know if the issue is resolved for you! If so, then I can close out this issue :-D
Currently struggling to complete a clean build. I keep getting a bunch of undefined reference errors when linking:
ld.lld: error: lib/libmultiplier.so: undefined reference to llvm::EnableABIBreakingChecks [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to llvm::ilist_detail::SpecificNodeAccess<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false> >::getNodePtr(mlir::Operation*) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to llvm::ilist_detail::SpecificNodeAccess<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false> >::getValuePtr(llvm::ilist_node_impl<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false> >*) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::RewriterBase::inlineRegionBefore(mlir::Region&, mlir::Region&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Block, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::RewriterBase::cloneRegionBefore(mlir::Region&, mlir::Region&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Block, true, false, void, false>, false, false>, mlir::IRMapping&) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::RewriterBase::inlineBlockBefore(mlir::Block*, mlir::Block*, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false>, false, false>, mlir::ValueRange) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::RewriterBase::splitBlock(mlir::Block*, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to llvm::ilist_traits<mlir::Block>::transferNodesFromList(llvm::ilist_traits<mlir::Block>&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Block, true, false, void, false>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Block, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to llvm::ilist_traits<mlir::Operation>::transferNodesFromList(llvm::ilist_traits<mlir::Operation>&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::ConversionPatternRewriter::inlineRegionBefore(mlir::Region&, mlir::Region&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Block, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::OpBuilder::createBlock(mlir::Region*, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Block, true, false, void, false>, false, false>, mlir::TypeRange, llvm::ArrayRef<mlir::Location>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::RewriterBase::cloneRegionBefore(mlir::Region&, mlir::Region&, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Block, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::ConversionPatternRewriter::splitBlock(mlir::Block*, llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
ld.lld: error: lib/libmultiplier.so: undefined reference to mlir::Block::splitBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<mlir::Operation, true, false, void, false>, false, false>) [--no-allow-shlib-undefined]
clang++-18: error: linker command failed with exit code 1 (use -v to see invocation)
[1580/3736] Building CXX object bin/Examples/C...iles/mx-list-functions.dir/ListFunctions.cpp.o
ninja: build stopped: subcommand failed.
This is coming after blowing away everything in build, install, and src and starting from scratch from the instructions in BUILD.md. Not sure exactly what's going on here.
Hrmm okay, there's going to be a missing library reference somewhere. I will try a fresh rebuild later today and see if I can replicate this.
Retrying the build from scratch again just in case there is an environment thing going on. I had a few ssh sessions to the box I'm working on and it's possible I messed something up.
Regardless - I grabbed the Linux binary release, generated a new index from that, and ran mx-find-linked-structures on it, and it ran to completion, so that is awesome! Will keep you posted on the build issue.
My guess is that it has to do with a debug build. I've got a new branch for it, with a minor change to the LLVM configuration. I'm doing a fresh debug build and will see if I can repro it.
I think it was an environment thing on my end - just got a debug build to succeed on Linux, and mx-find-linked-structures from that build just ran to completion, so I think we can put a bow on this one.
Thank you again for all your support, and this excellent tool! It's on my Github watch list now, and I'll definitely be prodding at it some more as well as figuring out the Python bindings.
Running mx-find-linked-structures on an indexed Linux kernel, and running into an assertion error:
Built a debug version and it looks like it starts from FindLinkedStructures.cpp:76 in the first phase of analysis. Happy to provide more debugging info if that would be helpful.
Backtrace:
System info
Ubuntu 22.04.4
multiplier tag: c18052b