Open Abyss-W4tcher opened 5 months ago
Hello, has anyone had a chance to look into a solution?
Unfortunately, all ISFs generated after Linux kernel 6.5 are currently invalid. :/
Has anyone here made any progress on this? If the main symbol table format needs changing, that's possible, but I don't yet fully understand what these new structures are or how they relate. Hopefully someone can give me a rundown so we can figure out a way to sort them appropriately...
The Ubuntu (Linux) kernel includes Rust bindings for existing C APIs. They can be inspected in a sample source package: https://bugs.launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+build/26995753/+files/linux-lib-rust-6.5.0-14-generic_6.5.0-14.14_amd64.deb.
Related to this issue, we can check out the fs_struct binding (usr/src/linux-lib-rust-6.5.0-14-generic/rust/bindings/bindings_generated.rs):
#[repr(C)]
#[derive(Copy, Clone)]
pub struct fs_struct {
    _unused: [u8; 0],
}
The problem is that we now have two fs_struct structs inside the vmlinux DWARF information: one is the "classical" C struct and the other is a Rust binding. My guess is that dwarf2json processes everything directly, instead of iterating over DW_TAG_compile_unit (check the first comment of this issue).
To avoid completely breaking the existing ISF format, we could prefix every extracted Rust binding/data with something like "rust.", resulting in:
edit: There might be confusion with cross-references, so this is probably not a workable idea (unless handled correctly?). Storing all Rust content under additional keys (rust_symbols, rust_types, ...) might be required, but this would also break with Volatility.
That seems reasonable if it becomes a unique namespace (which it sounds like "rust." or "<language>." would). Does anyone have an idea how much effort that would be to add to dwarf2json?
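To make the namespace idea concrete, here is a minimal Python sketch of prefixing type names by source language and rewriting cross-references with the same prefix. Everything in it is an assumption for illustration: the record layout, the sizes, the `Wrapper` type and the helper names are invented, and this is not how dwarf2json is actually implemented.

```python
# Hypothetical sketch: the record layout and the sizes are invented for
# illustration; this is not dwarf2json's real data model.

def namespaced(name, language):
    """Leave C names untouched; prefix other languages, e.g. 'rust.fs_struct'."""
    return name if language == "c" else f"{language}.{name}"


def build_user_types(records):
    """Build one flat type map in which Rust and C names cannot collide."""
    user_types = {}
    for rec in records:
        lang = rec["language"]
        # Cross-references get the same prefix, assuming a type only refers
        # to types of its own language.
        fields = {field: namespaced(target, lang)
                  for field, target in rec["fields"].items()}
        user_types[namespaced(rec["name"], lang)] = {
            "size": rec["size"],
            "fields": fields,
        }
    return user_types


records = [
    {"name": "fs_struct", "language": "c", "size": 56,
     "fields": {"root": "path", "pwd": "path"}},
    {"name": "fs_struct", "language": "rust", "size": 0, "fields": {}},
    {"name": "Wrapper", "language": "rust", "size": 8,
     "fields": {"inner": "fs_struct"}},
]

user_types = build_user_types(records)
print(sorted(user_types))
```

Existing consumers that look up plain fs_struct would keep getting the C definition, while the Rust binding stays reachable under its own name. The cross-reference concern only goes away if references never cross the language boundary, which this sketch simply assumes.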
Sorry for the delay on this, I was able to discuss this with the dwarf2json maintainers.
This issue and discussion have been about the conflict between Rust and C types. However, we believe that a conflict between Rust and C symbols is also possible. We think modifying the current schema, rather than adding a prefix to the type names, is probably the best way to avoid these collisions between Rust and C types. For example, the new top-level schema could look close to this:
{
    metadata: {},
    base_types: {},
    base_types_rust: {},
    user_types: {},
    user_types_rust: {},
}
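As a rough illustration of the schema above, the following Python sketch routes each extracted type into a per-language section. The input records, their fields and the helper names are hypothetical; only the top-level key names come from the proposal.

```python
import json

# Top-level keys taken from the proposed schema above.
TOP_LEVEL_KEYS = ("metadata", "base_types", "base_types_rust",
                  "user_types", "user_types_rust")


def section_for(kind, language):
    """C keeps the existing section name; other languages get a suffix."""
    return kind if language == "c" else f"{kind}_{language}"


def build_isf(records):
    """Place every record in the section matching its kind and language."""
    isf = {key: {} for key in TOP_LEVEL_KEYS}
    for rec in records:
        isf[section_for(rec["kind"], rec["language"])][rec["name"]] = rec["body"]
    return isf


# Invented sample records, for illustration only.
records = [
    {"kind": "base_types", "language": "c", "name": "int",
     "body": {"size": 4, "signed": True}},
    {"kind": "base_types", "language": "rust", "name": "u8",
     "body": {"size": 1, "signed": False}},
    {"kind": "user_types", "language": "c", "name": "fs_struct",
     "body": {"kind": "struct"}},
    {"kind": "user_types", "language": "rust", "name": "fs_struct",
     "body": {"kind": "struct", "size": 0}},
]

isf = build_isf(records)
print(json.dumps(isf, indent=2))
```

Both fs_struct definitions survive because they land in different sections, and a reader that only knows about user_types and base_types would see the C view unchanged.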
Can this new schema work with volatility3, or will changes need to be made there as well?
Separating the user types and the base types should be straightforward, but separating C symbols and Rust symbols will be more complex. This is because symbols can come from different sources, such as System.map, DWARF, and the symbol table, and whether Rust and C symbols collide depends on the input source.
I'm currently looking into addressing this, but it will take some time. In the meantime, a solution could be to skip Rust compilation units altogether to avoid the collision, and then add them back after deciding on a solution.
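The interim workaround can be sketched in Python as a filter on each compilation unit's language code. The unit records and file names below are invented stand-ins for what a DWARF reader would provide, and the language constants are the DW_LANG_Rust and DW_LANG_C11 codes from the DWARF 5 standard; how dwarf2json does this internally is not shown here.

```python
DW_LANG_RUST = 0x1C  # DW_LANG_Rust in the DWARF 5 standard
DW_LANG_C11 = 0x1D   # DW_LANG_C11 in the DWARF 5 standard


def collect_types(units, skip_languages=frozenset({DW_LANG_RUST})):
    """Merge the types of all units, ignoring units in a skipped language."""
    types = {}
    skipped = []
    for unit in units:
        if unit["language"] in skip_languages:
            skipped.append(unit["name"])  # remembered so they can be added back later
            continue
        types.update(unit["types"])
    return types, skipped


# Invented compilation units, for illustration only.
units = [
    {"name": "fs/fs_struct.c", "language": DW_LANG_C11,
     "types": {"fs_struct": {"size": 56}}},
    {"name": "rust/bindings/lib.rs", "language": DW_LANG_RUST,
     "types": {"fs_struct": {"size": 0}}},
]

types, skipped = collect_types(units)
print(types, skipped)
```

With the Rust unit skipped, the zero-sized binding can no longer overwrite the C definition of fs_struct in the merged map.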
Hello, looking at a sample System.map, there is no way to tell precisely which compile unit a symbol originates from, even though some of them are conveniently prefixed with rust_:
ffffffff818095c0 T rust_fmt_argument
Many cannot be determined precisely :
ffffffff81809570 T _RNvXs0_NvNtNtCsbwHtcUjRN57_6kernel4sync7condvar1__NtB7_7CondVarNtNtNtBb_4init10___internal10HasPinData10___pin_data
Those are exported explicitly in the Ubuntu Rust bindings:
EXPORT_SYMBOL_RUST_GPL(rust_fmt_argument);
EXPORT_SYMBOL_RUST_GPL(_RNvXs0_NvNtNtCsbwHtcUjRN57_6kernel4sync7condvar1__NtB7_7CondVarNtNtNtBb_4init10___internal10HasPinData10___pin_data);
However, I noticed that when exported through EXPORT_SYMBOL_RUST_GPL, these "rust" symbols were labeled under the "GNU C11" compile unit in the vmlinux, so in the same pool as regular C symbols. So, in fact, the symbols in System.map are not "language"-labeled by design.
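Since System.map carries no language information, about the best that can be done from that file alone is a name-based guess. The Python sketch below parses System.map lines and flags names that start with rust_ or with _R, the prefix used by Rust's v0 symbol mangling. It is only a heuristic under those assumptions: a C symbol could share either prefix, and an unmangled Rust symbol without the rust_ prefix would be missed. The _text line is an invented C entry for contrast.

```python
def parse_system_map(text):
    """Yield (address, type, name) for each 'ADDR TYPE NAME' line."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            yield int(parts[0], 16), parts[1], parts[2]


def looks_like_rust(name):
    """Heuristic only: Rust v0 mangled names start with '_R'."""
    return name.startswith(("_R", "rust_"))


SAMPLE = """\
ffffffff818095c0 T rust_fmt_argument
ffffffff81809570 T _RNvXs0_NvNtNtCsbwHtcUjRN57_6kernel4sync7condvar1__NtB7_7CondVarNtNtNtBb_4init10___internal10HasPinData10___pin_data
ffffffff81000000 T _text
"""

rust_names = [name for _, _, name in parse_system_map(SAMPLE)
              if looks_like_rust(name)]
print(rust_names)
```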
FYI, PR https://github.com/volatilityfoundation/dwarf2json/pull/65 uses namespace prefixes, which makes it possible to keep the existing schema while resolving conflicts and separating types and symbols. Of course, it is open for review :)
edit: Even without Rust support, some symbols appear multiple times in the same System.map/symbols list (see https://patchwork.kernel.org/project/linux-kbuild/patch/20230714150326.1152359-1-alessandro.carminati@gmail.com/). This can be checked with awk '{print $3}' System.map | sort | uniq -d.
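For anyone who prefers to run the duplicate check from Python, here is a small equivalent of the awk pipeline above. The sample addresses and symbol names are invented, and the sort order follows Python's default string ordering, which may differ from sort under some locales.

```python
from collections import Counter


def duplicated_symbols(text):
    """Return the sorted symbol names that occur more than once."""
    names = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            names.append(parts[2])  # third column, like awk '{print $3}'
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


# Invented sample, for illustration only.
SAMPLE = """\
ffffffff81001000 t example_init
ffffffff81002000 t example_init
ffffffff81003000 T example_exit
"""

print(duplicated_symbols(SAMPLE))
```

To check a real file, pass it the file contents, for example duplicated_symbols(open("System.map").read()).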
I opened a PR that will be a short-term fix for this problem and will unblock existing plugins. I will keep this issue open to continue discussing how Rust types/symbols should be integrated and how that will affect the current schema.
Hi,
while investigating #57, I noticed the issue started appearing around the integration of Rust in the Linux kernel. With a bit more debugging, I was able to confirm that some bindings were being processed by dwarf2json in the same pool as C struct names:
Should these bindings, or more broadly all Rust content, be processed separately from the regular structures? I don't think they should be discarded, but maybe stored under a different parent key in the ISF?