Closed Orycterope closed 2 years ago
because
Option<unsafe extern "C" fn(...)>
is properly optimized to be the same size as the function pointer
This is correct, and is guaranteed by rustc. Here's one instance of the documentation guaranteeing it:
The most common type that takes advantage of the nullable pointer optimization is
Option<T>
, whereNone
corresponds tonull
. SoOption<extern "C" fn(c_int) -> c_int>
is a correct way to represent a nullable function pointer using the C ABI (corresponding to the C typeint (*)(int)
).
I am very confident we're looking at a rustc compiler bug.
The issue is this typedef, in yara-sys/bindings/yara-4.1.2-windows.rs
:
pub type size_t = ::std::os::raw::c_ulonglong;
This typedef is valid on x64, but not on x86, where size_t is 4 bytes.
The issue is that, with the bundled-4_1_2
, we are using pre-generated bindings that have been generated on x64, which are not valid on x86.
This actually only impacts the windows version. On linux, it is bound to long
, which is ok on 32 and 64 bits.
So right now, the bundled-4_1_2
is not safe to use when compiling for 32-bits.
A possible solution (should be safe afaict) is that those bindings should depend on the target used. If no pre-generated bindings exist for a given target, the build.rs should fail.
There is also the size_t_is_usize flag that could be used: https://docs.rs/bindgen/0.59.1/bindgen/struct.Builder.html#method.size_t_is_usize But i'm not sure it is safe to reuse a binding file generated for a target with another target like we can right now.
I think the bundled pre-generated bindings should really be target-specific.
Also, they are meant as an easy way for a developer to quickly start working with Yara without installing the headers, but IMHO, generating the bindings on the go should be the preferred way.
@Hugal31 I can make PR with pre-generated bindigs for target. I think, is better after merge: https://github.com/Hugal31/yara-rust/pull/55
I think the bundled pre-generated bindings should really be target-specific.
Also, they are meant as an easy way for a developer to quickly start working with Yara without installing the headers, but IMHO, generating the bindings on the go should be the preferred way.
@Hugal31 I can add generate bindings for target here: https://github.com/Hugal31/yara-rust/pull/78. What do you think?
@Hugal31 I can add generate bindings for target here: #78. What do you think?
Yes, that sounds good better than nothing. Once we have a binding generated for a few targets, we can restrain the use of those bindings to those specific targets and avoid UB like this one.
@Hugal31 @vthib @Orycterope @roblabla I create PR: https://github.com/Hugal31/yara-rust/pull/80 with per-target bindings (I get targets here: https://doc.rust-lang.org/nightly/rustc/platform-support.html#tier-1-with-host-tools)
When I'm compiling for windows x86, the following trivial test fails:
On the line corresponding to
rules_scan_mem().unwrap()
. Error 53 corresponds toERROR_CALLBACK_REQUIRED
.In yara-rust, the call boils down to this code in internals::scan :
Some(scan_callback)
is the function pointer to the callback, andp_callback
is the user data that will be passed by libyara as an argument toscan_callback
. (It's a pointer the actual callback closure that will be reconstructed by ourscan_callback
and then called, but this is not relevant here).The thing is:
Some(scan_callback)
is definitely not a function pointer, and does not have the same size as a function pointer.Here are the arguments that rust pushes on the stack before doing the FFI call:
This is one stack slot more than expected by the C function
yr_rules_scan_mem
, because of the space needed for the Option's variant tag. Because of this,yr_rules_scan_mem
sees all variables shifted by one slot:callback is now 0x0 (the tag for variant Some), user_data is now callback, and timeout amounts to the address of user_data as seconds.
Officially,
yr_rules_scan_mem
is defined in rules.h aswhere
YR_CALLBACK_FUNC
isDefinitely no Option here. However, bindgen (I'm using the bundled bindings) has generated:
No idea where this Option comes from.
Somehow this bug is miraculously not triggered in x64, either because the ABI is different (no stack slots, args are passed via registers), or because
Option<unsafe extern "C" fn(...)>
is properly optimized to be the same size as the function pointer, just likeOption<NonNull<T>>
is.For now my conclusion is that bindgen is at fault here. I'll try to pinpoint the reason, and when I find it I'll open an issue on their repo linking to this one.