Open qinsoon opened 7 years ago
Yes, I think the issue is that in C, the alignment is done for all the struct after union elements are inserted, then in C we do not align the union itself. I do not think we can do it in Rust.
We can force the union to be aligned in 4bytes by only using 4bytes fields or less, then for example *mut c_void
become [u32; 2]
for 64 bits architectures.
It is passing the test now, but it's not a clean solution. We have then to define the size of these arrays conditionally in function of the target while the types which can be longer than 4 bytes are already defined conditionnaly in libc.(when const fn
will stabilize, we could use [u8; std::mem::size_of::<*mut c_void>()]
though)
The other bad point is that users of libc
would be forced to transmute
these fields to access them as their correct type.
Maybe the solution would be to let this as is, and let higher-level crate as nix
provides a safe interface for the padding bytes ?
Thanks for the reply. I was a bit surprised as I thought with the introduction of union
types, it is always possible to declare identical types in Rust as in C. I don't see there is a neat way to deal with this case. Providing accessor to the padding bytes always adds overhead (though it doesn't matter in this case - signal handling).
Also nix
is totally ignoring this part as well. As for extracting info from siginfo_t
and ucontext_t
, it seems more convenient to just write the code in C and call form Rust (which is as efficient, but is a bit awkward for a Rust project).
Yeah right now the handling of siginfo_t
upstream in rust-lang/rust is pretty nasty. Until we get unions this unfortunately isn't really gonna get much better.
@qinsoon Marking the si_fields_t
union as #[repr(C, packed)]
instead of just #[repr(C)]
makes it the same size as the original one. Is this correct / does this help?
Anyone interested on working on this anytime soon? It appears that @vojtechkral suggest a way to solve this. I can mentor.
Should this be in the 1.0 roadmap? #547
I'd be interested in working on this, at least far enough to get si_addr
:-) Would a patch that just added enough to read that be acceptable, or do you want this to solve this completely?
@alex what do you mean by not solving this completely? the proposed change should just work, it will only need a bit of cfg-magic to be active on newer Rust versions.
Not adding every single field :-)
As long as the struct layout is correct, if the union field is private, we can do whatever we want (as long as we only use the union on toolchains that support it).
If the question is whether we can make the field a public union, and then add newer union fields incrementally over time, I haven't thought about that, but I think it's a good question. I've asked it here: https://github.com/rust-lang-nursery/api-guidelines/issues/196
Ah yes, that does suggest a good question: do we want to just expose the underlying unions, or add methods on siginfo_t
which access the underlying field?
If C exposes the union we should do that. If C has functions to expose the fields, we can add those. I don't think C has methods, so doing that would be something more appropriate for the nix
crate.
In C the field containing a union has a name with an underscore
_sifields
, and then there are macros like #define si_addr _sifields._sigfault._addr
. So you can do some_siginfo_t.si_addr
(and
you can't really have a field named si_addr
on any other struct, but oh
well).
On Fri, Apr 12, 2019 at 8:35 AM gnzlbg notifications@github.com wrote:
If C exposes the union we should do that. If C has functions to expose the fields, we can add those. I don't think C has methods, so doing that would be something more appropriate for the nix crate.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rust-lang/libc/issues/716#issuecomment-482557924, or mute the thread https://github.com/notifications/unsubscribe-auth/AAADBD10ePsPypLCiLRuXizwOZMTgjdXks5vgH2RgaJpZM4OxrDb .
-- All that is necessary for evil to succeed is for good people to do nothing.
We typically map those macros to Rust functions, so I think we can just keep the field private (in which case it does not matter whether it is implemented using an union
or not), and just provide functions to access the fields. We can probably make things a bit more concrete once we have a PR open with an initial proposal.
I'll try to get to it later today!
On Fri, Apr 12, 2019, 9:07 AM gnzlbg notifications@github.com wrote:
We typically map those macros to Rust functions, so I think we can just keep the field private (in which case it does not matter whether it is implemented using an union or not), and just provide functions to access the fields. We can probably make things a bit more concrete once we have a PR open with an initial proposal.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rust-lang/libc/issues/716#issuecomment-482567616, or mute the thread https://github.com/notifications/unsubscribe-auth/AAADBFnxwr7GrtUiL8FEVuVijnjFOtApks5vgIUegaJpZM4OxrDb .
So the padding field of siginfo_t
was not private as it should have been. This means that using an union
instead would be a hard breaking change. Once we have a PR that implements the proposed change, we can try and see if there is any code out there actually relying on that, but this might be something that might need to wait until the next breaking release.
Figured I'd drop by and leave my 2 cents here as well since I'm looking into siginfo_t
for a project.
The kernel's siginfo_t
is NOT packed, and should not be declared as such in Rust (re: vojtechkral's comment). The siginfo_t
structure looks like this in memory (x86_64
) when there's a process that was/needs to be waited on:
|11000000 00000000 01000000 00000000| ................ 00000000
|d2620000 e8030000 00000000 00000000| .b.............. 00000010
|00000000 00000000 00000000 00000000| ................ 00000020
|00000000 00000000 00000000 00000000| ................ 00000030
|00000000 00000000 00000000 00000000| ................ 00000040
|00000000 00000000 00000000 00000000| ................ 00000050
|00000000 00000000 00000000 00000000| ................ 00000060
|00000000 00000000 00000000 00000000| ................ 00000070
The first 3 4-byte values are the fields si_signo
, si_errno
, and si_code
. The 4th 4-byte field (all zeros) is padding. This padding is caused by the _sifields
union having a member that needs to be 8-byte aligned on 64-bit systems (_sigfault
and _sigsys
union fields are the obvious culprits here, as the first members in those structs are pointers).
It looks like rust's union implementation needs to be fixed to inherit alignment requirements from its members, which doesn't sounds particularly difficult - just caclulate the required alignment for each, and use an alignment that satisfies all of them (which will usually just be the max alignment present). The Edit: leaving this for posterity, but it's not correct - I was having some alignment issues of my own. After fixing those, the union alignment was working correctly.packed
attribute would just over-rule that alignment decision and append the union directly after the previous field.
In the rust playground link, the forced 8-byte alignment throws off the padding calculation (which is actually garbage pre linux 4.20). It doesn't take into account that the compiler will add padding between the 3 integers and the union on 64 bit systems, which makes the padding 4 bytes too long. This isn't an issue in libc right now because those fields in the union are omitted.
Recent versions of the linux kernel (since 4.20) have changed the definition of siginfo_t
to be less confusing. It's now effectively:
struct siginfo_t {
union {
int padding[128 / sizeof(int)]; // int padding[32]
struct {
int si_signo;
int si_errno;
int si_code;
union __sifields _sifields;
}; // anonymous union field
};
};
Which is much clearer (despite that it could have been better described as a union at the top level...). This removes the bizarre pad: [libc::c_int; 29]
, and in general helps to restructure something that was rather bizarre.
So it would probably be better just to update the siginfo_t
definition in rust-lang/libc
to reflect the new definition in the kernel. This could be done in another module to prevent breaking existing code that relies on the current siginfo_t
, so long as there's no issue with having two definitions present (fracturing generally isn't good, but if you want to not introduce a breaking change it may be worth it).
@nbsdx Thanks for investigating. FWIW I have no memory of that comment I wrote and, more specifically, why I thought so :)
@vojtechkral no problem, packing it was my initial solution as well until I dug into it more once I wasn't able to access fields properly :)
The definition of
siginfo_t
in Linux (https://github.com/torvalds/linux/blob/master/include/uapi/asm-generic/siginfo.h) uses a inlinedunion
type as its last field. Rust did not have support for union types, I guess that was why those fields were left as a padding array in currentlibc
implementation.I made some attempts to use
union
(introduced in Rust 1.19) to representsiginfo_t
. However the resulting struct is 8 bytes larger than the original one (the one currently in libc, which is the same size as the C struct type). See the code here: https://play.rust-lang.org/?gist=99630206fcc8fd44b452d22552b37793&version=stableMy guess is the 8 more bytes come from two places:
siginfo_t
are 12 bytes (3x 4-bytes integer), so the padding is needed after the first 3 fields, which makes the offset of the union type at 16 bytes (instead of 12 bytes)I am not sure why the C definition does not have issues like this (probably because it uses inlined
union
in the struct definition) and how to properly implementsiginfo_t
in Rust. Please correct me if I am wrong. For my project, I have to extract data from the padding array (which looks awful).