Open LinkTed opened 4 years ago
send_vectored_with_ancillary
should switch to non-mutable references for its parameters, since sendmsg
doesn't write through them.
https://github.com/rust-lang/rust/pull/79753 changes just the bufs
parameter, but it was closed for inactivity.
(Edit: Hmm, actually changing the ancillary data too would require both a SocketAncillary and SocketAncillaryMut, like IoSlice. Probably not worth it there)
@tadeokondrak I open a new PR #82589.
I create a new PR to send and receive TTL for UdpSocket from the ancillary interface. #82858
Hi @LinkTed, I posted a fix in https://github.com/rust-lang/rust/pull/83374 for macOS, OpenBSD, and possibly other BSDs.
What has to be done to stabilize this API?
I think the API still needs polish. All those niche socket features are tricky and right now it's probably not extensible enough for 3rd party libraries to fill platform-specific gaps such as flags or cmsgs only supported on one particular OS. These basically are wrappers for the recvmsg
and sndmsg
syscalls, we should expose them in a way that's forwards-compatible so we don't have to add recv_vectored_with_ancillary_from_with_flags
or something like that in the future.
Before stabilizing we should consider whether recv_vectored_with_ancillary_from
should gain another argument. I would recommend adding a ReceiveOptions
argument that allows one to specify flags, similar to OpenOptions
. The options could then be used to set linux' MSG_DONTWAIT
, MSG_PEEK
or MSG_ERRQUEUE
for example, maybe via custom_flags()
similar to OpenOptions
. That way we don't have to support all OS-specific peculiarities.
Or we could put the flags into SocketAncillary
or at least reserve the right to do that in the future, but in that case the struct would need to be renamed because it would serve several purposes at once.
Similar could be done for sending.
Additionally the AncillaryData
enum should be marked as #[non_exhaustive]
and maybe gain a Unknown
variant for things that the standard library doesn't support. And a way to get the raw ancillary message. That way one could for example decode struct sock_extended_err
in a 3rd party crate without waiting for std to implement it.
@the8472 Thank you for your feedback. If this PR request is #82858 merged I will start implementing your suggestions.
What about CMSG_SPACE? Without it, we're forced to either guess or probe the appropriate buffer size.
https://man7.org/linux/man-pages/man3/cmsg.3.html SCM_RIGHTS example shows how it's used to compute the correct buffer size (with some tricky business wrt alignment).
Hi @tv42,
what are the difference between CMSG_SPACE
and CMSG_LEN
?
Is there a function, which calculate the alignment for a struct to a specific address in Rust?
what are the difference between
CMSG_SPACE
andCMSG_LEN
?
CMSG_SPACE includes alignment bytes, CMSG_LEN doesn't.
Is there a function, which calculate the alignment for a struct to a specific address in Rust?
I don't know what this means.
what are the difference between
CMSG_SPACE
andCMSG_LEN
?CMSG_SPACE includes alignment bytes, CMSG_LEN doesn't.
Is there a function, which calculate the alignment for a struct to a specific address in Rust?
I don't know what this means.
I thought that the alignment depends on the address, where you want to store the struct.
What it means for CMGS_SPACE to include “alignment bytes” is that it rounds the size up to include a number of padding bytes to make sure the next header is properly aligned (to the alignment of cmsghdr).
So, the address have to be a multiple of the return value of CMSG_ALIGN
, right?
The address of the cmsghdr object has to be a multiple of std::mem::align_of::<cmsghdr>()
.
Sorry, that I ask again. Does the macro CMSG_ALIGN
the same as std::mem::align_of::<cmsghdr>()
?
CMSG_ALIGN(len)
rounds len up to a multiple of std::mem::align_of::<cmsghdr>()
. For instance, if the required alignment is 8, and len is 11, CMSG_ALIGN(len)
will return 16.
As an example, consider the following two headers:
let A = cmsghdr{ cmsg_len = 19, cmsg_level = X, cmsg_type = Y };
let B = cmsghdr{ cmsg_len = 22, cmsg_level = W, cmsg_type = Z };
If cmsg_len
were not rounded up, this sequence of headers would look like this in memory:
~~~~~~~~~A~~~~~~~~~ ~~~~~~~~~B~~~~~~~~~
+----+----+----+----+----+----+----+----+
Address: | 0 | 8 | 12 | 16 | 19 | 27 | 31 | 35 |
+----+----+----+----+----+----+----+----+
Data: | 19 | X | Y | _ | 22 | W | Z | _ |
+----+----+----+----+----+----+----+----+
Notice how header B is at improperly aligned address 19, which is not a multiple of the alignment of cmsghdr.
By rounding up the lengths (assuming an alignment of 8 for this example), we get this:
~~~~~~~~~A~~~~~~~~~ ~~~~~~~~~B~~~~~~~~~
+----+----+----+----+----+----+----+----+----+----+
Address: | 0 | 8 | 12 | 16 | 19 | 24 | 32 | 36 | 40 | 46 |
+----+----+----+----+----+----+----+----+----+----+
Data: | 19 | X | Y | _ | | 22 | W | Z | _ | |
+----+----+----+----+----+----+----+----+----+----+
~~~~ padding bytes ~~~~ padding bytes
Now, every header and each field in every header is at the aligned address it should be.
Thanks, @chloekek. Currently, the code is using CMSG_LEN
, which includes the alignment bytes. So, technically we have to ensure that the first cmsg
struct is properly aligned, right?
But, if cmsg_len
is rounded up, this means that the receiver interpreted the alignment bytes as data. For example, if the sender set 3 bytes, which then is aligned to 8, the receiver will think that he got 8 bytes. How do we handle this?
Actually, the man page says that you should include the padding in cmsg_len:
CMSG_LEN() returns the value to store in the cmsg_len member of the cmsghdr structure
This means that the receiver will indeed interpret those bytes as part of the data, I don’t think there’s any way around that?
This would mean that, if alignof(cmsghdr) = 8
, and sizeof(int) = 4
, you could only send an even number of file descriptors using SCM_RIGHTS. :sweat_smile: I guess if you want to send an odd number of file descriptors you could program the sender to set the odd one to -1
and program the receiver to ignore any -1
s.
Interesting, currently there is a test, which send one FD and I tested on a 64-bit system, where the alignment of cmsghdr
should be 8. 😀
What is suspicious is that CMSG_NXTHDR adds the alignment if necessary.
This could mean that either:
Note that while you could test this, other Unix implementations (e.g. BSD) may behave differently from Linux in this regard. It would be best to check the Official Spec™ (POSIX?).
As for file descriptors, the kernel will indeed divide the data length by the size of a file descriptor in order to compute how many file descriptors to send. And it seems that you indeed do have to set all of them to valid file descriptors, not to -1
.
I think the man page is wrong. CMSG_LEN does not include the alignment. Only CMSG_SPACE
as @comex suggested.
So to summarize:
So, if CMSG_NXTHDR
returns an aligned address, we have to ensure that the first cmsghdr
is aligned because then any subsequent cmsghdr
is aligned. Technically, we do not have to use CMSG_SPACE
.
By the way @chloekek. Here is the commit, where the CMSG_ALIGN
were removed.
The SocketAncillary::new
function currently takes any old &mut [u8]
, regardless of alignment. Do we want to enforce its alignment? It could be made safe with an assert!
ion on the address of the slice, but that would be rather annoying to use for users, as they would have to jump through hoops to make sure their buffer is properly aligned.
Yes, therefore I would add alignment into the SocketAncillary::new
function.
Rounding up the address of the slice in SocketAncillary::new
would mean that the actual length of the buffer can be up to 7 less than the length provided by the user (assuming an alignment of 8). Is this something we want? It is surprising behavior that would have to be documented well.
An alternative solution would be to provide data types for aligned buffers:
pub union SocketAncillaryBuf<const N: usize> {
// same trick as suggested in cmsg(3) man page
_align: cmsghdr,
buf: [u8; N],
}
pub struct SocketAncillarySlice<'a> {
// *buf is always aligned to cmsghdr, because we only export constructors that ensure that.
buf: &'a mut [u8],
}
impl<const N: usize> SocketAncillaryBuf<N> {
pub fn as_slice(&mut self) -> SocketAncillarySlice {
unsafe { SocketAncillarySlice::from_aligned_slice(&mut self.buf) }
}
}
impl<'a> SocketAncillarySlice<'a> {
pub unsafe fn from_aligned_slice(buf: &'a mut [u8]) -> Self {
Self{buf}
}
/// Warning: will shrink buf to fit alignment requirements if necessary.
pub fn from_unaligned_slice(buf: &'a mut [u8]) -> Self {
// adjust pointer of buf
}
}
Unfortunately, this would not work if the user wants to use a dynamically allocated buffer, like Vec<u8>
.
Rounding up the address of the slice in SocketAncillary::new would mean that the actual length of the buffer can be up to 7 less ..
Unfortunately, yes 😞
Rounding up the address of the slice in SocketAncillary::new would mean that the actual length of the buffer can be up to 7 less than the length provided by the user (assuming an alignment of 8). Is this something we want? It is surprising behavior that would have to be documented well.
Isn't this what CMSG_SPACE
is for, to know how big of a buffer you need to allocate to fit the given amount of payload.
Oh I see, this seems to about allocating at the right alignment. Yeah, you can't just secretly ignore some bytes from the beginning, because for typical use of cmsg(3)
, you pass in the exact right amount of bytes; that'd only give it e.g. 1/8th of a chance of working.
So, if CMSG_NXTHDR returns an aligned address, we have to ensure that the first cmsghdr is aligned because then any subsequent cmsghdr is aligned. Technically, we do not have to use CMSG_SPACE.
The purpose of CMSG_SPACE is to know how large a buffer to allocate. Not to assist in filling it.
I think the road forward is going to look something like
let num_fds = 7;
let inner_size = (std::mem::size_of::<libc::c_int>() * num_fds) as libc::c_uint;
let size = unsafe { libc::CMSG_SPACE(inner_size) } as usize;
let align = std::mem::size_of::<libc::cmsghdr>() as usize;
let align = align.next_power_of_two();
let layout = std::alloc::Layout::from_size_align(size, align)
.expect("alignment math for cmsghdr");
let ptr = unsafe { std::alloc::alloc(layout) };
let buf = unsafe { std::slice::from_raw_parts_mut(ptr, size) };
Wrapping that in a function in std::os::unix::net sounds like the right idea.
Maybe one for FDs, another for arbitrary data (taking just byte count, not FD count).
Err, lifetimes won't let you return a slice reference. Maybe returning Box<[u8]>
? This is me running into limits of my knowledge of Rust idioms.
Something like this maybe:
pub struct SocketAncillary<'a> {
buffer: &'a mut [u8],
length: usize,
truncated: bool,
}
impl<'a> SocketAncillary<'a> {
pub fn new(buffer: &'a mut [u8]) -> Self {
SocketAncillary {
buffer,
length: 0,
truncated: false,
}
}
pub fn with_capacity(size: usize) -> Self {
let size = unsafe { libc::CMSG_SPACE(size as u32) } as usize;
let align = std::mem::size_of::<libc::cmsghdr>() as usize;
let align = align.next_power_of_two();
let layout =
std::alloc::Layout::from_size_align(size, align).expect("alignment math for cmsghdr");
let ptr = unsafe { std::alloc::alloc(layout) };
let buffer = unsafe { std::slice::from_raw_parts_mut(ptr, size) };
Self::new(buffer)
}
pub fn with_capacity_for_fds(num: usize) -> Self {
let size = std::mem::size_of::<libc::c_int>() * num;
Self::with_capacity(size)
}
}
fn main() {
let s = SocketAncillary::with_capacity(42);
println!("{:?}", s.buffer);
}
let align = std::mem::size_of::<libc::cmsghdr>() as usize;` let align = align.next_power_of_two();
I wouldn't do this. In practice this value should be enough, because on all of the implementations I've seen (Linux, Darwin, OpenBSD, Windows), the alignment that
CMSG_ALIGN
aligns to is <= 8, while the size ofcmsghdr
is >= 12. But that's sort of a coincidence. There's no real reason to think the size ofcmsghdr
has any relationship to the required alignment, especially given that platforms other than Linux still have macros such asCMSG_SPACE
useCMSG_ALIGN(sizeof(struct cmsghdr))
(or similar) rather than justsizeof(struct cmsghdr)
. (And on OpenBSD at least this actually makes a difference, sincesizeof(struct cmsghdr)
is always 12 while the required alignment is 8 on 64-bit targets. In your proposed version,next_power_of_two()
would cover that, ending up at a requested alignment of 16, but again, that's sort of a coincidence.)
On platforms which define CMSG_ALIGN
, the required alignment should be equal to CMSG_ALIGN(0)
. Some platforms don't define it, but… since the libc
bindings are maintained by hand anyway, I suggest just inspecting the definition of CMSG_SPACE
on each platform to see what value it aligns to.
I think CMSG_ALIGN
is only useful in the following usage:
let address_of_previous_cmsghdr: *const cmsghdr = &...;
let address_of_next_cmsghdr: *const cmsghdr =
address_of_previous_cmsghdr.offset(CMSG_ALIGN(length_of_previous_cmsghdr));
And this usage is already encapsulated in the CMSG_NXTHDR
macro.
N.B. I think the correct way to read the name CMSG_ALIGN
is to think of “ALIGN” as a verb: it aligns the given value (by rounding it up). It does not return “the alignment”. I guess CMSG_ALIGN(0)
would be equivalent to that, though. :')
If you want to know the alignment for cmsghdr
, can’t you just use std::mem::align_of::<cmsghdr>()
? That should always give the correct answer. This is also what the union trick in the cmsg(3) man page implicitly relies on (i.e. it relies on rule that the alignment of a union type is the maximum alignment of its field types).
If you want to heap-allocate the buffer, then this cannot be done by SocketAncillary
itself, as SocketAncillary
borrows the buffer; it does not own the buffer and so must not hold on to heap-allocated memory, as it would leak.
So if we want to offer an API for aligned buffers, it should be a separate structure:
pub struct SocketAncillaryBoxedBuf {
// INVARIANT: Address is aligned to `align_of::<cmsghdr>()`.
inner: Box<[u8]>,
}
impl SocketAncillaryBoxedBuf {
pub fn with_capacity(size: usize) -> Self { /* aligned alloc here */ }
}
And replace SocketAncillary::new
with:
impl<'a> SocketAncillary<'a> {
pub fn from_boxed_buf(buffer: &mut SocketAncillaryBoxedBuf) -> Self { ... }
}
let mut example = SocketAncillaryBoxedBuf::with_capacity(12); // owns the memory
let mut ancillary = SocketAncillary::from_boxed_buf(&mut example); // borrows the memory
Then it would also be easy to add an interface for stack-allocated buffers:
pub union SocketAncillaryArrayBuf<const N: usize> {
_align: cmsghdr,
inner: [MaybeUninit<u8>; N],
}
impl<'a> SocketAncillary<'a> {
pub fn from_array_buf<const N: usize>(buffer: &mut SocketAncillaryArrayBuf<N>) -> Self { ... }
}
let mut example = SocketAncillaryArrayBuf<8>::new();
let mut ancillary = SocketAncillary::from_array_buf(&mut example);
This is getting quite complex for a simple feature. :sweat_smile: Perhaps there should be some more generic aligned buffer facility (a la C++’s std::aligned_storage), but that would probably be part of a whole other RFC.
Come to think of it, can’t we just take a [MaybeUninit<cmsghdr>]
slice instead of a [u8]
slice? pub fn new(buffer: &'a mut [MaybeUninit<cmsghdr>]) -> Self
. It will always be aligned properly, and the use of MaybeUninit
ensures that the user can’t just treat it as proper cmsghdr data (as it may be the payload data, or just not yet initialized), and it is easy to allocate both on the stack and on the heap.
And how can the user choose the length of the buffer (as bytes)?
If you want to know the alignment for
cmsghdr
, can’t you just usestd::mem::align_of::<cmsghdr>()
? That should always give the correct answer. This is also what the union trick in the cmsg(3) man page implicitly relies on (i.e. it relies on rule that the alignment of a union type is the maximum alignment of its field types).
I am pretty sure the trick suggested in the Linux man page is actually wrong – at least morally speaking. The alignment of cmsghdr
is not necessarily sufficient. For example, on OpenBSD (and FreeBSD, which I just verified does the same as OpenBSD here), cmsghdr
is defined as
struct cmsghdr {
socklen_t cmsg_len;
int cmsg_level;
int cmsg_type;
};
socklen_t
is defined as uint32_t
, so struct cmsghdr
always has a size of 12 and alignment of 4.
However, the CMSG_SPACE
, CMSG_LEN
, and CMSG_ALIGN
macros all use _ALIGN
:
#define CMSG_SPACE(l) (_ALIGN(sizeof(struct cmsghdr)) + _ALIGN(l))
#define CMSG_LEN(l) (_ALIGN(sizeof(struct cmsghdr)) + (l))
#define CMSG_ALIGN(n) _ALIGN(n)
and at least on x86-64, _ALIGN
rounds to 8 bytes:
typedef __int64_t __register_t;
#define _ALIGNBYTES (sizeof(__register_t) - 1)
#define _ALIGN(p) (((__uintptr_t)(p) + _ALIGNBYTES) & ~_ALIGNBYTES)
Now, technically, the value that the sizes are aligned to doesn't have to be the same as the value the pointers are aligned to. In practice, I believe (though haven't verified) that neither the BSDs nor Linux care about pointer alignment for cmsgs at all. On all of those systems, recvmsg
, sendmsg
, etc. are implemented as system calls. And on all of those systems, with cmsgs, like in most situations where userland passes pointers to the kernel, the kernel doesn't access the userland memory directly, but allocates its own buffer and effectively memcpys the data from/to userland as the first/last step. And the kernel's allocated buffer has whatever alignment the kernel allocates it with, unrelated to the userland buffer's alignment.
But in theory those functions could be implemented in such a way that alignment matters, e.g. purely in userland, or in the kernel but with some mechanism where the kernel directly maps userland pages. And while this is not explicitly stated, I would assume the pointer should be aligned to the same value the sizes are, because the whole point of aligning the sizes is to ensure that pointers stay aligned.
Unfortunately, there's no spec language to check this assumption against. The POSIX man page for sys/socket.h describes cmsghdr
and some of the CMSG_* macros (namely, CMSG_DATA
, CMSG_NXTHDR
, and CMSG_FIRSTHDR
), but it doesn't even mention CMSG_SPACE
and CMSG_LEN
, and it does not mention alignment at all. Nor are those mentioned in the POSIX man pages for recvmsg
and sendmsg
.
Even if the function recvmsg
and sendmsg
is implemented in such a way that alignment does not matter, should be the cmsghdr
be aligned, so that the user can access it? For example, to read the FDs.
To be honest, I do not see any big issue if the user allocates BUFFER_SIZE + 7
bytes to ensure that there are enough bytes.
Another concern: Nothing seems to allow access to recvmsg(2)
argument flags
, or the similar return side msghdr.msg_flags
.
Of these, specially interesting are
flags
MSG_CMSG_CLOEXEC
prevents races with fork(2)
when passing FDsmsghdr.msg_flags
MSG_TRUNC
is very useful for correct error handling; otherwise you have to resort to kludges like using a buffer of size max_message_size+1 to tell the difference.flags
MSG_TRUNC
makes debugging truncated messages easier, as you can see what size was sentAh, I was reading https://doc.rust-lang.org/stable/std/os/unix/net/struct.UnixStream.html#method.recv_vectored_with_ancillary and https://doc.rust-lang.org/nightly/std/os/unix/net/struct.UnixStream.html#method.recv_vectored_with_ancillary which are just -> Result<usize>
. The _from
variant is only on UnixDatagram. What I'm actually using is SOCK_SEQPACKET, because I do want a connection.
Is the SOCK_SEQPACKET
supported by Rust's std, currently?
No, there was #50348 but it was recommended to implement it in an external crate first.
That said if https://github.com/rust-lang/rust/issues/76915#issuecomment-820682675 is addressed then I think one could make one of the unix sockets dual-purpose (only the method of construction would differ) since the EOR flag could be accessed through the extended API.
The semantics of SOCK_SEQPACKET depend on the domain. For UNIX domain sockets, they connect like SOCK_STREAM and send/receive like a (pseudo-connected) SOCK_DATAGRAM. If you FromRawFd an existing SOCK_SEQPACKET fd into a UnixDatagram, it works just fine.
Hi. I have some remarks on the API.
std::os::unix::net::SocketAncillary
read from and write to a buffer passed to its method new
, and this buffer is external to SocketAncillary
. It would be nice to have a method of calculating the size of this buffer in the API. As I understand, the len
method is not useful here. Since the buffer isn't extensible, the caller needs to calculate its size before writing any data into it. In C, the buffer size is calculated as the sum of CMSG_SPACE(size(i))
for all control messages i
. size(i)
is the size of the payload of the control message i
. size(i) == N*sizeof(int)
if cmsg_type
of the control message equals SCM_RIGHTS
and the control message contains N
file descriptors.add_fds
method adds exactly one control message? It would be nice to clarify that in the API documentation. The same question for the add_creds
method.Should the buffer passed as the argument to the new
method be aligned? If yes, how can a caller allocate such a buffer?
It seems that the add_to_ancillary_data
function writes a control message over an existing one in the buffer pointed by the msg_control
field of struct msghdr
. This is not right. This buffer can contain several control messages.
I believe that using the CMSG_NXTHDR
macro when writing ancillary data is buggy on Linux. See my question on StackOverflow.
Here is my adaptation of a diagram by Stevens et al. (396) which shows the layout of the buffer pointer by the msg_control
field of struct msghdr
.
Stevens, Richard W., et al. UNIX Network Programming: The Sockets Networking API. 3rd ed., vol. 1, Addison-Wesley, 2004.
* As far as I understand, the methods of `std::os::unix::net::SocketAncillary` read from and write to a buffer passed to its method `new`, and this buffer is external to `SocketAncillary`. It would be nice to have a method of calculating the size of this buffer in the API. As I understand, the `len` method is not useful here. Since the buffer isn't extensible, the caller needs to calculate its size before writing any data into it. In C, the buffer size is calculated as the sum of `CMSG_SPACE(size(i))` for all control messages `i`. `size(i)` is the size of the payload of the control message `i`. `size(i) == N*sizeof(int)` if `cmsg_type` of the control message equals `SCM_RIGHTS` and the control message contains `N` file descriptors.
We can add such a function. Do you have any thought how the function should be named?
* Am I correct that the `add_fds` method adds exactly one control message? It would be nice to clarify that in the API documentation. The same question for the `add_creds` method.
Yes, you are correct that
add_fds
adds a control message. Can you make a suggestion, what should be added to the documentation?
Tracking issue for
feature(unix_socket_ancillary_data)
to extendUnixStream
andUnixDatagram
to send and receive ancillary data. For example file descriptors.Unresolved Questions
SocketAncillary
struct before stabilization to ensure it is compatible with all OSes.fuchsia
,haiku
,illumos
,ios
,macos
andsolaris
does not haveMSG_CMSG_CLOEXEC
constant inlibc
fuchsia
anduclibc
(x86_64
) does not havecmsghdr
struct inlibc
libc
does not haveucred
struct for the target OSemscripten
, butemscripten
has this struct in the standard C libraryImplementation history
PR #69864, #82589, #83374