Thomasdezeeuw opened 4 years ago
cc @sfackler, @withoutboats
The main API question I have (which is more about `IoSlice::advance` than this, I guess) is whether we're okay saying that the contents of the `bufs` slice are unspecified after this call returns. It's probably fine, but it's just one "weird" effect that you may not naively expect.
@sfackler I agree, but I don't know how an alternative that takes `&[IoSlice<'_>]`, like you would expect, would work.
It could for example perform vectored writes when the current cursor position is at a slice boundary and switch to a normal `write_all` if it's in the middle of one of the slices.
But then you're missing out on the efficiency of vectored writes; if that were the case I wouldn't consider using it.
You are only missing out on the efficiency of vectored writes some of the time.
True, but with the implementation I provided (which I still admit isn't great) you don't miss out. I've made the point already in https://github.com/rust-lang/futures-rs/pull/1741#discussion_r398715869, but personally I don't think it's really a big deal that the provided slice is modified. In most cases I imagine the `bufs` slice being built just before this call and only used in this call.
One possibility would be to keep a fixed-size buffer of, say, `[IoSlice<'_>; 128]` and copy sections of the caller's slice into that, pairing it with an index for the last-most unwritten `IoSlice`. It's more complicated in that you have to re-load that `IoSlice` at some clever point in order to both minimize shifting but also maximize the number of `IoSlice`s being offered to the underlying writer, but it prevents mutation of the original caller's slice while also allowing some fixed amount of vectored writes.
I'm not necessarily advocating for this idea, as I think it might be fine to mutate the caller's slice (they can always make a clone if they don't want to deal with that), but it's an option.
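For concreteness, a hedged sketch of that fixed-buffer idea (all names invented, and a window of 4 entries instead of 128, purely for illustration): copy a window of the caller's slices into local storage, advance only the local copies, and refill the window as entries drain, so the caller's slice is never mutated.

```rust
use std::io::{self, IoSlice, Write};

// Hedged sketch, not a std API: `write_all_vectored_buffered` is an
// invented name. IoSlice is Copy, so copying entries into a local
// window is just a pointer+length copy per slice.
fn write_all_vectored_buffered<W: Write>(w: &mut W, bufs: &[IoSlice<'_>]) -> io::Result<()> {
    const WINDOW: usize = 4;
    let mut next = 0; // index of the first caller slice not yet copied in
    let mut local: Vec<IoSlice<'_>> = Vec::with_capacity(WINDOW);
    loop {
        // Refill the window with copies of the caller's non-empty slices.
        while local.len() < WINDOW && next < bufs.len() {
            if !bufs[next].is_empty() {
                local.push(bufs[next]);
            }
            next += 1;
        }
        if local.is_empty() {
            return Ok(()); // everything written
        }
        let mut n = match w.write_vectored(&local) {
            Ok(0) => return Err(io::ErrorKind::WriteZero.into()),
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Drop fully written copies; advance into the first partial one.
        // Only the local copies are touched, never `bufs` itself.
        local.retain_mut(|slice| {
            if n >= slice.len() {
                n -= slice.len();
                false
            } else {
                if n > 0 {
                    slice.advance(n);
                    n = 0;
                }
                true
            }
        });
    }
}
```

Using a `Vec` with a capped window here stands in for the stack array; a real implementation could use `[MaybeUninit<IoSlice>; 128]` to stay allocation-free.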
@LukasKalbertodt I've changed the issue to be the tracking issue, could you apply the correct labels?
I'm very happy to see this method added, thank you!
I regularly find myself wanting this method, when writing various different pieces of data into a buffer and not wanting to copy them around excessively.
Regarding the API:
Normally, you can't split a `writev` call without creating a potential semantic difference. However, if `writev` returns early and hasn't written all the data, then the write already won't be atomic, so there's no requirement to keep the remaining iovecs together.
With that in mind, why don't we implement `write_all_vectored` to take a non-mutable `&[IoSlice]`, and then if `write_vectored` does a partial write, we use `write_all` to write the rest of the current partial `IoSlice` if any, then resume calling `write_vectored` with the remaining subslice of the caller's `&[IoSlice]` values? That doesn't require any allocation or copying at all.
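As a hedged sketch (the function name is invented; this is not a std API), the approach described above might look like:

```rust
use std::io::{self, IoSlice, Write};

// Invented helper for illustration: takes an immutable slice of IoSlices.
// On a partial write, finish the partially written slice with write_all,
// then resume vectored writes on the untouched tail of the caller's slice.
fn write_all_vectored_immutable<W: Write>(w: &mut W, mut bufs: &[IoSlice<'_>]) -> io::Result<()> {
    while !bufs.is_empty() {
        let mut n = match w.write_vectored(bufs) {
            Ok(0) => return Err(io::ErrorKind::WriteZero.into()),
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        while n > 0 {
            let (head, tail) = bufs.split_first().expect("wrote more than available");
            if n >= head.len() {
                // This slice was written completely; pop it off the front.
                n -= head.len();
                bufs = tail;
            } else {
                // Partial write into `head`: finish it with write_all,
                // leaving the remaining slices unmodified.
                w.write_all(&head[n..])?;
                bufs = tail;
                n = 0;
            }
        }
    }
    Ok(())
}
```

No allocation, no copying, and the caller's `&[IoSlice]` is never mutated; the trade-off is the extra `write_all` call on a partial write.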
@joshtriplett - yep, that approach was discussed up above. The thought for the current implementation is that the caller isn't going to care about the contents of the buffers slice after the call completes, and mutating it minimizes the number of write calls.
If the calls are returning early, there may not be as much value in minimizing the number of calls.
Also, could we restore the slice afterwards, even if we mutate it as we go? For each call to `write_vectored`, at most one change to the slice needs to occur, so we could easily change it back after that call. That way, we're requiring exclusive access to the slice but giving it back.
> If the calls are returning early, there may not be as much value in minimizing the number of calls.

I don't quite understand this statement; the whole point of using vectored I/O is minimizing the number of system calls.
> Also, could we restore the slice afterwards, even if we mutate it as we go? For each call to `write_vectored`, at most one change to the slice needs to occur, so we could easily change it back after that call. That way, we're requiring exclusive access to the slice but giving it back.

I guess we could, but are there many cases in which you need to use the buffers after having written them all? I've personally never encountered that.
We could also create an `IoSlices` type that wraps `&mut [IoSlice<'_>]` and take ownership of that to make the ownership part of the API clearer.
On Fri, Apr 17, 2020 at 01:12:55AM -0700, Thomas de Zeeuw wrote:
> > If the calls are returning early, there may not be as much value in minimizing the number of calls.
>
> I don't quite understand this statement, the whole point of using vectored I/O is minimizing the number of system calls.
Under normal circumstances, I'd expect the kernel to process as much of a `writev` call as possible. What I'm suggesting is that if `writev` returns early, such that you have to re-call it in order to finish writing, then making an additional `write` call before doing so does not seem likely to cause a problem.

If someone is trying to optimize the number of syscalls there, they would want to start by figuring out why the kernel returned early from `writev` in the first place.
> > Also, could we restore the slice afterwards, even if we mutate it as we go? For each call to `write_vectored`, at most one change to the slice needs to occur, so we could easily change it back after that call. That way, we're requiring exclusive access to the slice but giving it back.
>
> I guess we could but are there many cases in which you need to use the buffers after having written them all? I've personally never encountered that.
If you want to reuse the same buffers, it may also make sense to reuse the `[IoSlice]`.
> We could also create an `IoSlices` type that wraps `&mut [IoSlice<'_>]` and take ownership of that to make the ownership part of the API clearer.
That would improve on the current API. Or we could accept a `Cow`, and only make a copy if we need to mutate it.
> Under normal circumstances, I'd expect the kernel to process as much of a `writev` call as possible. What I'm suggesting is that if `writev` returns early, such that you have to re-call it in order to finish writing, then making an additional `write` call before doing so does not seem likely to cause a problem.
>
> If someone is trying to optimize the number of syscalls there, they would want to start by figuring out why the kernel returned early from `writev` in the first place.
That can have many causes that change from OS to OS and even from OS version to version; I don't think it's worth it to determine the cause for each. But we should still keep the goal of minimising the number of system calls; using `write` followed by more `writev` calls goes against this goal.
> If you want to reuse the same buffers, it may also make sense to reuse the `[IoSlice]`.
The underlying buffers can be reused, so all you'll need to do is `IoSlice::new`, which is effectively a copy of the pointer and length. I don't think that is too much overhead.
> That would improve on the current API. Or we could accept a `Cow`, and only make a copy if we need to mutate it.
Maybe, but what would the `ToOwned` impl for `IoSlice` be? For a slice it's converted into a `Vec`, which allocates, something we don't want for this API.
On Tue, Apr 21, 2020 at 02:22:57AM -0700, Thomas de Zeeuw wrote:
> > If you want to reuse the same buffers, it may also make sense to reuse the `[IoSlice]`.
>
> The underlying buffers can be reused, so all you'll need to do is `IoSlice::new`, which is effectively a copy of the pointer and length. I don't think that is too much overhead.
Consider the case where you have a long `[IoSlice]` because you're stitching together a buffer from many disparate pieces.
> > That would improve on the current API. Or we could accept a `Cow`, and only make a copy if we need to mutate it.
>
> Maybe, but what would the `ToOwned` impl for `IoSlice` be? For a slice it's converted into a `Vec`, which allocates, something we don't want for this API.
Good point. Adding `IoSlices` to represent an owned `[IoSlice]` seems like a good step, then, if we decide to consume the `IoSlice`.
Hey y'all: just noticed an issue with `write_all_vectored` as I was playing around with it. The documentation says:

> If the buffer contains no data, this will never call `write_vectored`.
Looking at the implementation, that's not what happens. In particular, if `bufs` is nonempty but only contains empty `IoSlice`s, `write_all_vectored` not only will call `write_vectored`, but it will return an error even when the call succeeded.
```rust
fn write_all_vectored(&mut self, mut bufs: &mut [IoSlice<'_>]) -> Result<()> {
    while !bufs.is_empty() {
        match self.write_vectored(bufs) {
            Ok(0) => {
                return Err(Error::new(ErrorKind::WriteZero, "failed to write whole buffer"));
            }
            // ...
```
@ezrosent @tmiasko fixed it in #74088.
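For illustration, a minimal guard in the spirit of that fix (the helper name is invented): strip leading empty slices before handing `bufs` to `write_vectored`, so an all-empty slice never produces a spurious `Ok(0)`.

```rust
use std::io::IoSlice;

// Invented helper, a rough sketch of the guard: skip every leading
// empty slice; if all slices are empty, the result is an empty slice.
fn strip_empty<'s, 'a>(bufs: &'s mut [IoSlice<'a>]) -> &'s mut [IoSlice<'a>] {
    let first_non_empty = bufs
        .iter()
        .position(|buf| !buf.is_empty())
        .unwrap_or(bufs.len());
    &mut bufs[first_non_empty..]
}
```

With this, the `while !bufs.is_empty()` loop can treat an empty result as "nothing left to write" instead of erroring with `WriteZero`.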
@Thomasdezeeuw is it possible to also implement a similar feature for reading? Like `read_exact_vectored()`, which would try to fill `&mut [IoSliceMut]` until it encounters an EOF. And if there is not enough data, it would return `UnexpectedEof`.
@q2p I guess so, but let's not track that in this issue. You can create a PR for it.
It seems like the biggest blocker for this is the open question about the API. The current API requires the caller to pass a `&mut`, so that `write_all_vectored` can modify it in place if it needs to make an additional writev call. This seems to have been proposed for performance reasons, to minimize the work required to make subsequent calls.
I would like to propose changing this, for a couple of reasons:

* In practice, based on many codebases I've seen, I expect the common case to have 2-3 iovecs, not hundreds or thousands. People primarily seem to use this for cases like writing from a ringbuffer and handling wraparound (using two iovecs), or writing a header and body.
* In the slow path of having to make an additional syscall due to a partial initial write, the cost of a syscall will vastly eclipse the cost of copying a few iovecs.

Meanwhile, requiring a writable slice of iovecs forces the caller to reconstruct their iovecs for every call, rather than reusing them for multiple calls (possibly with some modifications, but not rewriting them from scratch). This pessimizes the fast path in favor of optimizing the slower path.
I would propose changing this to an immutable reference, instead. If the initial write comes back partial, `write_all_vectored` can either copy the iovecs into a small stack buffer, or (in the uncommon case of a huge slice of iovecs) it can do a separate `write` syscall for the initial buffer.

This would both optimize the common case, and make all cases easier to invoke.
We discussed this in the @rust-lang/libs-api meeting today. Some others felt that this was a reasonable approach to simplify the API; others didn't feel strongly either way; nobody felt that we should keep the current `&mut` API.
> In practice, based on many codebases I've seen, I expect the common case to have 2-3 iovecs, not hundreds or thousands. People primarily seem to use this for cases like writing from a ringbuffer and handling wraparound (using two iovecs), or writing a header and body.
I am not saying that it is a common pattern, but in an ecological simulation I am using a single call to `write_all_vectored` with up to 10^5 `IoSlice` items to write out the simulation state as a single copy into the page buffer, i.e. without assembling a separate serialized representation first. (Since the indirections I need to follow for this might have been re-allocated when the next state dump comes, I can not reuse the content of the `Vec<IoSlice>`.) (This is almost surely UB writing padding to disk and such, so probably another argument for not supporting such patterns.)
> I would propose changing this to an immutable reference, instead. If the initial write comes back partial, write_all_vectored can either copy the iovecs into a small stack buffer, or (in the uncommon case of a huge slice of iovecs) it can do a separate write syscall for the initial buffer.
Agreeing that the above situation is not common, I still wonder if with the platform-specific knowledge embedded in libstd, that temporary buffer could be made sufficiently large to "saturate" the system call interface, e.g. contain 1024 `IoSlice` elements on Linux, so that the number of system calls would not increase in the slow path even though the small overhead of copying the `IoSlice`s is taken. But I am not sure at which point the possibly repeated copying would be worse than doing an additional system call.
> It seems like the biggest blocker for this is the open question about the API. The current API requires the caller to pass a `&mut`, so that `write_all_vectored` can modify it in place if it needs to make an additional writev call. This seems to have been proposed for performance reasons, to minimize the work required to make subsequent calls.
Note that I'm the one who added that question, and after working with it for a while it's not that big a deal. It's a little different than what you might expect, e.g. compared to the `write_vectored` call, but it's very usable.
> I would like to propose changing this, for a couple of reasons:
>
> * In practice, based on many codebases I've seen, I expect the common case to have 2-3 iovecs, not hundreds or thousands. People primarily seem to use this for cases like writing from a ringbuffer and handling wraparound (using two iovecs), or writing a header and body.
I don't think that is true. If I remember correctly the limit is 1024, and I've been near it before, so I don't think we should optimise for the 2-3 iovecs use case too much. Of course, all this is just anecdotal.
> * In the slow path of having to make an additional syscall due to a partial initial write, the cost of a syscall will vastly eclipse the cost of copying a few iovecs.
>
> Meanwhile, requiring a writable slice of iovecs forces the caller to reconstruct their iovecs for every call, rather than reusing them for multiple calls (possibly with some modifications, but not rewriting them from scratch). This pessimizes the fast path in favor of optimizing the slower path.
>
> I would propose changing this to an immutable reference, instead. If the initial write comes back partial, `write_all_vectored` can either copy the iovecs into a small stack buffer, or (in the uncommon case of a huge slice of iovecs) it can do a separate `write` syscall for the initial buffer.
I have to say I disagree with this approach. Do you want to add a stack of 1024 (`IOV_MAX`) `iovec`s to the function for copying the buffers?
I know the current API is a little odd at first, but when you assume that the buffer (slices) you passed in are not usable after the call, it's an OK API to use. Furthermore, as it's currently implemented it's zero overhead, while your suggestion of copying is not. If you're doing vectored I/O in the first place it's likely you care about this.
> This would both optimize the common case, and make all cases easier to invoke.
> We discussed this in the @rust-lang/libs-api meeting today. Some others felt that this was a reasonable approach to simplify the API; others didn't feel strongly either way; nobody felt that we should keep the current `&mut` API.
I didn't attend this meeting, nor am I part of the libs team, but I do feel strongly against this proposed change. May I ask how many of the people actually use the API? Because that might influence their vote. (I hope I don't come off too strong with the last sentence, I'm just curious if the people deciding on this are actually using the API.)
@Thomasdezeeuw I'm not suggesting that we optimize for the 2-3 case; I'm suggesting that we don't need to optimize heavily for an already-less-optimized subcase (partial writes) of the 1024 case, at the expense of both usability (not having to recreate buffers) and other optimization (not having to recreate buffers).
> I have to say I disagree with this approach. Do you want to add a stack of 1024 (`IOV_MAX`) `iovec`s to the function for copying the buffers?
No, I personally would add a local array of 8, and if larger than that, call `write` on any partial iovec before calling `writev` on the remainder.
Also, we have helpers that'd make it easy to write a mutate-in-place `write_all_vectored` using `write_vectored`.
> as it's currently implemented it's zero overhead, while your suggestion of copying is not.
It's not zero-overhead if you count the cost of recreating a new set of iovecs for each call.
> If you're doing vectored I/O in the first place it's likely you care about this.
In the optimal case, your `writev` call completes in a single syscall; in any other case, syscall overhead will by far exceed any other overhead. If you're doing vectored I/O for efficiency, you should try to ensure your operation completes in one `writev` call.
> @Thomasdezeeuw I'm not suggesting that we optimize for the 2-3 case; I'm suggesting that we don't need to optimize heavily for an already-less-optimized subcase (partial writes) of the 1024 case, at the expense of both usability (not having to recreate buffers) and other optimization (not having to recreate buffers).
I agree that the usability improvement is nice, but I've found I always have to recreate my buffers anyway (see below). If it's such a big problem maybe this needs a new type that wraps `&mut [IoSlice]` to show that ownership is passed to the function?
> > I have to say I disagree with this approach. Do you want to add a stack of 1024 (`IOV_MAX`) `iovec`s to the function for copying the buffers?
>
> No, I personally would add a local array of 8, and if larger than that, call `write` on any partial iovec before calling `writev` on the remainder.
That would make this function unusable for my case, which would be a real shame.
> Also, we have helpers that'd make it easy to write a mutate-in-place `write_all_vectored` using `write_vectored`.
>
> > as it's currently implemented it's zero overhead, while your suggestion of copying is not.
>
> It's not zero-overhead if you count the cost of recreating a new set of iovecs for each call.
But you'll almost always have to recreate the `iovec`s before the write call. Take your example of a ring buffer; how do you write into it while maintaining your `iovec`s? I don't think it's possible, as you need both a mutable (for writing) and an immutable (for the writev call) reference into the data, which Rust of course doesn't allow. So I don't really think this argument holds water.

Could you show an example where it is possible to keep your `iovec`s around while modifying the buffer?
> > If you're doing vectored I/O in the first place it's likely you care about this.
>
> In the optimal case, your `writev` call completes in a single syscall; in any other case, syscall overhead will by far exceed any other overhead. If you're doing vectored I/O for efficiency, you should try to ensure your operation completes in one `writev` call.
I agree, but it's not possible for a program to ensure the operation completes in one `writev` call. The number of bytes the OS can write in a single `writev` call differs per OS and even per target, i.e. different filesystem, socket, etc.
@Thomasdezeeuw

> If it's such a big problem maybe this needs a new type that wraps `&mut [IoSlice]` to show that ownership is passed to the function?
The issue isn't just that callers will forget this (`&mut` is a big hint that you can't count on it being unmodified), though that's also a problem. The primary issue seems like having to deal with it being overwritten by the function in the first place.
> Could you show an example where it is possible to keep your iovecs around while modifying the buffer?
```rust
let mut iovecs = [
    IoSlice::new(header),
    IoSlice::new(x),
];
w.write_all_vectored(&iovecs)?;
iovecs[1] = IoSlice::new(y);
w.write_all_vectored(&iovecs)?;
```
Also, there are several functions we could add to make this better. For instance, we could add a function from `&mut [IoSliceMut]` to `&[IoSlice]`, so that people can mutate a buffer through the `IoSliceMut`.
FWIW, I could also work with a `write_all_vectored` that requires a `&mut` but guarantees to change the slices back when done. "This uses your IoSlices as scratch space, but will always restore them to an unmodified state afterwards." But we can't make that guarantee about user impls of `Write`, only about our standard implementation; we could only do that if we made this a method that you can't implement yourself (e.g. a sealed `WriteExt` with a blanket impl). I don't think that's worth it.
> @Thomasdezeeuw
>
> > If it's such a big problem maybe this needs a new type that wraps `&mut [IoSlice]` to show that ownership is passed to the function?
>
> The issue isn't just that callers will forget this (`&mut` is a big hint that you can't count on it being unmodified), though that's also a problem. The primary issue seems like having to deal with it being overwritten by the function in the first place.
To be fair, that's why the documentation says:

> Once this function returns, the contents of `bufs` are unspecified, as this depends on how many calls to `write_vectored` were necessary.
If you program with that in mind, essentially passing ownership of `bufs` to the function (this is why I think a type might help), I don't think it's a problem. But I think I'm just rehashing points I've made before.
> > Could you show an example where it is possible to keep your iovecs around while modifying the buffer?
>
> ```rust
> let mut iovecs = [
>     IoSlice::new(header),
>     IoSlice::new(x),
> ];
> w.write_all_vectored(&iovecs)?;
> iovecs[1] = IoSlice::new(y);
> w.write_all_vectored(&iovecs)?;
> ```
What I meant was: do you have an example where you actually modify the underlying buffers? If you have the headers prepared (especially if it's just a single buffer), I don't think the overhead of recreating it is too much.
> Also, there are several functions we could add to make this better. For instance, we could add a function from `&mut [IoSliceMut]` to `&[IoSlice]`, so that people can mutate a buffer through the `IoSliceMut`.
>
> FWIW, I could also work with a `write_all_vectored` that requires a `&mut` but guarantees to change the slices back when done. "This uses your IoSlices as scratch space, but will always restore them to an unmodified state afterwards." But we can't make that guarantee about user impls of `Write`, only about our standard implementation; we could only do that if we made this a method that you can't implement yourself (e.g. a sealed `WriteExt` with a blanket impl). I don't think that's worth it.
That could work, especially if the restoring can be optimised away when the compiler can see the buffers aren't used any more. 👍
> People primarily seem to use this for cases like writing from a ringbuffer and handling wraparound (using two iovecs), or writing a header and body.
For what it's worth, I was planning to use it for the case of assembling a text document from potentially thousands of pieces. For my use case, making a big `Vec<IoSlice<'_>>` that I only use for a single call and then drop is exactly what I want.

Accepting a `Vec` directly, so that it is dropped afterwards, would fit my needs and would eliminate the "wait heck, some of my IoSlices got altered?!" problem. It wouldn't be optimal for other cases, though, including the "adding a header" case.
You know, now that I think about it, I'm totally free to just implement that function myself in terms of `write_vectored`...
How about an owned type that wraps `T: AsRef<[IoSlice<'_>]>` (or `Deref`) to indicate that the write call "owns" the buffers and can modify them, but allows the buffers wrapper type to be reused (i.e. the `Vec` in `Vec<IoSlice<'_>>`). It makes the API a bit more conflicted, but perhaps that will resolve @joshtriplett's concerns.
@Thomasdezeeuw wrote:
> How about an owned type that wraps `T: AsRef<[IoSlice<'_>]>` (or `Deref`) to indicate that the write call "owns" the buffers and can modify them, but allow the buffers wrapper type to be reused (i.e. the `Vec` in `Vec<IoSlice<'_>>`). It makes the API a bit more conflicted, but perhaps that will resolve @joshtriplett concerns.
That doesn't address this concern I wrote above:
> The issue isn't just that callers will forget this (`&mut` is a big hint that you can't count on it being unmodified), though that's also a problem. The primary issue seems like having to deal with it being overwritten by the function in the first place.
What if we used a `Cow`'d slice? If you never want your slice copied even in the partial-write case, pass in a `Cow::Owned` and give up your own ownership; if you don't want to give up ownership of your slice and allow mutating it, pass in a `Cow::Borrowed`.

Also, we could avoid an extra syscall even in the borrowed case, if we're willing to allocate in that case, and `Cow` will take care of that for us.
> What if we used a `Cow`'d slice? If you never want your slice copied even in the partial write case, pass in a `Cow::Owned`, and give up your own ownership; if you don't want to give up ownership of your slice and allow mutating it, pass in a `Cow::Borrowed`.
>
> Also, we could avoid an extra syscall even in the borrowed case, if we're willing to allocate in that case, and `Cow` will take care of that for us.
@joshtriplett I'm sorry, but I don't really have the energy for this issue any more... Your comment completely misses the point of vectored I/O. If people are looking at using vectored I/O it's likely they're looking at it for performance reasons; suggesting to just allocate a new buffer so it can be used once in a system call is really quite far off this target.

It's been more than two years; the API works and has zero overhead as-is, but yes, it has a gotcha. At this point either stabilise it or remove it. I'm no longer interested, this has just taken too long.
So, I do care very much about vectored I/O as well, and I care about the use case of being able to reuse the iovecs repeatedly, which is also something people do in C for reasons of both performance and simplicity. I'd like to be able to do that somehow in Rust, and not have to throw away and recreate the array of iovecs every time. I'm trying to brainstorm some data structure that will allow for that reuse. I was trying to favor using existing constructs and types (at the expense of incurring overhead in what I saw as the less common case of partial write), but I acknowledge that that's not ideal. I agree that Cow wasn't a great idea.
As an alternative closer to what you already proposed, perhaps it makes more sense to make a new `IoSlices` structure that owns the iovecs as you suggested, but whose API guarantees that when the write method is done those iovecs are un-mutated? I would be happy to do that design work.
Rough sketch of a solution that shouldn't need to make any copies at all:

```rust
impl IoSlices {
    fn borrow_for_write(
        &self,
        offset: usize,
        f: impl FnOnce(&[IoSlice<'_>]) -> io::Result<usize>,
    ) -> io::Result<usize> {
        // ...
    }
}
```
If you have an `&IoSlices` (which `write_all_vectored` would get) you can call `borrow_for_write`, which will mutate the slices using interior mutability, give you those slices so you can `writev` them or equivalent, and then modify them back (which hopefully we can optimize away if the `IoSlices` is going away). If you have a `&mut IoSlices` or an `IoSlices`, we can provide ways to get a `&mut [IoSlice]`, but `write_all_vectored` would take an `&IoSlices` so it can't arbitrarily mutate the slices. (We can provide analogous types corresponding to `IoSliceMut`, and additionally a method that transmutes to give an `&IoSlices` so that you can use the same buffer for both read and write, and also modify through it when not currently reading or writing.)

That should avoid any copying, avoid any more syscalls than absolutely necessary, and overall make this exactly as efficient and capable as `writev`/`readv` can allow.
Does something like that sound reasonable?
I wrote up a quick implementation of something similar to the proposed design. It's a pretty straightforward elaboration of `advance_slices` that takes care to preserve the original version of the current `IoSlice` and restores it each time the `&[IoSlice]` is advanced one step. Because only one buffer is ever in a mutated state at a time, it's easy to keep a copy of the original version and restore it when we're done with it. It also restores on drop to help keep `io::Error` returns clean.
```rust
use std::io;
use std::mem;

struct BufferManager<'a, 'b> {
    buffers: &'a mut [io::IoSlice<'b>],
    saved: Option<io::IoSlice<'b>>,
}

impl<'a, 'b> BufferManager<'a, 'b> {
    pub fn new(buffers: &'a mut [io::IoSlice<'b>]) -> Self {
        // strip empty buffers
        let first_non_empty = buffers
            .iter()
            .position(|buf| !buf.is_empty())
            .unwrap_or(buffers.len());
        let buffers = &mut buffers[first_non_empty..];
        Self { buffers, saved: None }
    }

    pub fn get(&self) -> &[io::IoSlice<'b>] {
        self.buffers
    }

    pub fn count(&self) -> usize {
        self.buffers.len()
    }

    pub fn advance(&mut self, mut amount: usize) {
        // Pop each head that is smaller than the remaining write off the
        // front of the buffers, restoring its original state first, if
        // necessary.
        while let Some(head_len) = self.buffers.first().map(|head| head.len()) {
            let Some(remaining) = amount.checked_sub(head_len) else { break };
            amount = remaining;
            if let Some(saved) = self.saved.take() {
                self.buffers[0] = saved;
            }
            self.buffers = &mut mem::take(&mut self.buffers)[1..];
        }
        // The head is larger than the remaining write, so it needs to be
        // modified in place. Be sure to save the current state of the
        // buffer first, if necessary.
        if amount > 0 {
            let head = self.buffers.first_mut().expect("advanced too far");
            self.saved = Some(self.saved.unwrap_or(*head));
            head.advance(amount);
        }
    }
}

impl Drop for BufferManager<'_, '_> {
    fn drop(&mut self) {
        // When the buffer manager is dropped, restore the state of the
        // current head of the buffers, if necessary. It shouldn't be
        // possible for there to be a saved value while the buffer list
        // is empty.
        if let Some(head) = self.buffers.first_mut() {
            if let Some(saved) = self.saved {
                *head = saved;
            }
        }
    }
}
```
Then, `write_all_vectored` is basically the same as it is now, but using `BufferManager` to handle slice advancing:
```rust
fn write_all_vectored(
    &mut self,
    buffers: &mut [io::IoSlice<'_>],
) -> io::Result<()> {
    let mut buffers = BufferManager::new(buffers);
    while buffers.count() > 0 {
        match self.write_vectored(buffers.get()) {
            Ok(0) => return Err(io::ErrorKind::WriteZero.into()),
            Ok(n) => buffers.advance(n),
            Err(err) if err.kind() == io::ErrorKind::Interrupted => {}
            Err(err) => return Err(err),
        }
    }
    Ok(())
}
```
The only downside here is that it does require the user to provide an `&mut [IoSlice]`, but we can still promise at the API level (though not the type level) that the slices will be unmodified when this function returns.
Can an extra method, or even an option or something, be added to allow retrying the operation with data that wasn't successfully written? I'd like to tolerate some inner write errors like `EAGAIN`/`ErrorKind::WouldBlock` without having to reimplement `write_all_vectored` manually.
> Can an extra method, or even option or something, be added to allow retrying the operation with data that wasn't successfully written? I'd like to tolerate some inner write errors like `EAGAIN`/`ErrorKind::WouldBlock` without having to reimplement `write_all_vectored` manually.
We don't do this for the `write_all` function, so I don't think `write_all_vectored` should do anything different.
Would you consider an alternate signature that returns the total number of bytes written?

```rust
fn write_all_vectored(&mut self, bufs: &mut [IoSlice<'_>]) -> Result<usize>
```
I am aware that it is not consistent with the signature of `write_all`, but callers can implement the behavior trivially there:

```rust
dest.write_all(&buf)?;
Ok(buf.len())
```
Unless I am missing something crucial, callers of `write_all_vectored` cannot ergonomically or efficiently do the same. To implement the behavior they would need to iterate over `bufs` and call `.len()` on each `IoSlice` to sum up the total length. That's at least one extraneous pointer dereference per buffer.
In contrast, `write_all_vectored` can accumulate the total much more efficiently. The necessary information is already being returned by the underlying calls to `writev` and/or `write`. I would expect most of the time this will compile down to one extra `ADD <reg>,<reg>` instruction per underlying write call.
In my experience (mostly embedded Linux) the returned count of bytes written is occasionally useful at runtime and extremely useful in unit tests. Since users cannot implement this efficiently without re-implementing `write_all_vectored`, and the underlying write calls are already returning the information, it seems reasonable to have `write_all_vectored` pass it along instead of discarding it.
Example:

```rust
fn send_entercnd1(password: &str, dest: &mut impl std::io::Write) -> std::io::Result<usize> {
    Ok(dest.write_all_vectored(&mut [
        IoSlice::new(b"AT!ENTERCND=\""),
        IoSlice::new(password.as_bytes()),
        IoSlice::new(b"\"\r"),
    ])?)
}
```
vs
```rust
fn send_entercnd2(password: &str, dest: &mut impl std::io::Write) -> std::io::Result<usize> {
    // I know two of these have constant length and it could just return Ok(15 + password.len()).
    // In general this won't be the case. I didn't see the need to complicate the example.
    let mut bufs = [
        IoSlice::new(b"AT!ENTERCND=\""),
        IoSlice::new(password.as_bytes()),
        IoSlice::new(b"\"\r"),
    ];
    // get length before `write_all_vectored`
    let length = {
        let mut acc = 0;
        for buf in &bufs {
            acc += buf.len();
        }
        acc
    };
    dest.write_all_vectored(&mut bufs)?;
    Ok(length)
}
```
> Would you consider an alternate signature that returns the total number of bytes written?

This can trivially be done via `bufs.iter().map(|s| s.len()).sum()`. If your `bufs` here is the top-level array itself and there's only a few entries, the compiler will likely even be able to avoid the loop.
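For reference, that one-liner as a tiny helper (the function name is invented):

```rust
use std::io::IoSlice;

// Sum the total number of bytes across all slices before writing.
fn total_len(bufs: &[IoSlice<'_>]) -> usize {
    bufs.iter().map(|s| s.len()).sum()
}
```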
> > Can an extra method, or even option or something, be added to allow retrying the operation with data that wasn't successfully written? I'd like to tolerate some inner write errors like `EAGAIN`/`ErrorKind::WouldBlock` without having to reimplement `write_all_vectored` manually.
>
> We don't do this for the `write_all` function, so I don't think `write_all_vectored` should do anything different.
We got bitten in our codebase by a bug caused by us handling `ErrorKind::WouldBlock` by retrying the `write_all_vectored()` call with the same `bufs` input, despite the documentation advising against it, to be fair.
Still, it seems that this scenario could either be 1) more explicitly disallowed through type-checking, e.g. by making `bufs` immutable, or 2) supported by making `bufs` mutable "all the way", by making its type match `IoSlice::advance_slices()`'s `bufs` argument (`&mut &mut [IoSlice]`). We went with 2) in our codebase.
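A hedged sketch of option 2) (the function name is invented; `IoSlice::advance_slices` is the real std API this mirrors): the caller holds a `&mut &mut [IoSlice]`, and on `WouldBlock` can simply call again later with whatever remains.

```rust
use std::io::{self, IoSlice, Write};

// Invented helper, not a std API: like write_all_vectored, but `bufs` is
// advanced "all the way" through, so after an error such as WouldBlock it
// reflects exactly the unwritten remainder and the call can be retried.
fn write_all_vectored_retryable<W: Write>(
    w: &mut W,
    bufs: &mut &mut [IoSlice<'_>],
) -> io::Result<()> {
    while !bufs.is_empty() {
        match w.write_vectored(bufs) {
            Ok(0) => return Err(io::ErrorKind::WriteZero.into()),
            // advance_slices pops fully written slices off the front and
            // advances into the first partially written one.
            Ok(n) => IoSlice::advance_slices(bufs, n),
            Err(e) if e.kind() == io::ErrorKind::Interrupted => {}
            // Any other error (including WouldBlock) propagates; `bufs`
            // already points at the remaining data, so the caller can
            // simply call this function again.
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```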
This is a tracking issue for `io::Write::write_all_vectored`. Feature gate: `#![feature(write_all_vectored)]`.

Steps:

Unresolved questions:

* The API takes the `IoSlice`s as a mutable slice and modifies them. This is a pretty unusual and potentially error-prone API. We could either find a way to not mutate the argument or to enforce (via the type system) that the argument can't be used by the user afterwards. Or we can decide that such an unusual API is fine for this rather low-level method.

Original issue: