Open the8472 opened 1 year ago
Does sendfile support sending part of a file? Kafka uses this syscall to implement its zero-copy record transfer:
// PlaintextTransportLayer.java
fileChannel.transferTo(position, count, socketChannel);
... and I suppose someone writing a similar system in Rust should be able to achieve the same.
Yes, sendfile can send parts. But specifying seek positions doesn't make sense for all possible readers because some things aren't seekable (e.g. sockets and pipes). Specifying a length is possible but we can model that with Take on the writer side.
I suppose for some uses it can make sense to reuse a file descriptor and use sendfile with explicit offsets so multiple streams can be served from the same fd. That'd be incompatible with the Read/Write traits, which assume they update the implicit seek position.
@the8472 I can live with an API like the above, i.e., file.copy(offset, len, socket). The current io::copy tries to send the whole file, so it isn't suitable for me.
reuse a file descriptor and use sendfile with explicit offsets
Yes. I can calculate the offset. But I can't find a function in std that accepts an offset argument and delegates to sendfile if the platform supports it.
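As a point of comparison, the closest thing std offers today for explicit offsets is the Unix-only FileExt::read_at (pread), which leaves the seek position untouched. The sketch below uses it with a userspace buffer, so it is a plain fallback and not a zero-copy path: it does not call sendfile. The helper name copy_at and the 8 KiB buffer size are assumptions made for illustration.

```rust
use std::fs::File;
use std::io::{self, ErrorKind, Write};
use std::os::unix::fs::FileExt; // Unix-only: provides `read_at`

/// Hypothetical helper (not part of std): copies up to `len` bytes starting
/// at `offset` from `file` into `sink` without touching the seek position.
fn copy_at<W: Write>(
    file: &File,
    mut offset: u64,
    mut len: u64,
    sink: &mut W,
) -> io::Result<u64> {
    let mut buf = [0u8; 8192]; // arbitrary buffer size for this sketch
    let mut copied = 0u64;
    while len > 0 {
        let want = len.min(buf.len() as u64) as usize;
        let n = match file.read_at(&mut buf[..want], offset) {
            Ok(0) => break, // reached end of file before `len` bytes
            Ok(n) => n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        sink.write_all(&buf[..n])?;
        offset += n as u64;
        len -= n as u64;
        copied += n as u64;
    }
    Ok(copied)
}
```

Since it takes &File and never seeks, several ranges can be served from the same file descriptor, which is the use case discussed above.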
The nightly API for anonymous pipe has been merged (tracking issue rust-lang/rust#127154)
It seems that we'd want some zero-copy API for pipes, since rust-lang/rust#108283 just rolled back an optimization for copying from a file to a pipe.
Proposal
Problem statement
It would be useful to have a version of io::copy that can use:
Motivation, use-cases
Solution sketches
An API specific to file descriptors
It is less generic than io::copy but makes it explicit that it only operates on file-like types and may need an intermediate buffer to hold data across multiple invocations when doing non-blocking IO.
Unclear: Whether it should return an error when offload isn't possible or silently fall back to io::copy.
Downsides:
cfg()s
Lean on specialization
This is essentially what today's io::copy does, but with altered guarantees:
WouldBlock occurs. Otherwise the bytes will be dropped
source after zero_copy returns may become visible in sink, as is the case when using sendfile or splice
Downsides:
Hybrid of the above
Make the buffer an explicit argument for non-blocking IO but use best-effort specialization for the offloading aspects.
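To make the explicit-buffer idea concrete, here is a rough sketch written against the generic Read/Write traits only. The function name copy_with_buf, its signature, and the use of a Vec<u8> as the caller-owned buffer are all assumptions for illustration and not the proposed API. It performs no offloading. It only shows how bytes that were read but not yet written can survive a WouldBlock and be retried on the next call.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Hypothetical sketch (not a std API): copies from `src` to `dst`.
/// Bytes that were read but could not be written yet stay in `pending`,
/// so the caller can retry after a `WouldBlock` without losing data.
/// On error the count of bytes written during this call is not reported.
fn copy_with_buf<R: Read, W: Write>(
    src: &mut R,
    dst: &mut W,
    pending: &mut Vec<u8>,
) -> io::Result<u64> {
    let mut written = 0u64;
    let mut chunk = [0u8; 4096]; // arbitrary chunk size for this sketch
    loop {
        // Flush leftovers from this or a previous invocation first.
        while !pending.is_empty() {
            match dst.write(&pending[..]) {
                Ok(0) => return Err(ErrorKind::WriteZero.into()),
                Ok(n) => {
                    drop(pending.drain(..n));
                    written += n as u64;
                }
                Err(e) if e.kind() == ErrorKind::Interrupted => {}
                // Includes WouldBlock: `pending` still holds the unwritten bytes.
                Err(e) => return Err(e),
            }
        }
        match src.read(&mut chunk) {
            Ok(0) => return Ok(written), // source exhausted, nothing pending
            Ok(n) => pending.extend_from_slice(&chunk[..n]),
            Err(e) if e.kind() == ErrorKind::Interrupted => {}
            Err(e) => return Err(e),
        }
    }
}
```

A caller doing non-blocking IO would keep the same pending buffer alive across calls and invoke the function again once the sink is writable.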
Encapsulate the copy operation in a struct/builder
Rough sketch:
Under the hood it could still try to use specialization if the platform-specific APIs aren't used.
Any of the above, but N pairs instead of 1
When copying many small files and the like it can be beneficial to run them in batches. It's not a full-fledged async runtime that could add work incrementally as other work items complete but still more efficient than doing one-at-a-time.
Under the hood we could use polling or io_uring where appropriate.
Links and related work
What happens now?
This issue is part of the libs-api team API change proposal process. Once this issue is filed the libs-api team will review open proposals in its weekly meeting. You should receive feedback within a week or two.