Open fredwangwang opened 1 year ago
See also #16474.
The core issue is that the extended interface is always called if it exists. What one wants is a mechanism like errors.Unwrap or http.ResponseController.Unwrap or fs.ErrNotImplemented to signal that the method doesn't actually implement a direct ReaderFrom/WriterTo and shouldn't be used in a particular case.
Thanks for the context @ianlancetaylor!
just thinking out..
What one wants is a mechanism like errors.Unwrap or http.ResponseController.Unwrap or fs.ErrNotImplemented to signal that the method doesn't actually implement a direct ReaderFrom/WriterTo
@carlmjohnson It does make sense to me. However, agree with the comment https://github.com/golang/go/issues/16474#issuecomment-354825553 here:
... such as io.ErrNotSupported, but it would likely break callers on a massive scale
Considering v2 might not exist ever, to me at least, having something that is non-breaking yet making io.CopyBuffer
less surprising has its merit.
Also interestingly mentioned here https://github.com/golang/go/issues/16474#issuecomment-235310031
Conceptually, it seems that ReaderFrom should have a buffer argument and ignore it if it is not needed or nil
this is actually what this proposal about, but in a non API breaking way.
there are two usecase of io.CopyBuffer
that I am aware from reading the linked issue:
To optimize.
I suspect this is the reason most people use io.CopyBuffer
. By adding io.ReaderFromBuffer
interface and implements it in things like os.File
, io.TcpConn
, etc. The kernel fast path can still happen behind scene, and if it does not apply, it will gracefully fallback to io.CopyBuffer
instead of io.Copy
.
The net result is at least as fast as the unintuitive hack, with the exact same memory footprint
io.CopyBuffer(struct{ io.Writer }{dst}, struct{ io.Reader }{src}, buf)
To shape. I am not positive that anyone actually relies on this. But it is mentioned here https://github.com/golang/go/issues/16474#issuecomment-254361220 in a test case:
... another case where the behavior of io.CopyBuffer was unexpected. The test relied upon copy being done in specific chunk sizes
Now there are two scenarios to this,
io.ReaderFromBuffer
wraps a file and fast path is available (e.g. io.TcpConn). Then shaping is not supported. But I doubt how much value there is to shape the chunk size when it is handling at the kernel level. It is important for kernel drivers for sure, but not so much with go at user space level.io.ReaderFromBuffer
is just some code. Then shaping can still happen -- the struct can just take the passed in buffer and do whatever it needs with it.Bottomline, having io.ReaderFromBuffer
still does NOT guarantee the passed buffer will be used (if faster path available), but it will not result in unexpected additional allocations like io.ReaderFrom
does.
Just starting returning io.ErrNotSupported
in existing methods would break existing clients, yeah. What you'd want to do is to make a new interface, to avoid bikeshedding call it io.ReaderFromV2
and io.WriterToV2
, and then those would be allowed to return io.ErrNotSupported
, which io.Copy
/io.CopyBuffer
would know about and be able to deal with.
ErrNotSupported
is #41198, which was accepted but I've always feel a bit leery about implementing it.
Yes. I wrote on #41198 that the proposal is flawed because “ErrNotSuported” is domain specific. Not supporting http.FlushError is different than not supporting io.ReaderFrom. An overhaul of io.Copy should have io-specific ErrNotSupported variables.
Change https://go.dev/cl/475575 mentions this issue: io: optimize copyBuffer to make use of the user-provided buffer for fallbacks
When calling
io.CopyBuffer(writer, reader, buf)
, it tries to use theio.ReaderFrom
interface onwriter
if it is available. For examplenet.TcpConn
implemets theio.ReaderFrom
and will try to use syscalls likesplice
orsendfile
to avoid additional copy in the user space. This is all well-intended and provides better performance in the case of copying from one net conn to another. However. If thereader
is not a real file, but just some struct implementsio.Reader
, it actually falls back to usingio.Copy
.This is problem-some, as the very intention of using
io.CopyBuffer
is that we get to control the buffer and potentially pool the buffers to avoid GCs. But the generic fallback toio.Copy
makes the buffer passed in useless and makes an additional allocation.This behavior is clearly shown in the following heap profile (left is the buffer passed in, right is the allocation made by genericReadFrom fallback (io.Copy)):
This is a very surprising behavior to the end developers without the understanding of TcpConn readfrom internals. As an obvious optimization of using io.CopyBuffer in the hot-path turns out to be even worse than just call io.Copy.
So would propose to add an interface similar to io.ReaderFrom, but allows to take additional buffer, thus the name
io.ReaderFromBuffer
. This way nice optimization regarding fd copies can still happen, but in case none applies, the genericReadFrom can fallback toio.CopyBuffer
instead ofio.Copy
, to take advantage of the buffer passed in, if there is any.