Closed cannium closed 7 years ago
Hi cannium! I appreciate that you put a lot of work into this, but I'm not sure it's something I want to merge into the project. I really wish you'd opened an issue first to talk about it before putting the work in =/ At any rate, I have a bunch of questions:
bufio.Reader
that's already being used? Same questions for bufio.Writer
, though admittedly I'm not explicitly using a bufio.Writer
so that's an improvement that could probably be made.RawResponseCommand
method looks really complicated and confusing. Can you explain what it's for?Again I appreciate that amount of work you put in and that you really dove into the internals here, but in the future definitely make an issue beforehand telling me what code you want to write, so we can determine upfront if it's the best way forward.
Thanks!
Actually after thinking about it a little more I understand a bit better what you're trying to go for. I think the BulkStringWriter and the way it's used could definitely be improved upon. In essense what you actually want is for radix to support an argument being an interface like:
type LenReader interface {
io.Reader
Len() int64
}
And then in writeRequest
it could write the length and then copy the entire reader until io.EOF
. I'm not opposed to this.
For the RawResponseCmd, I understand what you're trying to do there a bit better now too. If we were to instead make that into:
func (c *Client) CmdReader(cmd string, args ...interface{}) (io.Reader, int64, error)
Where the returned io.Reader
is actually just an io.LimitedReader
wrapping the connection. The docs will have to make it extremely clear you have to read the returned reader till io.EOF
before using the client for anything else. From there the user can wrap the returned io.Reader
in a bufio.Reader
if they want to buffer it, that'll be up to them, so there's not much need for BulkStringReader.
Again, I really appreciate the contribution, sorry if I was a bit dismissive before. I think this is definitely doable, just needs a few tweaks to take advantage of the standard library.
Also it definitely is going to need some tests :P
Sorry I didn't explain clearly in the commit message. The API proposed is a better fit for larger(maybe a few MB and more) values. Let me compare between current implementation and the proposed API.
For the current client.Cmd
,
buf []byte
) -> radix copies buf
to TCP buffer and OS kernel sends it to the other sidebuf
For the proposed API:
func(io.Writer) error
to radix -> radix handles redis protocol and calls the callback to write directly to TCP bufferfunc(io.Reader) error
to radix -> radix handles redis protocol and calls the callback to read directly from TCP buffer so user could consume at his paceActually I had thought about an API with signature func (c *Client) CmdReader(cmd string, args ...interface{}) (io.Reader, error)
, but I changed it afterwards. As you have been aware, it requires user to consume all the contents. But what if an error occurs in user's code, and how to determine when user would have done their processing for radix to consume the trailing \r\n
?
The write path API is similar, if it had been LenReader
as you proposed, an extra buffer would be included in radix to copy from LenReader
to kernel's TCP buffer. The extra buffer is unavoidable, but if user could control its size, they could tune their application better.
Admittedly I realized the method names are not straightforward and need revisions as soon as I wrote them. But I cannot come up with better ones since I'm fighting with the only two hard problems at the same time(LOL)!
Hopefully this post explains my ideas clearer. I think we could first discuss about the API signatures and then tests and documents.
But what if an error occurs in user's code
My philosophy with radix (and most of my code) is if there's an error just assume the connection is dead and make a new one. Network error's are the 0.1% case in most applications, they're not worth jumping through hoops to specially handle and try to recover from. So with that said, I'd say if there's a network error we treat it like we normally do and mark LastCritical
and close the connection. We could probably make an exception for timeouts here.
how to determine when user would have done their processing for radix to consume the trailing \r\n?
That's a little trickier, but you can wrap an io.LimitedReader
so that when you see an io.EOF
during the Read method you read off two more bytes.
So the reader would look something like this:
type cmdReader struct {
lr io.LimitedReader
c *Client
}
func (cr *cmdReader) Read(b []byte) (int, error) {
i, err := cr.lr.Read(b)
if err == io.EOF {
_, err := cr.c.conn.Read(cr.c.writeScratch[:2])
return i, err
} else if nerr, ok := err.(*net.OpError); ok && nerr.Timeout() {
// don't mark LastCritical
} else if err != nil {
cr.c.Close()
cr.c.LastCritical = err
}
return i, err
}
And like I mentioned, the user can choose to wrap that with a bufio.Reader
if they like, and reuse the bufio.Reader
with the Reset
method if they really don't want to allocate memory.
The write path API is similar, if it had been LenReader as you proposed, an extra buffer would be included in radix to copy from LenReader to kernel's TCP buffer.
There's already a writeBuf
field that can be used for that purpose. You can get the full underlying buffer by doing:
c.writeBuf.Reset()
b := c.writeBuf.Bytes()[:c.writeBuf.Cap()]
And then pass b
into io.CopyBuffer
.
As far as tuning the buffer size, I'd rather provide for tuning that writeBuf
field, since that can help cover more cases than just this one, and let's us skirt having a custom buffered writer type. Maybe func (c *Conn) SetWriteBuffer(buf *bytes.Buffer)
, which could be called right after the connection is created.
I think we could first discuss about the API signatures and then tests and documents.
Agreed. And if you'd like you can skip the documentation, I can come in after and flesh that out. I'm picky with my docs :P
After a second thought, APIs could work without callbacks, as you proposed, maybe with some changes. Write API:
type LenReader struct {
io.Reader
int64
}
I think an interface is not necessary here, user only need to pass a reader and its size.
Read API:
CmdReader(...) (io.ReadCloser, error)
Reader should actually be a ReadCloser, thus we could force user to tell radix the right time to do some clean-ups.
And about writeBuf
field. As a client could be used for both Cmd
and CmdReader
, this field should be adjusted per request, what do you plan for it?
Reader should actually be a ReadCloser, thus we could force user to tell radix the right time to do some clean-ups.
I don't think there's any reason to add another method to this, especially when all the Close method is going to do is finish reading the data off the socket, and in the vast majority of cases users will be reading all of the data off the socket anyway.
And about writeBuf field. As a client could be used for both Cmd and CmdReader, this field should be adjusted per request, what do you plan for it?
What do you mean? Whether or not Cmd or CmdReader is used isn't really relevant to the write buffer, that would only come into play when passing in a LenReader as an argument. In any case, there's no reason a SetWriteBuffer
method couldn't be called in between commands, though I'm not sure why anyone would want to.
For the LenReader
and the buffer issue, I had a better idea just now. We could instead use the io.WriterTo
interface. So something like:
type LenWriterTo struct {
io.WriterTo
Len int64
}
Then that can be used to copy directly to the connection without a buffer inside of radix needing to be used. So we can get rid of the SetWriteBuffer
method as well.
I don't think there's any reason to add another method to this, especially when all the Close method is going to do is finish reading the data off the socket, and in the vast majority of cases users will be reading all of the data off the socket anyway.
It's about stupid-proof when you design an API.
In any case, there's no reason a
SetWriteBuffer
method couldn't be called in between commands
SetWriteBuffer
is enough
type LenWriterTo struct {
io.WriterTo
Len int64
}
Well, good for you.
It's about stupid-proof when you design an API.
I agree, which is why I think we shouldn't add unnecessary methods to it.
Well, good for you.
I'm not sure what you mean. Do you not like the idea? Can you explain why?
Never mind. I'll submit a revision of the patch in a few days.
Hey @cannium it's been almost a year on this, and in the meantime radix.v3 has gotten released with its own support for io.Reader/io.Writer usage. So I'm going to go ahead and close this, but feel free to reopen it if it's something you want to keep working on.
Add new APIs for both read and write paths that passing io.Reader/ io.Writer as parameters to reduce memory copy times and memory allocation because sometimes values in Redis could be large(512MB max)