droyo / styx

Go library for the 9P filesystem protocol
MIT License
64 stars 18 forks source link

Consuming IO messages across decoder buffer boundary corrupts decoder state #2

Closed droyo closed 6 years ago

droyo commented 6 years ago

One of the constraints I made when writing the styxproto package was that it use a fixed-size buffer per connection. Another, perhaps unnecessary constraint was to support very small (~.5KB) and very large (2GB) msize parameters, independent of the internal buffer size. One consequence that came out of these constraints is that the two largest messages, Rread and Twrite, do not support random access. Instead they implement the io.Reader interface.

When one of these messages does not fit completely in the Decoder buffer, io.MultiReader is used to combine the buffered data (using a bytes.Reader) with an *io.LimitedReader that consumes from the underlying connection, like so:

m.r = bytes.NewReader(buffered)
if int64(len(buffered)) < count {
    m.r = io.MultiReader(
        m.r,
        io.LimitReader(r, count-int64(len(buffered))))
}

When I wrote this, I thought I was being clever. Unfortunately, as far as I can tell, it's never worked. This would only manifest when reading or writing large files.

After doing a bit of troubleshooting, I believe the problem is in decoder.go, here:

func (s *Decoder) Next() bool {
    if s.msg != nil {
        s.err = discard(s.br, s.msg.Len())

discard() gets rid of msg.Len() bytes from the bufio.Reader, but the underlying connection has already been advanced by msg.Len() - len(buffered) bytes.

droyo commented 6 years ago

One complication is that this only occurs when the Read method is called on one of these messages. If all messages were fully consumed, this would be as simple as replacing discard(s.br, s.msg.Len()) with discard(s.br, len(s.msg.bytes())). But since we can't guarantee that, we have to preface it with io.Copy(ioutil.Discard, s.msg).