emersion / go-message

✉️ A streaming Go library for the Internet Message Format and mail messages
MIT License

Documentation: Using io.TeeReader with Read() results in data loss after the first 4KB of message.Body #158

Closed Zeg0 closed 1 year ago

Zeg0 commented 1 year ago

The package-level function Read(r io.Reader, opts *ReadOptions) (*Entity, error) needs documentation warning that it should not be used with io.TeeReader.

example:

var r io.Reader
// ...
var teebuf bytes.Buffer
tee := io.TeeReader(r, &teebuf) // <--- needs to fully write before one can fully read
m, err := message.Read(tee)     // <--- results in data loss in Entity.Body after 4KB

reason: Entity.Body is created with br := bufio.NewReader(lr) (entity.go, line 140), which creates a bufio.Reader. The Reader in bufio.go has a default buffer size of 4096 bytes. The content is normally read iteratively, with the buffer refilled as it is consumed. However, TeeReader expects to fully read before you can fully write and doesn't support this iterative behaviour. Therefore m.Body can be cropped, resulting in data loss.
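To make the buffering concrete, here is a minimal sketch that uses only the standard library (go-message itself is not involved). It assumes a made-up input of one header line followed by filler bytes, and the helper name teedAfterFirstLine is hypothetical. It reads just the first line through bufio.NewReader(io.TeeReader(...)) and reports how many bytes have reached the tee buffer at that point, which is only what the bufio.Reader has pulled from the source so far.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// teedAfterFirstLine (hypothetical helper) builds an input of one header line
// plus size filler bytes, reads only the first line through a bufio.Reader
// wrapped around an io.TeeReader, and returns the total input length and the
// number of bytes that have been written to the tee buffer so far.
func teedAfterFirstLine(size int) (total, teed int) {
	input := "Subject: hi\r\n" + strings.Repeat("x", size)

	var teebuf bytes.Buffer
	tee := io.TeeReader(strings.NewReader(input), &teebuf)

	// bufio.NewReader uses its default buffer size and fills it from tee
	// on the first read, even though we only ask for a single line.
	br := bufio.NewReader(tee)
	if _, err := br.ReadString('\n'); err != nil {
		panic(err)
	}

	// teebuf holds only what bufio has pulled from the source so far.
	return len(input), teebuf.Len()
}

func main() {
	total, teed := teedAfterFirstLine(10000)
	fmt.Printf("input: %d bytes, tee buffer after first line: %d bytes\n", total, teed)
}
```

With an input larger than bufio's default buffer, the tee buffer is shorter than the input at this point; with an input that fits in one buffer fill, the two lengths match.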

workaround: Instead of using io.TeeReader, read the content fully and create new Readers from it yourself, like this:

var r io.Reader
// ...
// Read the whole input up front ("data" avoids shadowing the bytes package).
data, err := io.ReadAll(r)
if err != nil {
    panic(err)
}
// Two independent readers over the same bytes: r1 for parsing, r2 as the copy.
r1, r2 := bytes.NewReader(data), bytes.NewReader(data)
m, err := message.Read(r1)

emersion commented 1 year ago

This is expected behavior. The same happens if you pass an io.TeeReader to bufio.NewReader directly. You need to consume the whole returned reader. You can do this with something like io.Copy(ioutil.Discard, m.Body).
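A minimal sketch of the point above, using only the standard library: a bufio.Reader stands in for m.Body, and the helper name drainedTeeMatches is hypothetical. It uses io.Discard, which assumes Go 1.16 or later (ioutil.Discard is the older spelling of the same value). After the buffered reader is drained, the tee buffer should hold the complete input.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// drainedTeeMatches (hypothetical helper) reads one line through a
// bufio.Reader wrapped around an io.TeeReader, then drains the rest of the
// buffered reader and reports whether the tee buffer equals the full input.
func drainedTeeMatches(size int) bool {
	input := "Subject: hi\r\n" + strings.Repeat("x", size)

	var teebuf bytes.Buffer
	tee := io.TeeReader(strings.NewReader(input), &teebuf)
	br := bufio.NewReader(tee) // stands in for m.Body in this sketch

	// Read only the first line, as a header parser would.
	if _, err := br.ReadString('\n'); err != nil {
		panic(err)
	}

	// Consume everything that is left so the tee sees the whole stream.
	if _, err := io.Copy(io.Discard, br); err != nil {
		panic(err)
	}

	return teebuf.String() == input
}

func main() {
	fmt.Println("tee buffer complete after draining:", drainedTeeMatches(10000))
}
```

The same idea applies to the original report: only once the returned body reader has been read to the end can the tee's destination be expected to contain the whole message.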