golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.12k stars 17.68k forks source link

x/text/transform: Writer returns n < len(p), no error #46892

Open saracen opened 3 years ago

saracen commented 3 years ago

What version of Go are you using (go version)?

$ go version
go version go1.16.2 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

type problemTransform []byte

func (problemTransform) Reset() {}

func (t problemTransform) Transform(dst, src []byte, atEOF bool) (nDst, nSrc int, err error) {
    // if we don't have an EOF, progress only 5 bytes at a time whilst asking for more data
    if !atEOF && len(src) > 5 {
        n := copy(dst, src[:5])
        err = transform.ErrShortSrc
        if n < 5 {
            err = transform.ErrShortDst
        }
        return n, n, err
    }

    // if we're at EOF or there's less than 5 bytes available, copy everything
    n := copy(dst, src)
    if n < len(src) {
        err = transform.ErrShortDst
    }
    return n, n, err
}

func TestNBelowLenP(t *testing.T) {
    w := transform.NewWriter(ioutil.Discard, problemTransform{})

    write := func(p []byte) {
        n, err := w.Write(p)
        if err != nil {
            t.Error(err)
        }

        if n != len(p) {
            t.Errorf("n %d below len(p) %d", n, len(p))
        }
    }

    write(append([]byte("1111111111"), make([]byte, 2048)...))
    write(make([]byte, 2048))

    if err := w.Close(); err != nil {
        t.Error(err)
    }
}

What did you expect to see?

Everything written and n == len(p)

What did you see instead?

t.Error raises n 2043 below len(p) 2048

I suspect this is to with the internal buffer size used, however, is this expected? The transformer is making progress, I would have expected everything to still be copied or an err when n != len(p).

Writing the same amount of data but breaking it up over multiple Write() calls does work.

/cc @mpvl Do you have any ideas?

seankhliao commented 3 years ago

I believe this is working as intended, the docs for Write says:

Call Close to convert the remaining bytes.

and if you do call that and count the bytes it writes out nothing is missing

saracen commented 3 years ago

I believe this is working as intended, the docs for Write says:

Call Close to convert the remaining bytes.

and if you do call that and count the bytes it writes out nothing is missing

@seankhliao

The example test above does call Close() to flush anything remaining. If for example, I have it write out to bytes.Buffer, at the end, 5 bytes are missing.

seankhliao commented 3 years ago

The last bytes are written when you call Close(), which you call after your check. Change your discard writer to something that can count how much is written to it and call Close before doing the check

saracen commented 3 years ago

The last bytes are written when you call Close(), which you call after your check. Change your discard writer to something that can count how much is written to it and call Close before doing the check

@seankhliao Yes, to reiterate, I can do that, and it will come up 5 bytes short of what's expected. This problem is not related to the final call to Close().

ZekeLu commented 3 years ago

@seankhliao I have written a reproducer as described in your comment, and bytes did get dropped. Here is the reproducer: https://play.golang.org/p/Zyi0VOMfXop. I think there is a bug in the transform.Writer. When the number of total bytes (excluding the bytes consumed by the first write, which is 5) is greater than the default buf size of the transform.Writer (which is 4096), the extra bytes are dropped.

Here is the unit test based on @saracen's code:

func TestBytesDropped(t *testing.T) {
    var buf bytes.Buffer
    w := transform.NewWriter(&buf, problemTransform{})

    write := func(p []byte) {
        _, err := w.Write(p)
        if err != nil {
            t.Error(err)
        }
    }

    b := bytes.Repeat([]byte("0123456789"), 206)
    // default buf size used by transform.Writer
    defaultBufSize := 4096
    // it fails when the number of total bytes (excluding the bytes consumed by
    // the first write, which is 5) is greater than the default buf size
    total := defaultBufSize + 5 + 1
    firstWrite := 2058
    write(b[:firstWrite])
    // some bytes are dropped in the second write
    write(b[:total-firstWrite])

    if err := w.Close(); err != nil {
        t.Error(err)
    }

    if buf.Len() != total {
        t.Errorf("want number of bytes: %d, got: %d", total, buf.Len())
    }
}
vipcxj commented 11 months ago

The last bytes are written when you call Close(), which you call after your check. Change your discard writer to something that can count how much is written to it and call Close before doing the check

Docs say Write must return a non-nil error if it returns n < len(p). So it really is a bug.