Open xiaoxiaoHe-E opened 1 year ago
I think we have figured out the cause. We can reproduce it by corrupting packets on the interface with tc:
sudo tc qdisc add dev ens192 root netem corrupt 30%
With this setting, snappy always reports the error "snappy: corrupt input" in our environment.
We get this error from readFull in decode.go. But the real cause is that snappy reads from a closed connection and gets empty data on the next read, so it cannot decode the incomplete data correctly. We think it would be better to report an unexpected EOF error here.
Here is the code we modified during testing:
diff --git a/decode.go b/decode.go
index 23c6e26..208f054 100644
--- a/decode.go
+++ b/decode.go
@@ -13,6 +13,7 @@ import (
 var (
 	// ErrCorrupt reports that the input is invalid.
 	ErrCorrupt = errors.New("snappy: corrupt input")
+	ErrEof = errors.New("snappy: unexpected EOF")
 	// ErrTooLarge reports that the uncompressed length is too large.
 	ErrTooLarge = errors.New("snappy: decoded block is too large")
 	// ErrUnsupported reports that the input isn't supported.
@@ -111,7 +112,7 @@ func (r *Reader) Reset(reader io.Reader) {
 func (r *Reader) readFull(p []byte, allowEOF bool) (ok bool) {
 	if _, r.err = io.ReadFull(r.r, p); r.err != nil {
 		if r.err == io.ErrUnexpectedEOF || (r.err == io.EOF && !allowEOF) {
-			r.err = ErrCorrupt
+			r.err = ErrEof
 		}
 		return false
 	}
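To make the distinction in the diff concrete, here is a minimal standalone sketch of how io.ReadFull reports a stream that ends early. It uses only the standard library, not snappy itself; ErrTruncated, readChunk and describe are hypothetical names chosen for this illustration, and the classification simply mirrors the condition shown in readFull above.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// ErrTruncated is a hypothetical sentinel (not part of snappy) for a
// stream that ended before a full chunk was read.
var ErrTruncated = errors.New("stream ended in the middle of a chunk")

// readChunk reads exactly n bytes from r. It applies the same condition
// as readFull in the diff: a short read, or an EOF where none is allowed,
// is reported as truncation rather than as corrupt data.
func readChunk(r io.Reader, n int, allowEOF bool) ([]byte, error) {
	buf := make([]byte, n)
	_, err := io.ReadFull(r, buf)
	if err == io.ErrUnexpectedEOF || (err == io.EOF && !allowEOF) {
		return nil, ErrTruncated
	}
	if err != nil {
		return nil, err
	}
	return buf, nil
}

// describe reads one n-byte chunk from input and names the outcome.
func describe(input string, n int, allowEOF bool) string {
	_, err := readChunk(strings.NewReader(input), n, allowEOF)
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrTruncated):
		return "truncated"
	case errors.Is(err, io.EOF):
		return "clean EOF"
	default:
		return "other: " + err.Error()
	}
}

func main() {
	fmt.Println(describe("abcd", 4, false)) // full chunk available
	fmt.Println(describe("ab", 4, false))   // stream ends mid-chunk
	fmt.Println(describe("", 4, true))      // stream ends on a chunk boundary
	fmt.Println(describe("", 4, false))     // EOF where a chunk was required
}
```

With a separate sentinel, a caller can use errors.Is to tell "the peer closed the connection early" apart from "the bytes themselves are invalid", which is the distinction the patch is after.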
I got this error because the same net.Conn was being read multiple times concurrently.
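For anyone hitting the same cause: concurrent Read calls on one connection each receive whatever bytes arrive next, so two readers split the stream between them and neither sees a decodable sequence. One possible fix is to hold a lock for the duration of each whole-message read (or to let a single goroutine own the reader). The sketch below is only an illustration under assumptions: it uses a simple 4-byte length-prefixed framing instead of snappy, an in-memory buffer instead of a real net.Conn, and lockedFrameReader, intact and countIntactFrames are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"sync"
)

// lockedFrameReader (hypothetical) serializes whole-frame reads so that
// concurrent callers never interleave inside a single frame.
type lockedFrameReader struct {
	mu sync.Mutex
	r  io.Reader
}

// ReadFrame reads one frame: a 4-byte big-endian length, then the payload.
// The lock is held for both reads, not just for one Read call.
func (f *lockedFrameReader) ReadFrame() ([]byte, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	var hdr [4]byte
	if _, err := io.ReadFull(f.r, hdr[:]); err != nil {
		return nil, err
	}
	body := make([]byte, binary.BigEndian.Uint32(hdr[:]))
	if _, err := io.ReadFull(f.r, body); err != nil {
		return nil, err
	}
	return body, nil
}

// intact reports whether frame has the expected size and a uniform payload.
func intact(frame []byte, size int) bool {
	if len(frame) != size {
		return false
	}
	for _, b := range frame {
		if b != frame[0] {
			return false
		}
	}
	return true
}

// countIntactFrames writes `frames` frames into an in-memory stream, reads
// them back from `workers` goroutines, and returns how many arrived intact.
func countIntactFrames(frames, workers int) int {
	const size = 64
	var stream bytes.Buffer
	for i := 0; i < frames; i++ {
		var hdr [4]byte
		binary.BigEndian.PutUint32(hdr[:], size)
		stream.Write(hdr[:])
		stream.Write(bytes.Repeat([]byte{byte(i)}, size))
	}

	fr := &lockedFrameReader{r: &stream}
	var (
		wg      sync.WaitGroup
		countMu sync.Mutex
		count   int
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				frame, err := fr.ReadFrame()
				if err != nil {
					return // io.EOF once the stream is drained
				}
				if intact(frame, size) {
					countMu.Lock()
					count++
					countMu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println(countIntactFrames(100, 4))
}
```

The same idea should carry over to a compressed stream: whichever goroutine reads from the decompressing reader needs exclusive access until it has consumed a complete message.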
Hi team, we hit the error
snappy: corrupt input
while using snappy to compress data sent over a TCP connection. How we build the connection:
On the source side:
On the destination side:
We get the error
snappy: corrupt input
when calling conn.Read(buf), and it happens intermittently.
Is this caused by a network problem?
We read some of the snappy code and found that this error is reported when the checksum or the decoded length is wrong. But we use snappy over a TCP connection, and TCP guarantees data integrity. Does snappy need several network packets to decode the complete data? Or could this be caused by a problem with the network card, where the hardware fails to verify the data correctly?
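On the packet question: TCP delivers a byte stream without message boundaries, so a single Read may return only part of what the sender wrote, and a decoder has to keep reading until it has a whole chunk. The sketch below illustrates that behaviour with the standard library only. It is an assumption-laden stand-in: iotest.OneByteReader simulates a connection that hands over one byte per Read, and singleRead and fullRead are hypothetical helper names.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// singleRead issues exactly one Read with a bufSize-byte buffer against a
// reader that yields one byte per call, and returns the byte count.
func singleRead(payload string, bufSize int) int {
	r := iotest.OneByteReader(strings.NewReader(payload))
	n, _ := r.Read(make([]byte, bufSize))
	return n
}

// fullRead uses io.ReadFull, which keeps calling Read until the buffer is
// full or the stream ends.
func fullRead(payload string, bufSize int) (int, error) {
	r := iotest.OneByteReader(strings.NewReader(payload))
	return io.ReadFull(r, make([]byte, bufSize))
}

func main() {
	fmt.Println(singleRead("12345678", 8)) // a single Read may be short
	n, err := fullRead("12345678", 8)
	fmt.Println(n, err) // ReadFull collects the whole buffer
}
```

A decoder that loops with io.ReadFull is therefore not affected by how the data happens to be split across packets; a short read only becomes an error when the stream actually ends.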
Is this caused by a memory problem?
I also found some discussions online that suspect this is caused by a memory overflow or an insufficient runtime memory limit, but I cannot confirm this, since we don't get any other error messages.
I'd appreciate any help or suggestions on how to debug. Thanks!