I'm sorry if this is flat-out wrong or just too unimportant to fix; I'm new
here.
Encode allocates MaxEncodedLen(len(src)) bytes for dst. However, the (very
well-explained) scenario motivating the formula in MaxEncodedLen(srcLen)
involves an encoder emitting 5-byte-long copy elements that each copy only four
bytes. snappy-go can never hit that scenario: regardless of block length, it
emits copy elements at most 3 bytes long, each copying at least four bytes.
Given the restrictions on what snappy-go emits, the worst case I can see is a
3-byte-long copy element that copies four bytes, followed by a 2-byte-long
literal element with length 61, followed by the literal bytes, repeated for the
whole block: then it takes 66 bytes to encode every 65 bytes. Add three bytes
for the length, and 64K encodes to 65536*(66/65)+3=66547 bytes, 9943 fewer
than the 76490 bytes actually allocated (32+65536+65536/6). Maybe someone can
find a worse case.
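The arithmetic above can be sketched as a small comparison. `worstCaseBound` is my name for the bound argued here, not anything in snappy-go; `maxEncodedLen` mirrors the formula snappy-go uses (the same as C++ snappy's MaxCompressedLength):

```go
package main

import "fmt"

// maxEncodedLen mirrors the formula behind snappy-go's MaxEncodedLen:
// 32 + srcLen + srcLen/6.
func maxEncodedLen(srcLen int) int {
	return 32 + srcLen + srcLen/6
}

// worstCaseBound is the bound argued above: every 65 source bytes cost
// at most 66 encoded bytes (a 3-byte copy of 4 bytes, a 2-byte literal
// header, and 61 literal bytes), plus 3 bytes for the block length.
// The name and the integer-division rounding are assumptions of this
// sketch, not part of snappy-go.
func worstCaseBound(srcLen int) int {
	return srcLen*66/65 + 3
}

func main() {
	fmt.Println(worstCaseBound(65536)) // the 66547 figure from above
	fmt.Println(maxEncodedLen(65536)) // what Encode actually allocates
}
```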
If snappy-go did, in the future, emit 5-byte copy elements, presumably it still
wouldn't do so in a 64K block, because offsets within a 64K block always fit in
two bytes.
(If MaxEncodedLen is meant to represent the worst possible blowup under the
Snappy encoding itself, rather than the worst possible blowup with snappy-go or
other currently-existing encoders, note that the encoding appears to allow a
5-byte-long copy element to copy only one byte, so the absolute most
pessimistic upper bound may be len(src)*5 plus space for the preamble.)
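For concreteness, that absolute worst case can be sketched as below. The function name is hypothetical, and the 5-byte preamble allowance is my assumption (a varint-encoded uncompressed length of up to 2^32-1 fits in 5 bytes):

```go
package main

import "fmt"

// pessimisticBound is a hypothetical absolute upper bound for any
// conforming Snappy encoder: if every source byte were emitted as a
// 5-byte copy element copying a single byte, each source byte would
// cost 5 encoded bytes. Add up to 5 bytes for the varint preamble
// holding the uncompressed length (an assumption of this sketch).
func pessimisticBound(srcLen int) int {
	return srcLen*5 + 5
}

func main() {
	fmt.Println(pessimisticBound(65536)) // far above MaxEncodedLen's 76490
}
```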
Original issue reported on code.google.com by twotwo...@gmail.com on 6 Aug 2014 at 6:18