compress/flate: api to configure sliding window size

ukai commented 12 years ago

It seems there is no way to set window size in current compress/flate API. (it looks
const value now)

In http://tools.ietf.org/html/draft-tyoshino-hybi-websocket-perframe-deflate-05, there
is a parameter to limit the LZ77 sliding window size.
I think I need such API in go.net/websocket.

Could you add some API to set window size in future (after Go1)?

When tip added new API, could I use the package in Go1 with
import "code.google.com/p/go/compress/flate"
or so?

robpike commented 12 years ago

Comment 1:

Labels changed: added priority-later, removed priority-triage.

Status changed to Accepted.

rsc commented 12 years ago

Comment 2:

Labels changed: added go1.1maybe.

robpike commented 11 years ago

Comment 3:

Labels changed: removed go1.1maybe.

rsc commented 11 years ago

Comment 4:

[The time for maybe has passed.]

rsc commented 11 years ago

Comment 5:

Is this still needed for websocket?

Labels changed: added go1.2maybe.

rsc commented 11 years ago

Comment 6:

Labels changed: added feature.

robpike commented 11 years ago

Comment 7:

Won't happen for 1.2

Labels changed: removed go1.2maybe.

rsc commented 10 years ago

Comment 8:

Labels changed: added go1.3maybe.

rsc commented 10 years ago

Comment 9:

Labels changed: removed feature.

rsc commented 10 years ago

Comment 10:

Labels changed: added release-none, removed go1.3maybe.

rsc commented 10 years ago

Comment 11:

Labels changed: added repo-main.

rnapier commented 9 years ago

The FLATE algorithm can also adjust its memLevel (the zlib name; compress/flate calls it hashBits). This also is important to be able to scale. Both of these are very important for reducing memory usage when dealing with many connections. See https://groups.google.com/d/msg/golang-nuts/DDz3-X1JNVI/vBdjTT1tvYoJ for more discussion.

Note that these will impact compress/zlib, which currently has CINFO hardcoded as 0x78. That interface needs to support modifying window bits as well.

It may be helpful to make maxStoreBlockSize configurable as well. It currently allocates 64k per instance in the NoCompression case.

Servers implementing http2 and spdy can generate thousands or tens of thousands of long-lived zlib writers and readers, so minimizing memory usage can be critical.

bradfitz commented 8 years ago

/cc @dsnet (per golang-nuts ping of this bug)

jeffreydwalter commented 6 years ago

@bradfitz are there any plan to move forward with this? zlib has fixed several of it's outstanding issues around windowBits.

bradfitz commented 6 years ago

@jeffreydwalter, no clue. @dsnet owns this.

jeffreydwalter commented 6 years ago

@bradfitz thanks. @dsnet is this on the roadmap?

dsnet commented 6 years ago

I'm not actively working on this.

It's not clear to me whether users want the ability to configure the sliding window size for the encoder side, decoder side, or both. Secondly, the current flate API is not easily extendable to new API options. We could add a NewReaderDictWindow or NewWriterDictWindow, but it's starting to get into ugly territory.

jeffreydwalter commented 6 years ago

I think it makes sense to have tunability for both. Basically, parity with nodejs (the ability to specify windowBits and memLevel) would be nice https://nodejs.org/api/zlib.html#zlib_memory_usage_tuning

marcellanz commented 5 years ago

I'd have some time to look into this. Is there some current thinking in which direction it could/should go regarding the API surface?

molon commented 5 years ago

zlib cgo wrapper github.com/molon/zlib

marcellanz commented 5 years ago

@rnapier I did some investigation for this issue starting with your test here https://github.com/rnapier/ztest and your comment on the mailing list in 2015:

I'm still digging into how to address this, consulting the equivalent zlib code. Simply reducing maxMatchLength to 128 seems to fix it, but zlib doesn't require this hack, so I'm thinking that's the wrong direction.

I tried to understand the code and went over to zlib's implementation you mentioned as a reference.

zlib seems to have a little secret regarding the window size as it doesn't support a value for MAX_WBITS of 8 for the window size AFAIKS: https://github.com/madler/zlib/blob/v1.2.11/deflate.c#L303

where it states that: if (windowBits == 8) windowBits = 9; /* until 256-byte window bug fixed */

This seems to be document in issues for zlib https://github.com/madler/zlib/issues/171 and also in the zlib.net manual https://www.zlib.net/manual.html under "Advanced Functions"

For the current implementation of deflate(), a windowBits value of 8 (a window size of 256 bytes) is not supported. As a result, a request for 8 will result in 9 (a 512-byte window). In that case, providing 8 to inflateInit2() will result in an error when the zlib header with 9 is checked against the initialization of inflate(). The remedy is to not use 8 with deflateInit2() with this initialization, or at least in that case use 9 with inflateInit2().

And as you wrote

but zlib doesn't require this hack

its because it doesn't implement the feature properly.

@rnapier, @dsnet So in consequence what would be the recommendation to go on with this issue?

zlib seems to have a bug for a windowBits of 8 and upgrades it to 9. I could invest some time to fix that, but I'm not sure if this is worth the time and perhaps its enough to be compatible with zlib's implementation and capabilities.

Setting logWindowSize of 9 in deflate.go seems to be at least OK for the current tests in compress/flate as I can see.

marcellanz commented 5 years ago

Setting logWindowSize of 9 in deflate.go seems to be at least OK for the current tests in compress/flate as I can see.

regarding tests, I re-checked with tip go version devel +930d6ecb69 and set logWindowSize = 9 and found two test failing but deflate not panic

TestDeflateInflateString
TestReaderEarlyEOF

I haven't found yet if these two fail because of implicit expectations in the implementation about the window size not covered by the contant logWindowSize set.

marcellanz commented 5 years ago

I haven't found yet if these two fail because of implicit expectations in the implementation about the window size not covered by the contant logWindowSize set.

TestReaderEarlyEOF fails because of its expectations of the length of a sample compressed:

logWindowSize = 14, level: 1, len(compress(data)) = 220422 > limit = 218338
logWindowSize = 13, level: 1, len(compress(data)) = 223273 > limit = 218338
logWindowSize = 12, level: 1, len(compress(data)) = 227951 > limit = 218338
logWindowSize = 11, level: 1, len(compress(data)) = 234814 > limit = 218338
logWindowSize = 10, level: 1, len(compress(data)) = 244619 > limit = 218338
logWindowSize =  9, level: 1, len(compress(data)) = 258838 > limit = 218338

For TestReaderEarlyEOF I have a change but I don't yet understand why it works, at least why now all test seem to be fine with a logWindowSize in the range of [9,15]

golang / go

compress/flate: api to configure sliding window size #3155