frederikaalund closed this issue 6 years ago
What is the exact reason for dropping the connections? Timeouts on sending/receiving data?
If both client and server negotiate that they want and are able to use compression -- why force disabling it?
I don't see why aiohttp should make assumptions about data content. Binary data could be random, but very often it is structured and can be compressed. You can send random text over a websocket as well :)
If you don't need compression -- just disable it.
The parameter's default value is effectively frozen. To avoid making a mess, we should keep it consistent across all aiohttp 3.x releases.
We can return to the question during aiohttp 4.0 development, but I'm still not convinced that we should. I see very many cases where compression is useful (maybe with lower throughput).
Thanks for the quick reply.
What is the exact reason for dropping the connections? Timeouts on sending/receiving data?
Timeout on send_bytes. I've wrapped send_bytes with async_timeout.timeout(5).
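For context, the send path looks roughly like the sketch below (a minimal server-side reproduction, not my actual application code; the route and payload are placeholders):

```python
# Minimal sketch of the failing send path; route and payload are placeholders.
import os

import async_timeout
from aiohttp import web


async def ws_handler(request: web.Request) -> web.WebSocketResponse:
    ws = web.WebSocketResponse()  # compression is enabled by default since 2.3.0
    await ws.prepare(request)

    payload = os.urandom(500 * 1024)  # ~500 KB of hard-to-compress binary data
    # With compression enabled, this send can become CPU-bound and overrun
    # the timeout, after which the connection ends up being dropped.
    async with async_timeout.timeout(5):
        await ws.send_bytes(payload)

    await ws.close()
    return ws


app = web.Application()
app.add_routes([web.get("/ws", ws_handler)])

if __name__ == "__main__":
    web.run_app(app)
```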
Binary data could be random, but very often it is structured and can be compressed.
Emphasis mine. If that is indeed the case, then just go ahead and close this issue. In my experience, with the data that I'm used to seeing, binary data is very often near-random (and therefore incompressible). Only rarely do I send structured binary data.
I haven't done any studies of the usual structure (or non-structure) of binary data in HTTP-based applications. It may very well be that the structured binary case is the common case (and that my case is rare).
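To illustrate the point, here is a small zlib comparison (my own sketch, not aiohttp code) of near-random bytes versus structured text:

```python
# Near-random binary payloads gain essentially nothing from deflate,
# while structured text shrinks a lot.
import json
import os
import zlib

random_payload = os.urandom(500 * 1024)  # stand-in for near-random binary data
text_payload = json.dumps({"values": list(range(50_000))}).encode()

for name, data in [("random bytes", random_payload), ("JSON text", text_payload)]:
    ratio = len(zlib.compress(data)) / len(data)
    print(f"{name}: compressed to {ratio:.1%} of original size")
```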
We can return to the question during aiohttp 4.0 development, but I'm still not convinced that we should. I see very many cases where compression is useful (maybe with lower throughput).
That makes a lot of sense. Let's see if others are having the same issue that I had. If not, just keep things as they are. Closing this for now.
@asvetlov I've noticed that aiohttp uses the default level for zlib.compressobj, which is 6. I think that is too high for transport compression. We should set it to 1 (Z_BEST_SPEED). Setting the compression level to Z_BEST_SPEED roughly doubles throughput for common text data like JSON, with only about a 2% larger result.
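As a rough, self-contained comparison (my own sketch, not aiohttp's code path), compressing JSON-like text at level 1 versus the zlib default of 6:

```python
# Compare zlib level 6 (the zlib default) with level 1 (Z_BEST_SPEED) on
# JSON-like text. Exact numbers will vary by machine and payload.
import json
import time
import zlib

payload = json.dumps(
    [{"id": i, "name": f"user-{i}", "active": True} for i in range(10_000)]
).encode()

for level in (6, zlib.Z_BEST_SPEED):  # Z_BEST_SPEED == 1
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {elapsed * 1000:.1f} ms, "
          f"{len(compressed) / len(payload):.1%} of original size")
```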
There is a lot of binary data that is already compressed (e.g. JPEG images, MP4 videos, MP3 audio, ZIP archives, etc.). Processing such data wastes a lot of CPU time, and the user should disable compression explicitly (even Z_BEST_SPEED is too slow).
What does https://tools.ietf.org/html/rfc7692 recommend? What do other implementations do?
Compressing already-compressed data is pointless, you are right. I think it is up to the user to disable compression when sending such payloads. The websocket library itself has no knowledge of the nature of the data.
@asvetlov Yes, you are right. Users are responsible for handling these situations.
The RFC says nothing about the compression level; I think it is just an implementation detail, because it is transparent to the decompression side.
Syntax: gzip_comp_level level;
Default: gzip_comp_level 1;
Context: http, server, location
Sets the gzip compression level of a response. Acceptable values are in the range from 1 to 9.
The nginx gzip module uses level 1 by default.
Level 1 makes sense. Does anybody want to make a pull request?
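For anyone picking this up, the change would amount to constructing the compressor with Z_BEST_SPEED instead of the default level. A hypothetical illustration of the relevant zlib call (not aiohttp's actual implementation):

```python
import zlib

# Hypothetical illustration, not aiohttp's actual code: build a deflate
# compressor with level 1 instead of the zlib default (6). Negative wbits
# requests a raw deflate stream, as permessage-deflate uses.
compressor = zlib.compressobj(level=zlib.Z_BEST_SPEED, wbits=-zlib.MAX_WBITS)
```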
Long story short
I recently upgraded from aiohttp 2.2.5 to 3.0.6. After that, I got a lot of dropped websocket connections. Through investigation, I found that 2.3.0 introduced the compression=True parameter for websockets. It turns out that this change caused the dropped connections. Simply setting compression=False (and thereby returning to the pre-2.3.0 behaviour) solved the issue.

I can understand that having compression on by default makes a lot of sense for text messages (e.g., send_str or send_json messages). We can naturally assume statistical redundancy for text messages. However, I find it strange that compression is on by default for binary messages (e.g., send_bytes messages). I don't think we can assume anything about binary messages. Such messages may very well be near-random (and therefore compress very poorly).

In my application, I send a lot of binary data over a websocket (roughly 500 KBps) that doesn't compress well at all. The (needless) compression caused the Python process to take up all CPU resources and, ultimately, drop the connection due to a timeout.
Expected behaviour
I expected send_bytes to have compression off by default.

Actual behaviour
Since 2.3.0, WebSocketResponse has compression on by default. Since 3.0.0, it is possible to disable compression on a per-method basis (send_bytes(..., compression=False)). However, compression is on by default, which I think is very questionable.
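For reference, a minimal sketch of turning compression off for a handler, which is what resolved the issue for me. I'm using the keyword compress as in the aiohttp 3.x docs (the paragraph above spells it compression), so double-check the name against your version; the route and payload are placeholders:

```python
# Disable permessage-deflate for the whole websocket connection,
# restoring the pre-2.3.0 behaviour. Route and payload are placeholders.
from aiohttp import web


async def ws_handler(request: web.Request) -> web.WebSocketResponse:
    ws = web.WebSocketResponse(compress=False)  # no compression on this connection
    await ws.prepare(request)

    await ws.send_bytes(b"\x00" * 1024)  # sent uncompressed

    await ws.close()
    return ws


app = web.Application()
app.add_routes([web.get("/ws", ws_handler)])

if __name__ == "__main__":
    web.run_app(app)
```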