centerclick / feedback

Issues, Bug Reports, and Feature Requests

Webserver output should use compression #85

Closed tlhackque closed 1 year ago

tlhackque commented 1 year ago

I noticed that the client list webpage is about 2MB (for the max 5000 clients shown). Looks highly compressible.

But the response headers don't show any Content-Encoding, even though the request sent Accept-Encoding: gzip, deflate. That doesn't matter much on a local network, but anyone remote (and especially on a mobile data plan) would notice.

Downloading the full client list as a CSV (~9MB currently) would also benefit from compression.

Should be easy to turn on: lighttpd mod_deflate can do this for text/html and text/css.

JavaScript too, if you ever start using it (e.g. for an admin interface).

For a rough estimate, I saved the source of my current clients page to a file; gzip compressed it about 9.5:1.

For a CSV download (~42K clients), I got ~4:1 compression from gzip.

(mod_deflate and the browser will negotiate what compression is used; some do better than gzip.)

ls -l clients.tmp
-rw-r--r-- 1 root root 1812542 Dec 29 16:55 clients.tmp
ls -lh clients.tmp
-rw-r--r-- 1 root root 1.8M Dec 29 16:55 clients.tmp
gzip clients.tmp
ls -l clients.tmp.gz
-rw-r--r-- 1 root root 191664 Dec 29 16:55 clients.tmp.gz
ls -lh clients.tmp.gz
-rw-r--r-- 1 root root 188K Dec 29 16:55 clients.tmp.gz

ls -lh download.csv
-rwxr--r-- 1 root root 8.9M Dec 29 17:01 download.csv
gzip download.csv;  ls -lh download.csv.gz
-rwxr--r-- 1 root root 2.2M Dec 29 17:01 download.csv.gz
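The same kind of estimate can be scripted. This is only a rough sketch: the rows below are synthetic stand-ins for the real client page (the field layout is an assumption, not the actual page markup), and `gzip_ratio` is a hypothetical helper name.

```python
import gzip

def gzip_ratio(data: bytes, level: int = 6) -> float:
    """Return uncompressed size divided by gzip-compressed size."""
    compressed = gzip.compress(data, compresslevel=level)
    return len(data) / len(compressed)

# Synthetic stand-in for a client-list page: repetitive HTML table rows.
# The real page layout is not known here; this only mimics its redundancy.
rows = [
    f"<tr><td>client{i:05d}</td><td>192.168.{i % 256}.{i // 256}</td><td>ok</td></tr>\n"
    for i in range(5000)
]
page = "".join(rows).encode("utf-8")

print(f"{len(page)} bytes, ratio {gzip_ratio(page):.1f}:1")
```

The ratio on real data will differ; the point is only that table-like HTML and CSV tend to be highly redundant.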

You might also consider putting an Expires header on the status images so they're cached by the browser; mod_expires can do that.

dave4445 commented 1 year ago

I don't think this is possible with server.stream-response-body = 2, which is needed to stream data rather than buffer it. Without streaming, lighttpd will keep the entire response in RAM to compute the Content-Length. That does not work with 1M clients for JSON or CSV export, as it may result in OOM or a DoS.

tlhackque commented 1 year ago

I'm not a lighttpd expert (my world is apache with a few embedded systems), but it's not obvious why compression can't be made to work.

I did notice that you're using Transfer-Encoding: chunked; that must be to avoid content length.

The Content-Length header is optional if you close the connection at the end of data. Can lighttpd do that? It would be a worthwhile trade-off for the client page and data downloads...
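For reference, here is a minimal sketch of how chunked framing lets a body be sent without knowing its total length up front. It illustrates the HTTP/1.1 wire format only, not lighttpd's implementation; `chunked_encode` is a hypothetical helper name.

```python
from typing import Iterable, Iterator

def chunked_encode(pieces: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each piece as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would be read as end of body
        yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    # A zero-size chunk followed by a blank line terminates the body.
    yield b"0\r\n\r\n"

body = b"".join(chunked_encode([b"hello, ", b"world"]))
print(body)
```

Each chunk carries its own length, so the sender never needs the total, which is why no Content-Length is required.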

gzip doesn't need the content length; it can compress a stream as it arrives. There's no reason for lighttpd to hold the uncompressed data, so if it really needs Content-Length, the required buffer would be the compressed size. I haven't hit 1M clients, so I don't know what that translates to. The mod_deflate doc isn't clear, but some experimentation (perhaps with a debugger) should resolve the matter.
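To illustrate the streaming point, here is a minimal sketch using Python's zlib. It says nothing about what mod_deflate does internally; `gzip_stream` and `fake_csv_rows` are hypothetical names, and the row format is made up for the example.

```python
import gzip
import zlib
from typing import Iterable, Iterator

def gzip_stream(chunks: Iterable[bytes], level: int = 6) -> Iterator[bytes]:
    """Yield gzip-compressed pieces as input chunks arrive.

    Neither the full input nor the full output is buffered here; only the
    compressor's own internal state is kept between chunks.
    """
    # wbits=31 (16 + 15) selects the gzip container with a 32 KiB window.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may hold data back until it has enough
            yield out
    yield compressor.flush()  # remaining data plus the gzip trailer

def fake_csv_rows(n: int) -> Iterator[bytes]:
    """Hypothetical row source standing in for the real CSV export."""
    for i in range(n):
        yield f"{i},client{i},192.168.0.{i % 256}\n".encode("ascii")

original = b"".join(fake_csv_rows(10000))
compressed = b"".join(gzip_stream(fake_csv_rows(10000)))

# Round-trip check: the streamed output is a valid gzip file.
assert gzip.decompress(compressed) == original
print(len(original), "->", len(compressed))
```

The output pieces could be handed to a chunked response as they are produced, so the total compressed size never has to be known in advance.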

For the 1M client case, you could send an actual .gz (or .zip) file. That would require the user to unzip it, but since you're sending the big data as an attachment, it shouldn't be a big issue for users. They may prefer to store it compressed. Or they can expand on receipt, e.g. curl | gunzip >expanded. I expect you can stream that through a pipe to lighttpd via gzip as easily as the uncompressed.

I guess this needs some more investigation. But not tonight for me.

Speaking of DoS: do you have a limit on the number of simultaneous HTTP connections? If not, you should. It's probably easier to manage as a maximum number of login sessions, but in any case unlimited connections are challenging for embedded systems.