FreifunkFranken / fff-monitoring

Freifunk Franken Monitoring
https://monitoring.freifunk-franken.de/

Router pages are huge #228

Open lemmi opened 10 months ago

lemmi commented 10 months ago

A single initial load of a router page clocks in at about 3MB of data in total. Thanks to compression of the main page, only 1.5MB is actually transferred over the wire. Each reload after that comes in at about 1MB.

There are a couple of low-hanging fruit that can be picked for some easy improvements:

precompress assets

The assets are currently transferred uncompressed. Serving them with a compressed transfer encoding would save at least 0.5MB on the initial page load:

Encoding       Transferred Size
Uncompressed   ~750KB
gzip           ~200KB
brotli         ~170KB
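
As a rough illustration of how this could be done ahead of time (a sketch only; the static/ directory, the file types and the optional brotli Python bindings are assumptions, not part of the current setup), the assets could be precompressed once at deploy time so the web server only has to hand out a matching .gz/.br file:

import gzip
import pathlib

try:
    import brotli  # optional third-party bindings (pip install brotli)
except ImportError:
    brotli = None

# Hypothetical asset directory; adjust to wherever the monitoring serves its static files from.
ASSET_DIR = pathlib.Path("static")

for asset in ASSET_DIR.rglob("*"):
    if asset.suffix not in {".js", ".css", ".svg", ".html"}:
        continue
    data = asset.read_bytes()

    # foo.js -> foo.js.gz; most web servers can serve such precompressed siblings directly.
    asset.with_suffix(asset.suffix + ".gz").write_bytes(gzip.compress(data, compresslevel=9))

    if brotli is not None:
        # Per the table above, brotli saves roughly another 15% compared to gzip.
        asset.with_suffix(asset.suffix + ".br").write_bytes(brotli.compress(data, quality=11))

With the files in place, a server like nginx can deliver them without any runtime compression cost (gzip_static, or brotli_static from the ngx_brotli module).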

use a more appropriate data format for /api/load_netif_stats/

On each page load, at least one call to /api/load_netif_stats/ is made. A single request requires 500kB-600kB. This can easily be brought down by providing another api endpoint that serves a different format. A fitting choice could be Apache Parquet. It is well supported in multiple languages, especially in Python and JS. Just using the included delta encoding brings the size down to 60kB. Additionally, enabling compression can improve this further to 50kB at the cost of more overhead. Integration should be very easy. Here is the small test program I used to compare the sizes:

from pyarrow import json
import pyarrow.parquet as pq

# Load the JSON payload for one interface as currently served by /api/load_netif_stats/
table = json.read_json('br-client.json')

# Re-encode as Parquet: no dictionary, no compression, just delta encoding of the columns.
pq.write_table(table, 'br-client.parquet', use_dictionary=False,
               compression='NONE', column_encoding='DELTA_BINARY_PACKED')
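
To sanity-check the numbers above, the on-disk sizes can be compared and the Parquet file read back with pyarrow (file names as in the test program):

import os
import pyarrow.parquet as pq

# Compare the size of the original JSON dump and the Parquet re-encoding.
for name in ('br-client.json', 'br-client.parquet'):
    print(name, os.path.getsize(name), 'bytes')

# Reading the data back is a one-liner, so a client or a test can verify nothing was lost.
table = pq.read_table('br-client.parquet')
print(table.num_rows, 'rows,', table.num_columns, 'columns')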

split router stats into api

The delivered html embeds a huge portion of the stats inline as javascript variables here. This is problematic for several reasons.
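
A sketch of the direction this could take (all names here are hypothetical: the route, the Flask-style handler and the load_router_stats helper do not exist in the current code): expose the per-router stats as a separate JSON endpoint and let the page fetch it, instead of rendering the data into the html:

from flask import Flask, jsonify

app = Flask(__name__)

def load_router_stats(mac):
    # Placeholder: would query whatever currently fills the inline javascript variables.
    return {"mac": mac, "netif_stats": []}

# Hypothetical endpoint; the template would then only contain a small script that fetches this URL.
@app.route('/api/router_stats/<mac>')
def router_stats(mac):
    return jsonify(load_router_stats(mac))

This keeps the html itself small and mostly static, which is also what makes the caching ideas below possible.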


Once these or similar changes are made (a different file format might turn out to be better suited, for example), there is another option to vastly improve the server load and transfer sizes.

caching

Historic data will not change, so there is no reason to keep resending all of it. Instead, caching should be used very deliberately.

A simple scheme to achieve this could be the following: rather than performing a single request to /api/load_netif_stats/XYZ, the client instead makes multiple requests, one per time range:

/api/load_netif_stats/XYZ                   # still dynamically generated, but only covers up to the last hour
/api/load_netif_stats/2006-01-02T04:00-XYZ  # all data for 2 January 2006, 4:00 to 5:00
/api/load_netif_stats/2006-01-02T03:00-XYZ  # same, but one hour earlier
/api/load_netif_stats/2006-01-02T02:00-XYZ
/api/load_netif_stats/2006-01-02T01:00-XYZ
/api/load_netif_stats/2006-01-02T00:00-XYZ
/api/load_netif_stats/2006-01-01-XYZ        # all data for 1 January 2006
/api/load_netif_stats/2005-12-XYZ           # all data for December 2005

Everything except the first request can be heavily cached on the client, potentially forever. The server, in turn, only needs to serve data for recent events dynamically and can generate each historic chunk once.
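
As a sketch of the server side of such a scheme (assuming a Flask-style handler; the route layout and the load_stats_for_hour helper are made up for illustration), the essential part is marking the historic chunks as immutable:

from datetime import datetime
from flask import Flask, jsonify

app = Flask(__name__)

def load_stats_for_hour(mac, hour_start):
    # Placeholder: would query one hour of netif stats from the database.
    return {"mac": mac, "from": hour_start.isoformat(), "stats": []}

# Hypothetical route for one hourly chunk, e.g. /api/load_netif_stats/2006-01-02T04:00-XYZ
@app.route('/api/load_netif_stats/<hour>-<mac>')
def netif_stats_hour(hour, mac):
    hour_start = datetime.strptime(hour, '%Y-%m-%dT%H:%M')
    resp = jsonify(load_stats_for_hour(mac, hour_start))
    # The data behind this URL never changes, so the client may cache it essentially forever.
    resp.headers['Cache-Control'] = 'public, max-age=31536000, immutable'
    return resp

The daily and monthly variants would work the same way; only the first, still-dynamic request needs a short max-age (or none at all).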

With this, a page reload should come in at only about 60kB uncompressed, or 7kB (!) compressed, for the html, plus one additional request for the most recent historic data, which should be on the order of a couple of hundred bytes to a few kilobytes.

adschm commented 9 months ago

On a quick look, most of these ideas are valid.

I do not consider the page size as dramatic as lemmi states, but that does not mean we shouldn't pick some of the low-hanging fruit.

Still, somebody will have to invest considerable time in it, and due to the design of the monitoring even "easy" changes might not be so quick after all.