Closed by louh 7 years ago
CloudFront can only dynamically compress files that are smaller than 10 MB, so that won't help with these large files.
But what we can probably do with S3 is set the content-type metadata on each file to application/json and the content-encoding metadata on each file to gzip.
@louh try fetching https://speed-extracts.s3.amazonaws.com/2017/0/0/002/415.json (I just manually set the headers on that file).
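For reference, here is a minimal sketch of what setting those two pieces of S3 metadata at upload time could look like. It is an illustration, not the export process actually used: the helper name `build_s3_put_args` and the sample payload are hypothetical, and the upload itself assumes boto3 is installed and credentials are configured, so it is left as a comment.

```python
import gzip
import json


def build_s3_put_args(bucket: str, key: str, payload: dict) -> dict:
    """Gzip a JSON payload and build keyword arguments for an S3 upload.

    ContentType and ContentEncoding correspond to the content-type and
    content-encoding metadata discussed in this thread.
    """
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": gzip.compress(raw),  # the stored object is already gzipped
        "ContentType": "application/json",
        "ContentEncoding": "gzip",
    }


# Bucket and key are taken from the URL above; the payload is made up.
args = build_s3_put_args("speed-extracts", "2017/0/0/002/415.json", {"segments": []})

# Assumption: boto3 is available and credentials are configured.
#   import boto3
#   boto3.client("s3").put_object(**args)
```

Because S3 returns the stored `Content-Encoding: gzip` header with the object, a browser fetching it should decompress the body transparently.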
Closing this, since PBF data exports from Datastore are gzip'ed.
From a test I did earlier today, a server that gzips tiles on-the-fly gives us savings of more than 90% on the transferred file size:
If you look on the right side, the top number is the actual size transferred over the wire; the bottom number is the uncompressed size. Note that these were the original JSON files with unmangled properties. I did a similar test with the mangled properties, and the resulting compressed file sizes remained pretty close, because gzip replaces repeated strings (such as property names) with short back-references.
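The point about mangled versus unmangled properties can be checked with a small experiment. This is only a sketch on synthetic data: the property names, the record shape, and the record count are all hypothetical and are not the real tile schema, so the actual ratios for the real tiles will differ.

```python
import gzip
import json
import random

# Hypothetical property names; the "mangled" set stands in for shortened keys.
LONG_KEYS = ["segmentId", "averageSpeed", "observationCount", "hourOfWeek"]
SHORT_KEYS = ["a", "b", "c", "d"]


def make_records(keys, n=5000):
    """Build n synthetic records with identical values for any key set."""
    rng = random.Random(0)  # fixed seed so both key sets get the same values
    return [
        {
            keys[0]: rng.randrange(10**6),
            keys[1]: rng.randrange(120),
            keys[2]: rng.randrange(50),
            keys[3]: rng.randrange(168),
        }
        for _ in range(n)
    ]


def sizes(keys):
    """Return (raw_bytes, gzipped_bytes) for the records serialized as JSON."""
    raw = json.dumps(make_records(keys), separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


raw_long, gz_long = sizes(LONG_KEYS)
raw_short, gz_short = sizes(SHORT_KEYS)

print(f"unmangled: {raw_long} bytes raw, {gz_long} bytes gzipped")
print(f"mangled:   {raw_short} bytes raw, {gz_short} bytes gzipped")
```

The gap between the two gzipped sizes should come out much smaller than the gap between the two raw sizes, since the repeated key strings cost very little once compressed.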
This is significantly better for download performance (though it says nothing yet about memory or processing performance). Before, a user had to wait about 3 minutes to download 600 MB; now it takes about 20-30 seconds to download 60 MB. This is excellent. My goal was to get our download sizes down to about 60 MB per request. Even if we were to spend a bunch of time rejiggering how we roll up data, or determining which properties to include, we would probably get, at best, 50% of the way there. By gzipping the tiles, we get 90% savings immediately with only a tweak to the infrastructure.
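As a quick sanity check on those numbers, here is the arithmetic written out. It assumes throughput stays constant between the two downloads, which is a simplification; the helper name is hypothetical.

```python
def estimated_seconds(size_mb: float, throughput_mb_per_s: float) -> float:
    """Estimate download time at a constant throughput."""
    return size_mb / throughput_mb_per_s


# Throughput implied by downloading 600 MB in about 3 minutes (180 seconds).
throughput = 600 / 180  # roughly 3.3 MB/s

compressed_time = estimated_seconds(60, throughput)  # time for the 60 MB download
savings = 1 - 60 / 600  # fraction of bytes saved on the wire

print(f"{compressed_time:.0f} s for 60 MB, {savings:.0%} fewer bytes transferred")
```

At the same throughput, 60 MB works out to about 18 seconds, which is in line with the 20-30 seconds observed once request overhead is included.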
Please note that this does not mean serving files that were gzipped manually. The browser automatically decompresses responses that were transmitted over the wire with gzip encoding. If the files themselves are transmitted as gzipped files, the browser does not automatically decompress them, and you would then need something in JavaScript to gunzip and parse each file client side, which is not optimal. Therefore, we must have the server serve files with gzip compression turned on, which is not the same as having the export process create gzipped files.
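To make the distinction concrete, here is a rough sketch of what "gzip compression turned on" means at the HTTP level. It is not the configuration of any particular server: `encode_response` is a hypothetical helper, and it ignores `q=` weights in `Accept-Encoding` for brevity.

```python
import gzip
from typing import Dict, Tuple


def encode_response(body: bytes, accept_encoding: str) -> Tuple[Dict[str, str], bytes]:
    """Compress the body only when the client advertises gzip support.

    The Content-Encoding header is what tells the browser to decompress
    the payload transparently before handing it to JavaScript.
    """
    headers = {"Content-Type": "application/json", "Vary": "Accept-Encoding"}
    # Simplified parsing: strip any ";q=..." parameter and compare names only.
    offered = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    if "gzip" in offered:
        headers["Content-Encoding"] = "gzip"
        payload = gzip.compress(body)
    else:
        payload = body  # client did not ask for gzip, send the bytes as-is
    headers["Content-Length"] = str(len(payload))
    return headers, payload


headers, payload = encode_response(b'{"speed": 42}', "gzip, deflate, br")
```

A response built this way reaches client code as plain JSON. Gzipped bytes sent without the `Content-Encoding: gzip` header would instead arrive still compressed and need manual decompression in JavaScript.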
In summary, here are my recommendations for now: