JobLeonard commented 8 years ago

We currently don't do either:

(note how other parts of the website are nicely cached)

With the example dataset, gzip would reduce the size from 7 MB to 1.7 MB (based on the default compression settings of my Linux compression tool), and caching would of course remove the need to fetch the whole thing again. Both are probably worth turning on.

GZIP

An object can generally be served in gzip format if you store it with gzip compression and set its contentEncoding property to gzip. You should preserve the object's original contentType property (e.g., text/plain). Then, in order to receive a gzip-encoded response, the Accept-Encoding header must contain the associated token -- otherwise the content is likely to get decompressed at serving time. For example, a properly formed HTTP request header for receiving a gzip formatted response is:

Accept-Encoding: gzip

https://cloud.google.com/storage/docs/json_api/v1/how-tos/performance

The request header is already set, so the issue is probably that we don't store the files in gzip form?
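
To make the quoted requirement concrete, here is a minimal sketch of what "store it with gzip compression" could look like on our side. It only uses the Python standard library; `prepare_gzip_object` and the sample payload are hypothetical names made up for illustration, and the actual upload call is left out because it depends on which storage client we use.

```python
import gzip
import json


def prepare_gzip_object(payload: bytes, content_type: str, level: int = 9):
    """Compress a payload once and return it with the metadata to store alongside it.

    The metadata keys mirror the property names in the quoted docs:
    contentEncoding marks the stored bytes as gzip, while contentType
    keeps describing the original, uncompressed content.
    """
    compressed = gzip.compress(payload, compresslevel=level)
    metadata = {
        "contentType": content_type,  # preserved original type
        "contentEncoding": "gzip",    # tells the server the stored bytes are gzipped
    }
    return compressed, metadata


# Hypothetical stand-in for one of our JSON datasets (assumption, not real data).
sample = json.dumps({"rows": [[i, i % 7] for i in range(20000)]}).encode("utf-8")
body, meta = prepare_gzip_object(sample, "application/json")
print(len(sample), "bytes ->", len(body), "bytes", meta)
```

The compressed bytes and the metadata would then be handed to whatever upload call we end up using.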

Caching the data

Still looking into why the same JSON file is being fetched again, even if the data remains the same, and how to change this.
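
One common way to avoid re-fetching unchanged data is a conditional request with an ETag. I have not confirmed this is what we are missing, so treat the following as a sketch of the idea rather than a diagnosis. It is plain Python with no web framework; `make_etag` and `conditional_response` are hypothetical helpers, and the header handling is simplified (strong ETags only, no `*` wildcard).

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the content, so it changes only when the data does."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def conditional_response(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        # "no-cache" lets the browser keep a copy but makes it revalidate first.
        "Cache-Control": "no-cache",
    }
    if if_none_match is not None:
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag in client_tags:
            # The browser's copy is still current: send headers only.
            return 304, headers, b""
    return 200, headers, body


data = b'{"genes": ["Actb", "Gapdh"]}'
status, headers, _ = conditional_response(data, None)                  # first fetch
status_again, _, payload = conditional_response(data, headers["ETag"]) # revalidation
print(status, status_again, len(payload))
```

With this scheme the second request still makes a round trip, but the JSON itself is not sent again unless it changed.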

JobLeonard commented 8 years ago

I think we need to import this to enable gzipping server-side: https://flask-compress.readthedocs.io/en/latest/

JobLeonard commented 8 years ago

Actually, it might make more sense to store both gzipped and non-gzipped data, and send one or the other depending on the request headers.

Benefits:

saves CPU cycles; most browsers accept gzip, so the compressed version will likely be the default response. Recompressing on every request is wasteful.
the server can respond more quickly to fetch requests, since it does not have to spend time gzipping the data on every request
we can gzip with a higher, slower compression setting, because the compressed data will be re-used, so the extra time/cycles will "earn" themselves back, and the smaller files will even increase download speed

I guess this would require some retooling on both loom-pipeline and loom-server to work well together.
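
A rough sketch of the "store both, pick one per request" idea, to show how little logic the server side would need. This is standard-library Python only and not tied to loom-server's actual code; `PrecompressedFile` and `accepts_gzip` are hypothetical names, and the `Accept-Encoding` parsing is deliberately simplified (it looks only for an explicit `gzip` token and its optional `q` value).

```python
import gzip
from typing import Dict, Tuple


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the header lists gzip with a non-zero quality value."""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


class PrecompressedFile:
    """Holds the raw bytes plus a gzip copy that is built once, at the slowest setting."""

    def __init__(self, raw: bytes, content_type: str):
        self.raw = raw
        self.content_type = content_type
        self.gzipped = gzip.compress(raw, compresslevel=9)  # paid once, reused per request

    def respond(self, accept_encoding: str) -> Tuple[Dict[str, str], bytes]:
        """Pick the stored variant matching the request's Accept-Encoding header."""
        headers = {
            "Content-Type": self.content_type,
            "Vary": "Accept-Encoding",  # caches must keep the two variants apart
        }
        if accepts_gzip(accept_encoding):
            headers["Content-Encoding"] = "gzip"
            return headers, self.gzipped
        return headers, self.raw


stored = PrecompressedFile(b'{"values": [0, 0, 0, 0]}' * 500, "application/json")
gz_headers, gz_body = stored.respond("gzip, deflate, br")
plain_headers, plain_body = stored.respond("identity")
print(len(plain_body), "bytes plain,", len(gz_body), "bytes gzipped")
```

In the real setup the gzip copy would presumably be written by loom-pipeline and only read by loom-server, rather than compressed in memory as it is here.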

slinnarsson commented 8 years ago

For now, I've added on-the-fly compression using flask-compress, and verified that it does in fact compress files. Currently, this runs on http://104.197.181.233 only (not in the container).

JobLeonard commented 8 years ago

From 7.5 MB to 1.6 MB, nice!

slinnarsson commented 8 years ago

I think flask-compress gzip is good enough, so I'll close this for now.

JobLeonard commented 8 years ago

Agreed. It doesn't properly cache yet though - something to keep in mind. Then again, perhaps the "offline loom file" approach is more appropriate anyway.

linnarsson-lab / loom-viewer

Enable caching and gzipping when fetching resource #25
