JobLeonard closed this issue 6 years ago
A good example of an existing web app that does offline storage really well is devdocs.io.
It turns out that some caching can be done through AppCache.
https://www.html5rocks.com/en/tutorials/appcache/beginner/
https://alistapart.com/article/application-cache-is-a-douchebag/
Specifically, the static assets: the script and the CSS. This will avoid having to download 900 kB every time the website is refreshed.
AppCache implemented, pretty neat! Saves us about a megabyte each time we refresh.
Now I'll just need to add localForage support for downloaded metadata, so that we can really work offline
(also, the reason I picked this up is because golden-layout needs to use offline storage anyway, so I might as well implement it).
Got the metadata saving implemented, and because we're using localForage it is stored as proper JS objects, which saves us jumping through the hoops of converting most JSON data again (the only things we can't store are functions).
As a result, the Oligos All dataset (200k cells, 30+ MiB metadata) only takes four to six seconds to load once cached (the variation depends on which view - metadata is particularly slow for this dataset). Uncached, that takes fifteen to twenty-five seconds. And remember that this is when serving locally - on a slower connection this will be much worse!
Not having to download this data again every time someone opens the same dataset will also save bandwidth costs, although I don't know if that will be a significant amount.
Implemented gene caching over the weekend, so the basic infrastructure behind this is done.
Refreshing the Oligos Sparkline page with the twenty default genes before caching: 15 seconds. After caching: 3 seconds. It's also 52 megabytes in IndexedDB that we don't download again.
Those are numbers that make me happy ;)
This does suggest that heavy Loom usage can fill up IndexedDB pretty quickly, so we might want to allow manually clearing the cache of a dataset's metadata and/or genes. OTOH, browsers are free to empty IndexedDB if it fills up too much.
Ok, this is low, low priority, but I'm writing this down before I forget all of it again.
This sounds like a pretty perfect fit for our caching needs:
https://developer.mozilla.org/en/docs/Web/API/IndexedDB_API
Generally, the size limit is 1GB per website; plenty for regular usage. Given the simplicity of our scheme, the mentioned localForage is probably perfectly suitable for our needs.
Alternatives: Web Storage, while having a much simpler API, is limited to 10MB, which we'll easily exceed after fetching just a few tiles/genes for one dataset. PouchDB is overkill, since it's an offline DB that can synchronise with an online DB, but we only download from the server, never upload.
Datasets, row/column data and genes
Right now everything is "cached" in JS objects, which last only as long as the tab is open. Caching it in IndexedDB would make this persistent, which is great for, say, long trips with unreliable internet.
Here is an explanation of how someone combined this with redux: http://stackoverflow.com/questions/33992812/how-to-integrate-redux-with-very-large-data-sets-and-indexeddb
Client-side it looks like we can "just" migrate the JS-based caching to localForage (treating that as "cold" cache, while keeping some of it in JS as "hot" cache). Wrap access to it in a bunch of reducer thunks (localForage is async and uses promises) and we're set! Much easier said than done of course, but the principle seems straightforward enough.
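A rough sketch of what the hot/cold split plus a thunk could look like. Everything named here (`AsyncStore`, `MemoryStore`, `TieredCache`, `requestGene`, the `RECEIVE_GENE` action) is made up for illustration; the only thing taken from localForage is that `getItem` and `setItem` return promises, and `MemoryStore` is just a stand-in so the snippet runs outside a browser.

```typescript
// Minimal async key-value interface. localForage's getItem/setItem have this
// shape; MemoryStore is only a stand-in for it (an assumption for the sketch).
interface AsyncStore {
  getItem<T>(key: string): Promise<T | null>;
  setItem<T>(key: string, value: T): Promise<T>;
}

class MemoryStore implements AsyncStore {
  private data = new Map<string, unknown>();
  async getItem<T>(key: string): Promise<T | null> {
    return this.data.has(key) ? (this.data.get(key) as T) : null;
  }
  async setItem<T>(key: string, value: T): Promise<T> {
    this.data.set(key, value);
    return value;
  }
}

type Source = "hot" | "cold" | "network";

// Hypothetical two-tier cache: a Map as "hot" cache, an AsyncStore as "cold".
class TieredCache<T> {
  private hot = new Map<string, T>();
  private cold: AsyncStore;
  private fetchRemote: (key: string) => Promise<T>;

  constructor(cold: AsyncStore, fetchRemote: (key: string) => Promise<T>) {
    this.cold = cold;
    this.fetchRemote = fetchRemote;
  }

  async get(key: string): Promise<{ value: T; source: Source }> {
    // 1. In-memory copy: lives only as long as the tab does.
    const inMemory = this.hot.get(key);
    if (inMemory !== undefined) return { value: inMemory, source: "hot" };

    // 2. Persistent copy: promote it to the hot cache on the way out.
    const persisted = await this.cold.getItem<T>(key);
    if (persisted !== null) {
      this.hot.set(key, persisted);
      return { value: persisted, source: "cold" };
    }

    // 3. Nothing cached: download once and fill both tiers.
    const fetched = await this.fetchRemote(key);
    await this.cold.setItem(key, fetched);
    this.hot.set(key, fetched);
    return { value: fetched, source: "network" };
  }

  // Drop only the in-memory copy, e.g. to mimic reopening the tab.
  evictHot(key: string): void {
    this.hot.delete(key);
  }
}

// Thunk-style action creator: returns an async function that receives dispatch.
type Action = { type: "RECEIVE_GENE"; gene: string; values: number[]; source: Source };
type Dispatch = (action: Action) => void;

function requestGene(cache: TieredCache<number[]>, gene: string) {
  return async (dispatch: Dispatch): Promise<void> => {
    const { value, source } = await cache.get(gene);
    dispatch({ type: "RECEIVE_GENE", gene, values: value, source });
  };
}
```

The reducers never touch the store directly in this shape; they only see the dispatched result, so swapping `MemoryStore` for the real thing should not affect them.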
Exposing which data has been downloaded and cached, and allowing for manually clearing it is probably good too.
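For listing and clearing, something along these lines might be enough. It assumes a key scheme of `<dataset>/metadata` and `<dataset>/gene/<name>` with no `/` in the dataset name, which is purely hypothetical, as are the function names. It relies only on `keys()` and `removeItem()`, both of which localForage provides; `MemoryStore` is again a stand-in so the sketch runs anywhere.

```typescript
// Subset of an async key-value store needed here; MemoryStore is a stand-in.
interface AsyncStore {
  keys(): Promise<string[]>;
  removeItem(key: string): Promise<void>;
  setItem<T>(key: string, value: T): Promise<T>;
}

class MemoryStore implements AsyncStore {
  private data = new Map<string, unknown>();
  async keys(): Promise<string[]> {
    return Array.from(this.data.keys());
  }
  async removeItem(key: string): Promise<void> {
    this.data.delete(key);
  }
  async setItem<T>(key: string, value: T): Promise<T> {
    this.data.set(key, value);
    return value;
  }
}

// Hypothetical key scheme: "<dataset>/metadata" and "<dataset>/gene/<name>".
function datasetOf(key: string): string {
  const slash = key.indexOf("/");
  return slash === -1 ? key : key.slice(0, slash);
}

// Group every cached key by dataset, so a UI can show what is stored.
async function listCached(store: AsyncStore): Promise<Map<string, string[]>> {
  const byDataset = new Map<string, string[]>();
  for (const key of await store.keys()) {
    const dataset = datasetOf(key);
    const entries = byDataset.get(dataset) ?? [];
    entries.push(key);
    byDataset.set(dataset, entries);
  }
  return byDataset;
}

// Remove one dataset's genes, or everything cached for it.
// Resolves to the number of entries removed.
async function clearDataset(
  store: AsyncStore,
  dataset: string,
  genesOnly: boolean = false,
): Promise<number> {
  const prefix = genesOnly ? `${dataset}/gene/` : `${dataset}/`;
  const doomed = (await store.keys()).filter((key) => key.startsWith(prefix));
  await Promise.all(doomed.map((key) => store.removeItem(key)));
  return doomed.length;
}
```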
Server-side we'd need a way to signal that a loom file has been updated (for example, if we fix a bug in the pipeline or implement a better version of backSPIN). Just adding a simple cache-busting hash should do the trick, right?
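One way the hash check could work client-side, as a sketch: store the hash next to the payload and treat a mismatch as a cache miss. How the server computes the hash is left open here; it is treated as an opaque string, and `CachedEntry`, `wrap` and `validate` are hypothetical names.

```typescript
// A cached payload together with the version hash it was downloaded under.
interface CachedEntry<T> {
  version: string; // opaque hash supplied by the server (assumption)
  payload: T;
}

// Wrap freshly downloaded data before writing it to the cache.
function wrap<T>(version: string, payload: T): CachedEntry<T> {
  return { version, payload };
}

// Return the payload if the entry matches the server's current hash,
// or null to signal that the data must be downloaded again.
function validate<T>(entry: CachedEntry<T> | null, currentVersion: string): T | null {
  if (entry === null) return null; // never cached
  if (entry.version !== currentVersion) return null; // loom file changed
  return entry.payload;
}
```

Keeping the hash inside the entry rather than in the key means a re-download overwrites the stale copy in place, instead of leaving orphaned entries behind under the old hash.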
Heatmap
Enhancing the heatmap with IndexedDB is also possible, and a completely separate piece of logic, since the heatmap tiles are not (and should not be) stored in the redux store: https://github.com/tbicr/OfflineMap