Netflix / falcor

A JavaScript library for efficient data fetching
http://netflix.github.io/falcor
Apache License 2.0

Abstract cache interface? #809

Open sfrdmn opened 8 years ago

sfrdmn commented 8 years ago

Possibly a feature request, possibly a misunderstanding!

I'm trying to implement some offline-first functionality in a mobile app and so I'd like data to be retrieved from the cache as much as possible. Let's say for a subset of my data, I can give it an $expires value of 1 and only ever retrieve it from the remote once. This works great, until of course my app is closed and the in-memory cache is wiped. When I restart the app, the cache is empty, and I fetch the value once more from the remote.
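For reference, the expiration described here is just metadata on a JSON Graph value. Below is a sketch of what such values look like in the cache, plus an illustrative staleness check (the `isExpired` helper is hypothetical, not part of Falcor; per the Falcor docs, `$expires: 1` means "never expires", `$expires: 0` means "expires immediately", and other values are timestamps — relative negative values are normalized to absolute timestamps when written to the cache):

```javascript
// Sketch of the $expires convention: in a JSON Graph cache, a leaf value
// can be wrapped in an atom carrying expiration metadata.
const neverExpires = { $type: 'atom', value: 'offline-friendly data', $expires: 1 };
const alwaysStale  = { $type: 'atom', value: 'always refetched',      $expires: 0 };

// Hypothetical helper mimicking the staleness check a cache performs.
// Assumes relative (negative) values were already converted to absolute
// timestamps at write time, so only 0, 1, and absolute timestamps appear.
function isExpired(atom, now = Date.now()) {
  if (atom.$expires === undefined || atom.$expires === 1) return false; // never expires
  if (atom.$expires === 0) return true;                                 // always stale
  return atom.$expires < now;                                           // absolute timestamp
}

console.log(isExpired(neverExpires)); // false
console.log(isExpired(alwaysStale));  // true
```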

I could periodically serialize the cache and persist it, then load it whenever the app boots up. However, this seems a wee bit hacky, like using a CRON job for replication or something. Since Falcor already allows for using Models as DataSources, it smells like there's a cleaner solution in chaining Model -> Model/DataSource [-> Model/DataSource] -> DataSource as a means of implementing a cache hierarchy. This already works if your hierarchy is something like Client -> Kinda close server -> Super far server since the default cache implementation makes sense there. However, if the intermediate DataSource represents a disk on the same device, I don't need another in-memory cache. Its cache should persist to disk incrementally and transparently, with misses proxying to the next DataSource. But currently, there is no abstract cache interface to implement and no means of supplying a Model with a custom cache implementation.

Is this something that's already possible? Planned? Or, does it make sense at all?

I guess this would have pretty hairy repercussions for manual cache invalidation. But $expires should work as before?

EDIT: er, I guess the cache is probably assumed to be a synchronous API, so this idea might not fly. Still, would be interesting to hear the reasoning behind the design decision

trxcllnt commented 8 years ago

@sfrdmn the cache was designed to be serialized (to disk, local storage, redis, etc.) to allow hydrating on startup if you want. You can pass an onChange handler in the props object to your root Model's constructor that will be called after any writes to the cache. Here's an example:

import { Model } from 'falcor';
const localStorageKey = 'my-falcor-app-cache';
const serializeCache = (key, cache) => localStorage.setItem(key, JSON.stringify(cache));
const deserializeCache = (key) => JSON.parse(localStorage.getItem(key) || '{}');
const rootModel = new Model({
  cache: deserializeCache(localStorageKey),
  onChange: function() { // <-- probably want to debounce this function
  if (!rootModel) { return; } // <-- onChange can get called before `rootModel` is bound
    serializeCache(localStorageKey, rootModel.getCache());
  }
});

onChange is a little noisy (a get/set/call operation might invoke onChange multiple times, depending on whether it has to retry, or if a call returns values/invalidations), so you'll probably want to debounce the handler.
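A minimal trailing-edge debounce for that purpose might look like this (a generic sketch, not a Falcor API; the wait time is an arbitrary choice):

```javascript
// Generic trailing-edge debounce: collapses a burst of onChange calls into
// a single invocation once calls have quieted down for `wait` milliseconds.
function debounce(fn, wait) {
  let timer = null;
  return function debounced(...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), wait);
  };
}

// Usage with the onChange handler from the example above:
// onChange: debounce(() => serializeCache(localStorageKey, rootModel.getCache()), 100)
```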

Alternatively, I implemented an onChangesCompleted handler in my fork that is only called after all pending cache operations have finished. My fork also has other benefits (stability improvements, critical bugfixes, better performance, friendlier error messages) that haven't made it, or won't make it, into the official release, so feel free to experiment and contribute there if you have any ideas:

// set "dependencies": { "falcor": "trxcllnt/falcor#onChangesCompleted" } in your package.json
import { Model } from 'falcor';
const localStorageKey = 'my-falcor-app-cache';
const serializeCache = (key, cache) => localStorage.setItem(key, JSON.stringify(cache));
const deserializeCache = (key) => JSON.parse(localStorage.getItem(key) || '{}');
const rootModel = new Model({
  cache: deserializeCache(localStorageKey),
  onChangesCompleted: function() { //    `this` is bound to rootModel
    serializeCache(localStorageKey, this.getCache()); 
  }
});

If you want more granular control over when you serialize the cache, you could implement a DataSource that acts as client-side caching middleware and serializes the cache state as desired:

import { Observable } from 'rx';
import { Model } from 'falcor';
import HttpDataSource from 'falcor-http-datasource'; // exposed as falcor.HttpDataSource in the browser bundle
const localStorageKey = 'my-falcor-app-cache';
const serializeCache = (key, cache) => localStorage.setItem(key, JSON.stringify(cache));
const deserializeCache = (key) => JSON.parse(localStorage.getItem(key) || '{}');

const onlineDatasource = new HttpDataSource('/model.json');
const localRootModel = new Model({ cache: deserializeCache(localStorageKey) });
const remoteRootModel = new Model({
    ...localRootModel, // <-- ensures remoteRootModel will use the same cache as localRootModel
    source: cachingMiddleware(localRootModel, onlineDatasource)
});

function cachingMiddleware(model, datasource) {
    let requestCount = 0, cacheInterval = 10;
    return {
        get(...args) {
            return cacheAfterComplete(datasource.get(...args));
        },
        set(...args) {
            return cacheAfterComplete(datasource.set(...args));
        },
        call(...args) {
            return cacheAfterComplete(datasource.call(...args));
        }
    };
    function cacheAfterComplete(operation) {
        return Observable.create(function subscribe(observer) {
            return operation.subscribe({
                onNext: observer.onNext.bind(observer),
                onError: observer.onError.bind(observer),
                onCompleted() {
                    // Tell the Model the operation is done.
                    // The Model will write the results into the cache synchronously,
                    // so they will be there when we call `getCache` below.
                    observer.onCompleted();
                    // Custom logic -- only serialize after every 10th request
                    if (++requestCount % cacheInterval === 0) {
                        serializeCache(localStorageKey, model.getCache());
                    }
                }
            });
        });
    }
}

sfrdmn commented 8 years ago

@trxcllnt Thanks for the examples!✌️

I guess my only beef with the method is that it feels like extra work and I feel so close to getting the same persistence orthogonally using only the abstractions Falcor already provides. I can see the use of serialization/hydration for perf gains, but it doesn't smell right for implementing the persistence itself.

Basically, I want my application to only think about the JSON Graph. Whether something is cached, where it's cached, where it comes from, etc., is irrelevant. You might say this method fulfills that property, but it would do so only as a supplement to Falcor's "normal" execution. Instead, I could take advantage of Falcor's aggressive caching policy and forget about persistence altogether. There is one JSON Graph as a function of one or more data sources. A subset of the graph may be cached by intermediaries. By abstracting the cache interface, I can create cache hierarchies and implement local persistence incidentally, as a consequence of whatever cache policies I set. I also gain the ability to persist incrementally/atomically and in any arbitrary format.

Right now it seems like this can be accomplished by implementing a data source which composes two other data sources: one representing a cache, the other representing the origin. It would need to implement its own caching logic to proxy cache misses to the origin, etc. This is not super ideal though, since I don't want to reimplement cache invalidation stuff. Still, I might try to code up an example for it.
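The composition described above could be sketched roughly as follows. This is a sketch under simplifying assumptions: real Falcor DataSources return Observables of JSON Graph envelopes, but for brevity these stand-ins return Promises, and the `has`/`merge` methods are hypothetical helpers a persistent cache tier would need to provide (they are not part of the DataSource interface):

```javascript
// Composes two data-source-like tiers: a local cache tier and an origin.
// Cache misses proxy to the origin, and results are written back to the
// cache tier incrementally.
function tieredDataSource(cacheTier, originTier) {
  return {
    async get(paths) {
      if (await cacheTier.has(paths)) {
        return cacheTier.get(paths);                // cache hit: serve locally
      }
      const envelope = await originTier.get(paths); // miss: proxy to origin
      await cacheTier.merge(envelope);              // write back incrementally
      return envelope;
    },
    async set(envelope) {
      // Writes go to the origin; the cache tier is updated on success.
      const result = await originTier.set(envelope);
      await cacheTier.merge(result);
      return result;
    }
  };
}
```

Note this sketch only handles whole-request hits and misses; the hard part the comment above alludes to — partial hits, invalidation, and expiration — is exactly the caching logic such a composite source would have to reimplement.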

I could also imagine, though, that the idea might be out of scope for Falcor. You'd probably need some explicit concept of cache hierarchies, a protocol for communicating invalidation down the hierarchy, etc. Maybe it's something that wants to be a framework on top of Falcor.