subsquid / squid-sdk

The main repo of the squid SDK
GNU General Public License v3.0
1.21k stars 147 forks source link

typeorm: typeorm-store rework #293

Open belopash opened 4 months ago

abernatskiy commented 3 months ago

This introduces a global cache. It's a welcome change that makes custom caches - a major source of bugs and complexity - unnecessary, but IMO it requires an interface rework.

The key here is that a cache hit is so much cheaper than a cache miss that the user must be constantly aware of the cache contents to get good performance. The current interface is an attempt to make working with cache transparent, which makes it very easy to write code that performs poorly.

Here are some examples:

  1. The updated store exposes the following methods:

    ...
    get(e: EntityType, id: string | options)
    find(e: EntityType, options)
    findBy(e: EntityType, options)
    ...

    get() checks the cache, while find*() methods do not and are always expensive to call. The only place where the user might learn that is documentation/docstrings, and the cost for using the wrong method is high.

  2. get() calls can be expensive too, and it's difficult to predict that now. Sometimes that can be deduced from the call: e.g. a get(Entity, {relations: {someOneToManyField: true}} is always a cache miss, and therefore expensive. But most cache misses are due to just not knowing what's in the cache right now.

  3. The store never warns the user about cache misses. It's also very easy to write a batch processor where every get() is a cache miss. This will perform as bad as a subgraph, and do so silently.

Recommendations:


Other notes:

  1. flush() and reset() methods are named in a counterintuitive fashion: flush() in a caching module is not a cache flush, but a DB-related operation. reset() does what 99% of users will expect flush() to do.

  2. The new store behavior settings are hard to interpret. Would be nice to simplify them somewhat.