Closed: michaelbromley closed this issue 4 weeks ago
We should add support for cache tags. In many scenarios you want to delete items in the cache for a certain namespace, e.g. delete all cached values for a product or a zone. Pimcore has a neat implementation from which we can draw inspiration: https://pimcore.com/docs/platform/Pimcore/Development_Tools_and_Details/Cache/#overview-of-functionalities
I would also recommend that we take a look at Symfony's caching architecture, as it is really sophisticated: https://symfony.com/doc/current/components/cache.html#generic-caching-psr-6
> Tags are a mechanism of grouping cache items in order to make it possible to invalidate all items based on tags.

https://symfony.com/doc/current/components/cache/cache_invalidation.html#using-cache-tags
```php
// invalidate all items related to `tag_1` or `tag_3`
$cache->invalidateTags(['tag_1', 'tag_3']);

// if you know the cache key, you can also delete the item directly
$cache->delete('cache_key');
```
In the Symfony (& PSR-6 in general) implementation, cache items are wrapped in a `CacheItem` class, which also allows tags to be set on the item:
```php
// add one or more tags
$item->tag('tag_1');
$item->tag(['tag_2', 'tag_3']);
```
https://laravel.com/docs/11.x/cache

Laravel had a tags implementation, but it was recently removed (at least from the documentation). It looks like their use of tags was badly designed: you could only invalidate by tags when the array of tags exactly matched (see their explanation).
https://www.drupal.org/docs/drupal-apis/cache-api/cache-tags
Any cache backend should implement `CacheBackendInterface`, so when you set a cache item with the `::set()` method, you provide third and fourth arguments, e.g.:
```php
$cache_backend->set(
  $cid, $data, Cache::PERMANENT, ['node:5', 'user:7']
);
```
This stores a cache item with ID $cid permanently (i.e., stored indefinitely), but makes it susceptible to invalidation through either the node:5 or user:7 cache tags.
A package from Max Stoiber implements a very simple (single-file) Redis cache with tags, which we can use as inspiration for our Redis version:
https://www.npmjs.com/package/redis-tag-cache
It implements the solution given in this SO answer, using a separate set of keys for each tag and then `SMEMBERS` to look them up: https://stackoverflow.com/a/40649819/772859
The consensus design is that any cache item can be tagged with one or more string tags. Later, you can invalidate by tag, and all entries carrying that tag will be invalidated.

We need three concrete implementations: in-memory, Redis, and database.
The common structure for tags will be a separate data structure that maps each tag to the list of keys carrying that tag.
For the in-memory store, this can be a `Map<string, Set<string>>` - a map with the tag as the key, and a set of the corresponding cache keys as the value.
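As a hypothetical sketch of that structure (the function names here are mine, purely for illustration):

```typescript
// The cache itself: key -> cached value.
const cache = new Map<string, unknown>();
// Tag index: tag -> set of cache keys carrying that tag.
const tagIndex = new Map<string, Set<string>>();

function setCached(key: string, value: unknown, tags: string[] = []): void {
  cache.set(key, value);
  for (const tag of tags) {
    let keys = tagIndex.get(tag);
    if (!keys) {
      keys = new Set<string>();
      tagIndex.set(tag, keys);
    }
    keys.add(key);
  }
}

// Invalidating a tag deletes every cache entry in its key set,
// then drops the tag entry itself.
function invalidateTags(tags: string[]): void {
  for (const tag of tags) {
    for (const key of tagIndex.get(tag) ?? []) {
      cache.delete(key);
    }
    tagIndex.delete(tag);
  }
}
```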
The Redis implementation is similar, and can be seen in the redis-tag-cache package above.
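Sketching that Redis approach in TypeScript: the `RedisLike` interface below models only the Redis commands the scheme needs (SET/GET/DEL/SADD/SMEMBERS), and the `FakeRedis` in-memory stand-in exists only so the sketch is self-contained - a real implementation would use a client such as ioredis:

```typescript
// Minimal subset of Redis commands needed for tag-based invalidation.
interface RedisLike {
  set(key: string, value: string): Promise<void>;
  get(key: string): Promise<string | undefined>;
  del(...keys: string[]): Promise<void>;
  sadd(key: string, ...members: string[]): Promise<void>;
  smembers(key: string): Promise<string[]>;
}

// Store a value and register its key in a per-tag set.
async function setWithTags(
  redis: RedisLike,
  key: string,
  value: string,
  tags: string[],
): Promise<void> {
  await redis.set(key, value);
  for (const tag of tags) {
    await redis.sadd(`tag:${tag}`, key);
  }
}

// Invalidate a tag: SMEMBERS the tag's set, DEL every key in it,
// then DEL the set itself.
async function invalidateTag(redis: RedisLike, tag: string): Promise<void> {
  const keys = await redis.smembers(`tag:${tag}`);
  if (keys.length > 0) {
    await redis.del(...keys);
  }
  await redis.del(`tag:${tag}`);
}

// In-memory stand-in so this sketch runs without a Redis server.
class FakeRedis implements RedisLike {
  private kv = new Map<string, string>();
  private sets = new Map<string, Set<string>>();
  async set(key: string, value: string): Promise<void> {
    this.kv.set(key, value);
  }
  async get(key: string): Promise<string | undefined> {
    return this.kv.get(key);
  }
  async del(...keys: string[]): Promise<void> {
    for (const k of keys) {
      this.kv.delete(k);
      this.sets.delete(k);
    }
  }
  async sadd(key: string, ...members: string[]): Promise<void> {
    const s = this.sets.get(key) ?? new Set<string>();
    for (const m of members) s.add(m);
    this.sets.set(key, s);
  }
  async smembers(key: string): Promise<string[]> {
    return [...(this.sets.get(key) ?? [])];
  }
}
```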
For the database store, we would need a separate table to store entries associating a tag with a single cache key:
```sql
CREATE TABLE cache_tags (
  id SERIAL PRIMARY KEY,
  tag VARCHAR(255) NOT NULL,        -- Tag name
  cache_key VARCHAR(255) NOT NULL,  -- Corresponding cache key
  FOREIGN KEY (cache_key) REFERENCES cache_items (cache_key) ON DELETE CASCADE
);
```
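With that table, invalidating by tag becomes a subquery against `cache_tags`. A hypothetical sketch (the `Query` executor type is assumed, not an existing API; table and column names follow the DDL above, with Postgres-style `$n` placeholders):

```typescript
// Assumed shape of a parameterized query executor.
type Query = (sql: string, params: unknown[]) => Promise<void>;

// Delete all cache items carrying any of the given tags. The
// ON DELETE CASCADE on cache_tags then removes the tag rows too.
async function invalidateByTags(query: Query, tags: string[]): Promise<void> {
  if (tags.length === 0) {
    return;
  }
  const placeholders = tags.map((_, i) => `$${i + 1}`).join(', ');
  await query(
    `DELETE FROM cache_items
     WHERE cache_key IN (
       SELECT cache_key FROM cache_tags WHERE tag IN (${placeholders})
     )`,
    tags,
  );
}
```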
The Problem
There are several places where Vendure core makes use of caching in order to significantly improve performance:
The issue with those solutions is that they are in-memory-only, and therefore local to the specific server/worker instance.
Being in-memory has two major downsides:

1. Each instance must build up and hold its own copy of the cache, duplicating work and memory.
2. A value invalidated on one instance remains stale on all the others.

This second point is the reason we use TTLs with relatively short durations - again, leading to more work being done on the database.
It is also the reason we do not use caching in other scenarios that could radically improve performance: we cannot reliably invalidate a cache that is not shared by all instances.
Example
This issue was originally motivated by an investigation I am conducting into the performance of the order-related mutations. Using a prototype of this caching approach, I was able to speed up my benchmark by ~2.5x and cut the p(95) response time from 6.98s to 3s.
Proposed Solution
I propose introducing a shared caching mechanism into the core: a `CacheStrategy`. This will be strategy-based, allowing you to decide where the cache is stored - for example, in memory, in the database, or in Redis.

The `CacheStrategy` would replace all existing caching mechanisms mentioned above, and would unlock the opportunity to make huge performance gains in currently slow areas. Because the cache is shared, as soon as one instance has cached a value, it will be available to all instances.
Design
At the most basic, the `CacheStrategy` will implement the typical cache methods: `get()`, `add()`, `delete()`. It should also support key eviction via TTL, which would be configurable per key. It should be able to store JSON-like data, i.e. any serializable JS data structure, just as we already support with the job queue.

Here's a sketch of how it would look:
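Assuming only the methods named above plus tag-based invalidation, a minimal interface might look like this (the `JsonCompatible` type, the options object, and the `invalidateTags()` method are illustrative additions, not a settled API):

```typescript
// Any serializable JS data structure, as already supported by the job queue.
type JsonCompatible =
  | string
  | number
  | boolean
  | null
  | JsonCompatible[]
  | { [key: string]: JsonCompatible };

interface SetCacheKeyOptions {
  /** Time-to-live in milliseconds, configurable per key. */
  ttl?: number;
  /** Tags which can later be used to invalidate this entry. */
  tags?: string[];
}

interface CacheStrategy {
  get(key: string): Promise<JsonCompatible | undefined>;
  add(key: string, value: JsonCompatible, options?: SetCacheKeyOptions): Promise<void>;
  delete(key: string): Promise<void>;
  invalidateTags(tags: string[]): Promise<void>;
}
```

Each concrete implementation (in-memory, SQL, Redis) would then implement this interface over its own storage and tag index.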
Backward Compatibility
The implementation of
CacheStrategy
needs to be done in a backward-compatible way, so no changes are needed by the user when upgrading.createSelfRefreshingCache()
function andTtlCache
class would be deprecated, and internally their usage would be replaced withCacheStrategy
InMemoryCacheStrategy
which duplicates the current behaviour.SqlCacheStrategy
which stores the cache in a key-value table in the database, using a JSON sql type for the value.Summary
This proposal has the following benefits: