[Argus] Improves 'HEAD /assets/{id}' requests latency by implementing caching QN requests

zeeshanakram3 commented 1 year ago

addresses #4814

This PR implements a max-age based cache strategy for the StorageDataObject QN entity. It applies when the corresponding data object asset is missing on the distributor-node, so a QN request needs to be made.

So whenever a HEAD /assets/{id} request is made, it can be satisfied in one of the following ways:

If the object for which a QN request is supposed to be made does not exist in Apollo's in-memory cache, a network call is made to fetch the object and update the cache.
If the object exists in the in-memory cache and is older than the CACHED_OBJECT_MAX_AGE value, a network call is still made to fetch the object and update the cache.
If the object exists in the in-memory cache and is NOT older than CACHED_OBJECT_MAX_AGE, the object is served from the cache.
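
The three cases above can be sketched as a small standalone cache. This is only an illustration of the max-age idea, not the code in this PR: `MaxAgeCache`, `decide`, `fetchFromQueryNode` and the 30-second value are hypothetical names and numbers, and the real implementation sits on top of Apollo's in-memory cache rather than a `Map`.

```typescript
// Illustrative sketch only; all names here are hypothetical, not taken from distributor-node.
type CacheEntry<T> = { value: T; storedAt: number }
type CacheDecision = 'fetch-missing' | 'fetch-stale' | 'serve-cached'

// Stand-in for CACHED_OBJECT_MAX_AGE; the real value and unit are not stated in this thread.
const CACHED_OBJECT_MAX_AGE_MS = 30_000

// Pure decision function covering the three cases: missing, too old, fresh.
function decide<T>(entry: CacheEntry<T> | undefined, now: number, maxAgeMs: number): CacheDecision {
  if (entry === undefined) return 'fetch-missing'
  if (now - entry.storedAt > maxAgeMs) return 'fetch-stale'
  return 'serve-cached'
}

class MaxAgeCache<T> {
  private entries = new Map<string, CacheEntry<T>>()
  private maxAgeMs: number
  private now: () => number

  // The clock is injectable so the expiry logic can be exercised without waiting.
  constructor(maxAgeMs: number = CACHED_OBJECT_MAX_AGE_MS, now: () => number = () => Date.now()) {
    this.maxAgeMs = maxAgeMs
    this.now = now
  }

  async get(id: string, fetchFromQueryNode: (id: string) => Promise<T>): Promise<T> {
    const entry = this.entries.get(id)
    if (entry !== undefined && decide(entry, this.now(), this.maxAgeMs) === 'serve-cached') {
      return entry.value
    }
    // Missing or stale: make the network call and refresh the cache entry.
    const value = await fetchFromQueryNode(id)
    this.entries.set(id, { value, storedAt: this.now() })
    return value
  }
}
```

Keeping `decide` separate from the cache makes the three outcomes easy to check in isolation; if the fetch fails for a stale entry, the error propagates to the caller instead of falling back to the old value.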

mnaamani commented 1 year ago

I have to say I was surprised that there is no built-in cache expiry config, and I presume that is why you had to add a way to track the age of cached objects.

@zeeshanakram3 Found a third party cache implementation for apollo client, is it something we can use to make our code a bit cleaner? https://github.com/NerdWalletOSS/apollo-cache-policies

zeeshanakram3 commented 1 year ago

Thanks for sharing, will test it.

zeeshanakram3 commented 1 year ago

I reimplemented the caching using the apollo-cache-policies npm package and tested it (using OpenTelemetry traces); the TTL seems to be working as expected.
Also made the TTL configurable using interval.queryNodeCacheTTL.
Bumped the package version and added CHANGELOG.md.
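
As a rough sketch of how a configured TTL could be turned into a per-type policy: apollo-cache-policies documents an `invalidationPolicies` option with `timeToLive` values in milliseconds, either global or per type name. The interfaces and the helper below are hypothetical, and the assumption that `queryNodeCacheTTL` is expressed in seconds is mine, so check it against the actual config schema.

```typescript
// Hypothetical shape of the relevant config section; not copied from distributor-node.
interface IntervalConfig {
  queryNodeCacheTTL: number // ASSUMPTION: expressed in seconds
}

// Minimal local mirror of the `invalidationPolicies` option described in the
// apollo-cache-policies README (timeToLive is in milliseconds).
interface InvalidationPolicies {
  timeToLive?: number
  types?: Record<string, { timeToLive: number }>
}

// Hypothetical helper: apply the configured TTL to the StorageDataObject type only.
function buildInvalidationPolicies(interval: IntervalConfig): InvalidationPolicies {
  return {
    types: {
      StorageDataObject: { timeToLive: interval.queryNodeCacheTTL * 1000 },
    },
  }
}

// The result would be passed as `invalidationPolicies` when constructing the
// package's InvalidationPolicyCache, in place of Apollo's plain InMemoryCache.
const policies = buildInvalidationPolicies({ queryNodeCacheTTL: 60 })
console.log(JSON.stringify(policies))
```

Scoping the TTL to one type keeps other cached entities unaffected; a top-level `timeToLive` would instead apply to every type in the cache.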

mnaamani commented 1 year ago

Works as expected. To test I commented out the code that checks for file in local fs before trying query:

  // distributor-node/src/services/content/ContentService.ts
  public async objectStatus(
    objectId: string,
    qnFetchPolicy: QueryFetchPolicy = 'no-cache'
  ): Promise<ObjectStatus> {
    /*
    const pendingDownload = this.stateCache.getPendingDownload(objectId)

    if (!pendingDownload && this.exists(objectId)) {
      return { type: ObjectStatusType.Available, path: this.path(objectId) }
    }

    if (pendingDownload) {
      return { type: ObjectStatusType.PendingDownload, pendingDownload }
    }
    */
    const objectInfo = await this.networking.dataObjectInfo(objectId, qnFetchPolicy)
    ...

Then I tested (crudely):

# first request, ~400ms
time http HEAD http://localhost:3334/api/v1/assets/1
docker stop graphql-server
# confirm we get a cached response, ~200ms
time http HEAD http://localhost:3334/api/v1/assets/1
# ... try a few more times, until we get an error response because the cached entry expired and the distributor tried to query the QN
# restart the query node
docker start graphql-server
time http HEAD http://localhost:3334/api/v1/assets/1
# successful request, as expected

Tried with different TTL values also.

Joystream / joystream

[Argus] Improves 'HEAD /assets/{id}' requests latency by implementing caching QN requests #4834