Cache result interval differs from documentation

natm commented 1 year ago

Describe the bug Cached results appear to be generated every 30 minutes instead of 15.

To Reproduce The API documentation states:

Cached Responses
Any request that does not require lookups will be served a cached result. Cache is updated approximately every 15 minutes.

You can spot cached responses by checking for the "generated" property inside the "meta" object.

When examining the generated epoch timestamp for each aspect, these appear to being produced every 30 minutes and not 15.

Screenshot 2023-01-11 at 14 24 01

Expected behavior Cached results for non filtered queries with a depth of 0 are produced every 15 minutes.

Additional context We have multiple internal apps and services which use the PeeringDB API interactively, all of these query networks and organisations by ID. As the usage of these applications increase we find ourselves regularly hitting the API throttling limits. To provide a better experience we have stood up an in-memory PeeringDB mirror which uses the cached full dataset, once internal apps are migrated to this we will be making a fixed amout of requests to the API in the given period.

Having the cached datasets produced every 15 minutes would improve the experience when working with this data, for example when asking prospective peers to make sure their PeeringDB data is upto date before we turn up IX sessions.

Thanks.

grizz commented 1 year ago

@natm odd, we can and will certainly check on that.

The cached datasets are for downloading the whole database. You local mirror would be better served by using the ?since= parameter to only download changes since you last queried, which easily and efficiently supports syncing every couple minutes.

The python client will do that for you with peeringdb sync as well if your in memory database something supported by Django.

Let us know if you would like any help getting that working. I think the docs are pretty good. but they certainly could need improvement.

natm commented 1 year ago

The cached datasets are for downloading the whole database. You local mirror would be better served by using the ?since= parameter to only download changes since you last queried, which easily and efficiently supports syncing every couple minutes.

Yep, I'm aware of the since param, we push the whole dataset it into Hollow and then let it handle deltas which are distributed to subscribing nodes/apps, each app can be notified if particular fields have changed as well as row.

Other items that may improve integration for others:

A mechanism for API consumers to be aware the behaviour/schema may have changed; including your release version as a HTTP header or in the meta JSON blob.
Exposing API auth status, e.g. authenticated vs unauthenticated via a HTTP header / meta blob. Consumers are blind to auth state at the moment, I know it throws a JSON failure for an incorrect key, but not if the header is sent wrong.
In the OpenAPI docs many of the boolean fields are listed as string.

Thanks for looking at the generation interval.

grizz commented 1 year ago

API cache generation is taking ~25 minutes now. We're currently looking at improvements to this with #1065

We'll see if we can get something updated for this months release.

peeringdb / peeringdb

Cache result interval differs from documentation #1301