CFC-Servers / gm_express

An unlimited, out-of-band, bi-directional networking library for Garry's Mod
https://gmod.express
GNU General Public License v3.0
68 stars 4 forks source link

Recipients randomly receiving 404's on GET #33

Open brandonsturgeon opened 1 year ago

brandonsturgeon commented 1 year ago

This is a really annoying issue that, according to Cloudflare's docs, shouldn't be happening.

The Problem

Cloudflare KV docs state:

When you write to KV, your data is written to central data stores. It is not sent automatically to every location’s cache

Initial reads from a location do not have a cached value. The data must be read from the nearest central data store, resulting in a slower response.

So what they're saying here is that KV achieves low latency by caching KV lookups. On the first read from a given location, it'll get a cache MISS and have to traverse the Cloudflare nodes to find the actual value, and then it'll cache that value to the location where the lookup occurred.

This is fine. Good, even - we can deal with that added latency for a single client.

A quick refresher on how Express works: Our current flow is something like this:

  1. Sender uploads their data to Cloudflare
  2. Express Service generates a UUID and uses it as the key for the Data while saving to KV
  3. Express Service replies with 200 and the UUID
  4. Sender sends a net message to the recipient containing the UUID
  5. Recipient receives the net message, reads the UUID, and makes a GET to Express Service for that UUID
  6. Express Service asks KV for that UUID, returns the Data to the Recipient

However, what we're actually seeing, is the Recipient asks Express Service for the UUID, and gets a 404!

How does that work? It's not possible for it to actually be a 404 at this point because the data definitely exists in KV. We wouldn't have the UUID unless it were already stored!

The worst part about this bug is that this 404 is actually cached for that location! Meaning anyone else trying to read the Data for that UUID just gets a cached 404 🤦‍♂️


The Solutions

Assuming this isn't some obscure bug with the Express Service (I suppose it's possible), there are a few different ways to tackle this.

1. Have clients retry for the data _(Proposed in https://github.com/CFC-Servers/gm_express/pull/34)_ We could just make the client ask for the data over and over until the cache busts? 😅

2. Have the sender wait a brief moment before sending the net message with the data's UUID _(Proposed in https://github.com/CFC-Servers/gm_express/pull/34)_ If it really was a timing issue, perhaps delaying the net message by even a few fractions of a second could increase the chance of a successful first GET. This would be a very minor delay, but it would increase the minimum time of every message by that much.

3. Create a comprehensive cross-region, bare-bones example demonstrating this as a Cloudflare bug and ask Cloudflare to fix it This should probably be done, if at least just to confirm that the bug is Cloudflare's like I'm describing it... But man, that's a lot of work.

brandonsturgeon commented 1 year ago

As an extension to this issue, it seems that Express isn't managing its cache correctly.

I notice on my long-running server, even with retries and send delay, sometimes data that is cached serverside isn't able to be downloaded. This means that the addon isn't expiring its cache as quickly as the items actually expire on the Express Service.

In my testing, the Retries+Send Delay seems to work as expected. I'll fix this caching bug and then tag a new release.

brandonsturgeon commented 1 year ago

This problem appears to have subsided with the merge of #34 👍