Closed divergentdave closed 5 months ago
FWIW, I saw the following error a few times after testing a large flux of instances in and out of a deployment. This can happen due to a race condition when one instance deletes a stale entry in the window between when another instance lists and reads the entry. This error message is still useful for identifying real problems with a storage backend, so I think the best option for now is to leave it as-is and accept the log noise. Depending on the storage backend, the ideal way to fix the noise would be to list and read the entries in one transaction, or to skip listing up-front, and iterate over all entries using some sort of cursor. That will have to wait for a v2 of the Storage interface.
{
"level": "ERROR",
"timestamp": "2024-06-12T15:23:38.418344685Z",
"logger": "http.handlers.rate_limit",
"msg": "unable to load distributed rate limiter state",
"key": "rate_limit/instances/7303b16e-44dd-4dcf-b4dd-d6134d2ba788.rlstate",
"error": "file does not exist"
}
This adds a new optional configuration parameter, to allow deleting stale distributed rate limiter state. Closes #50.
I have manually tested this locally, next I plan to add unit testing.