mailgun / groupcache

Clone of golang/groupcache with TTL and Item Removal support
Apache License 2.0

HTTP Error Handling Problem #59

Open thrawn01 opened 1 year ago

thrawn01 commented 1 year ago

Proposal

When a requested value is not found in the local or hot cache within groupcache, the system performs a consistent hash on the key to determine which instance owns the requested value. If it determines that the value is owned by the local instance, it calls the GetterFunc() to retrieve the value. If GetterFunc() returns an error during the local call, groupcache propagates that error to the caller of group.Get(), allowing the caller to handle the error appropriately.

However, if groupcache determines that the value is owned by a remote instance, it makes an HTTP call to that instance, which invokes GetterFunc() there. If GetterFunc() returns an error during the remote call, the remote instance responds with http.StatusInternalServerError. When this happens, the calling instance logs the error (if a logger is set) and falls back to calling GetterFunc() locally in an attempt to retrieve the value.
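The current remote-error path described above can be sketched roughly as follows. This is a simplified model, not groupcache's actual code: `getter`, `remoteGet`, and `get` are hypothetical stand-ins, and the HTTP hop is replaced by a plain function call. It only illustrates that a failing getter runs twice and that the remote error text never reaches the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// calls counts how many times the hypothetical getter runs.
var calls int

// getter stands in for a user-supplied GetterFunc that always fails.
func getter(key string) (string, error) {
	calls++
	return "", errors.New("database unavailable")
}

// remoteGet stands in for the HTTP call to the owning peer. The peer runs
// the getter, and any failure is collapsed into a generic 500-style error.
func remoteGet(key string) (string, error) {
	v, err := getter(key)
	if err != nil {
		return "", errors.New("500 Internal Server Error")
	}
	return v, nil
}

// get models the calling instance: on a remote error it falls back to
// calling the getter locally, as described in the issue.
func get(key string) (string, error) {
	v, err := remoteGet(key)
	if err == nil {
		return v, nil
	}
	// The remote error would be logged here, then discarded.
	return getter(key)
}

func main() {
	_, err := get("some-key")
	fmt.Println("getter calls:", calls, "error:", err)
}
```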

This situation is suboptimal for a few reasons:

  1. In the case where the GetterFunc returns a not found error during a remote HTTP call, it makes no sense for the calling instance to fall back to calling GetterFunc locally, which will likely also return not found, especially if the GetterFunc retrieves the value from a common database.
  2. The actual error is lost when making remote calls.
  3. If any remote call has an error, then a local call will be made as a result. This could result in duplicate GetterFunc calls, exacerbating the underlying problem which caused the error.

Solution

groupcache should add an ErrNotFound error to the library. Implementors of GetterFunc can return this error to indicate that the requested value was not found. Remote calls via HTTP will reflect this error by returning http.StatusNotFound.
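A minimal sketch of how a getter might signal "not found" under this proposal. `ErrNotFound` is declared locally here so the example compiles on its own, and `getUser` and the `users` map are hypothetical. The getter shape is simplified: it returns the value directly rather than filling a Sink.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a local stand-in for the sentinel error proposed above.
var ErrNotFound = errors.New("groupcache: value not found")

// users is a hypothetical backing store standing in for a shared database.
var users = map[string]string{"user:1": "alice"}

// getUser is a simplified, hypothetical getter. It returns ErrNotFound when
// the key does not exist, so callers can tell "missing" apart from "failed".
func getUser(key string) (string, error) {
	v, ok := users[key]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

func main() {
	if _, err := getUser("user:2"); errors.Is(err, ErrNotFound) {
		fmt.Println("user:2 does not exist; a local retry would not help")
	}
}
```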

All other errors returned by GetterFunc will be reported over HTTP with http.StatusServiceUnavailable instead of the current http.StatusInternalServerError. This differentiates between an internal error (something is wrong with groupcache internals) and an error returned by the GetterFunc.
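On the serving side, the mapping from getter errors to status codes could look something like this. `statusForGetterError` is a hypothetical helper name and `ErrNotFound` is again declared locally; only the three status codes come from the proposal.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrNotFound is a local stand-in for the proposed sentinel error.
var ErrNotFound = errors.New("groupcache: value not found")

// statusForGetterError is a hypothetical helper that picks the HTTP status
// the remote instance would send back for a given GetterFunc result.
func statusForGetterError(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrNotFound):
		return http.StatusNotFound
	default:
		// Any other getter error; 500 stays reserved for groupcache internals.
		return http.StatusServiceUnavailable
	}
}

func main() {
	fmt.Println(statusForGetterError(nil))
	fmt.Println(statusForGetterError(ErrNotFound))
	fmt.Println(statusForGetterError(errors.New("db timeout")))
}
```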

The groupcache instance that made the HTTP call will look for these codes and, instead of making a local GetterFunc call, will propagate the error back to the caller as either ErrNotFound or ErrRemoteCall, the latter containing the string value of the error returned by GetterFunc.
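The calling side might then translate the response roughly as below. This is a sketch under assumptions: `errorFromResponse` is a hypothetical helper, the `ErrRemoteCall` struct shape is a guess at what "contains the string value" could mean, and keeping the local fallback for every other status code is an assumption the proposal does not spell out.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrNotFound is a local stand-in for the proposed sentinel error.
var ErrNotFound = errors.New("groupcache: value not found")

// ErrRemoteCall carries the text of the error returned by the remote
// GetterFunc. The struct layout is an assumption made for this sketch.
type ErrRemoteCall struct{ Msg string }

func (e *ErrRemoteCall) Error() string { return e.Msg }

// errorFromResponse is a hypothetical helper. It reports whether the caller
// should still fall back to a local GetterFunc call, and the error to use.
func errorFromResponse(status int, body string) (fallback bool, err error) {
	switch status {
	case http.StatusOK:
		return false, nil
	case http.StatusNotFound:
		return false, ErrNotFound
	case http.StatusServiceUnavailable:
		return false, &ErrRemoteCall{Msg: body}
	default:
		// Assumed: anything else keeps the existing fallback behavior.
		return true, fmt.Errorf("unexpected status %d: %s", status, body)
	}
}

func main() {
	fallback, err := errorFromResponse(http.StatusServiceUnavailable, "db timeout")
	var remote *ErrRemoteCall
	if errors.As(err, &remote) {
		fmt.Println("remote getter failed:", remote.Msg, "fallback:", fallback)
	}
}
```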