Implement an endpoint request filtering algorithm based on a per-client retry budget. The algorithm is described in Site Reliability Engineering (Beyer et al.). The retry budget is based on a distributed HTTP response code time series (Redis). An error ratio is computed using a rolling window approach, in which status codes are continuously added and periodically removed based on a configurable TTL.
Currently, only the endpoint HTTP status codes 500 and 503 are considered errors (in the future, the error status codes might become configurable, too).
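The rolling window idea can be illustrated with a minimal sketch. It is not the code of this PR: an in-memory deque stands in for the Redis-backed time series, and the names `ResponseCodeWindow`, `window_size`, `ttl`, `gc` and `error_ratio` are hypothetical.

```python
import time
from collections import deque

# Status codes counted as errors, as stated in the description above.
ERROR_CODES = frozenset({500, 503})


class ResponseCodeWindow:
    """In-memory stand-in (assumption) for the Redis response code time series."""

    def __init__(self, window_size=10, ttl=60.0):
        self.ttl = ttl
        # deque(maxlen=...) discards the oldest entry once the window is full.
        self._entries = deque(maxlen=window_size)

    def append(self, status_code, now=None):
        """Record a status code together with its timestamp."""
        now = time.time() if now is None else now
        self._entries.append((now, status_code))

    def gc(self, now=None):
        """Remove status codes that are older than the TTL."""
        now = time.time() if now is None else now
        while self._entries and now - self._entries[0][0] > self.ttl:
            self._entries.popleft()

    def error_ratio(self, now=None):
        """Return the share of error codes among the codes still in the window."""
        self.gc(now)
        if not self._entries:
            return 0.0
        errors = sum(1 for _, code in self._entries if code in ERROR_CODES)
        return errors / len(self._entries)
```

Once all error codes have expired, the ratio drops back to zero, which is what allows requests to be forwarded to the endpoint again.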
Features and Changes:
HTTP response code time series with a URL netloc:path granularity (e.g. netloc=eida.ethz.ch:80, path=/fdsnws/station/1/query)
Configurable rolling window size
Configurable TTL for status codes within the response code time series; the TTL defines when requests are forwarded to endpoints again
Configurable per-client retry budget
Note: This PR implements a per-client retry budget only. For the time being, a per-(endpoint-)request retry budget is not used.
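How the netloc:path granularity and the per-client retry budget could work together is sketched below. This is an illustration under assumptions, not the implementation of this PR: the function names `series_key` and `filter_endpoints`, the tuple key, and the interpretation of the retry budget as an upper bound on the error ratio are all hypothetical.

```python
from urllib.parse import urlsplit


def series_key(url):
    """Build a (netloc, path) key for an endpoint URL; the query is ignored."""
    parts = urlsplit(url)
    return (parts.netloc, parts.path)


def filter_endpoints(urls, error_ratios, retry_budget=0.1):
    """Keep only URLs whose error ratio does not exceed the retry budget.

    error_ratios maps (netloc, path) keys to error ratios in [0, 1];
    endpoints without an entry are treated as healthy (assumption).
    """
    return [
        url
        for url in urls
        if error_ratios.get(series_key(url), 0.0) <= retry_budget
    ]
```

With this keying, two requests that differ only in their query parameters share one time series, so an endpoint that keeps failing is dropped for all queries sent to that path.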
Drop requests to data centers (DC) that are not available.