Penalty system for bad responses

The base idea is to penalize requests which fail, for example time outs or api limits. A "scoring" would be implemented to decide on how much to penalize requests which fail, depending on the severity of the fail.

As a terminology for that behaviour [of throttling requests], i will refeer to it as a "Cool Down" from this point on, in this document.

My suggestions/Examples of bad requests/responses:

API limits (like for example on BTN, RARBG and Newznab)
HTTP code 429 (A sub of the above, as that's what many newznab indexers return)
Time-Outs
Bad HTTP status codes (All of the 5xx series, as they are server "issues", and 403 [which should be a huge Cool Down])
Unreacheable adress (can be user connectivity issues)

The Scoring system

Each of the previous bad requests has a penalty assigned depending on their severity, an api limit hit/HTTP 429 are more severe than a time out.

Each "domain" starts with a base score of let's say 0 ,after each bad request (score which gets reset every set time, every day? week? [needs suggestions]) they get the penalty score value added to it's score.

If a domain has a consequetive bad response, it would end having double the penalties score added (in example: a time outs score is, lets say, 5, If a "domain" hits a time out it gets a score of 5 added to it's score, if it then gets another time out it would then get a score of 10 added to it's own value.

What Needs/Would Benefit this?

Newznab indexers (more notably, api limits)
API torrent sites (same as previous one, api limits)
Torrent sites (Time Outs, 5xx codes, etc)
XEM (if requests fail, it keeps hammering in which is no bueno)
Show Indexers? Don't know if we want to do that in a short time, may be a too big of a task

I suggest starting with Newznab and maybe torrent sites or just api torrent sites.

Retry-After

Respect the Retry-After HTTP header, doesn't need much more of a description.

https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.37

@labrys this is something you wanted in medusa for a long time, and i know you had good ideas. Feel free to edit this post (same goes for everyone else) and make comments, of course in a constructive manner.

@p0psicles as talked on IRC, please make questions if you don't understand anything :)

@pymedusa/developers

IMO the best way to implement this is with a combination of session options and a CBP (circuit breaker pattern). The session options would be things like header inspection of attributes such as `X-RateLimit-*'.

For example my TMDB session I wrote for my db validation scripts has 3 rate limiting functions I tried out, each with their own advantages and disadvantages.

The first throttles requests to maintain an average rate of just under 4 requests / second.
The second does no throttling until you reach your rate limit, at which point it blocks until the reset time ('X-RateLimit-Reset').
The third combines the two methods and doesn't throttle until you have used a certain percentage of your rate limit during the current period, then it throttles heavily to reduce requests until the rate limit expires.

The advantage of the first is that you never exceed the rate limit, at the expense of a sleep after every single request, even if you only plan on making a small handful of requests that would never exceed the limit.

The advantage of the second is that you never get blocked unless you are real close to exceeding your limit, the disadvantage being that (particularly if multi-threaded) you could exceed the limit and get a 429 response that would then have to be handled to avoid dropping calls.

The advantage of the third is that you should never hit your limit (even when multi-threaded) and small handfuls of requests will never be throttled, but throttled requests are throttled much heavier than option 1.

Each has its own uses-cases. The nice thing is you can switch them in and out by just changing the response hook. For example for a user request you might prefer the second option.

Now contrast this with the CBP style. A CBP catches exceptions thrown by your app and can respond to them in several ways.

For example given a web service like a provider... you could raise for status on every request exception (404, etc...). The CB would catch the exceptions and "trip" when certain exceptions occur. Until that CB resets, that provider would be disabled.

The CB could be soft-open or hard-open.
Hard open means that the circuit will not reset on its own.
Soft open means it will reset given some condition.

An example for hard open might be Authentication Error. If authentication fails, you would raise an InvalidLogin exception and the provider would be disabled until the user took action (for example restarting Medusa or changing Login credentials).

An example for soft open would be an exceeded rate limit. When the rate limit is hit, the ExceededRateLimit exception would be thrown... (e.g. http status 429 for tmdb). The CB would trip until the Reset Time passed and automatically reset.

A third more involved example... during provider parsing, a provider might throw a failed parsing error. After say 10 failed parses without a success, the provider could be disabled for an hour. After resetting, Say another 10 parses failed without a success. The provider could be disabled for a day. After a third round of exceeding the failure threshold without success the provider could begin saving the failed responses. After a fourth round of failures it could auto-report the failure to GitHub and the CB trip hard-open for the provider until the user reactivates it or restarts. A large number of reports of a failure could then allow us to investigate if a site is down, the format has changed, etc.

Similar checks could be done if certain parts of a parsed result fail validation repeatedly (e.g. IMDB ID's that don't start with tt, TVDB ID's that aren't numeric, Dates that fail parsing, etc).

The CBP alllows for a much more powerful, configurable and flexible system than simple session configuration but at the expense of more coding effort. However the coding would be significantly simpler than repeating the same exception handling inside every function especially since those would generally act on the instance of an exception with no memory or coordination with other error conditions. At its most simple it could just trip on any exception for a short period before resetting.

@labrys without going into solutions, can you please state your requirements? Your Already going into implementation while I'd like to refine the request first.

First of all I think the base requirement was fairly straightforward and did not require elaboration, secondly without first discussing common use cases and presenting ideas (the purpose of my post) as part of a brainstorming session a requirement list has little value. Stakeholders have to understand a problem before they can intelligently generate requirements. Otherwise you may end up with a requirement list that bears no resemblance to what's actually needed. But to explicitly state the base requirement:

Avoid making invalid/unnecessary requests a.k.a Be a better web citizen

However if you want additional candidates for a requirement list (most of which I already mentioned or alluded to in my previous post)....

Requirements:

Avoid hitting resources excessively (e.g. if resource not responding do not continue to hammer it, rate limit, etc)
Use metadata provided (headers, status codes, content, robots.txt, etc) for more intelligent request strategies
Automatically disable (temporarily or permanently) failing services (regardless of if it's a parse issue or a provider)
Maintain a responsive UI even during periods of heavy requests
Automate data gathering for unexpected failures (e.g. save raw responses for failing providers)
Implement opt-in automated error reporting
Optimize request strategies (e.g. don't do mass updates of shows, mass searches for subtitles, etc... instead do periodic small batches based on last successful update... get list of updated series from indexers and only update those necessary instead of scraping all.... reduce frequency of updates for shows that are no longer active... etc)
Prioritize requests triggered by direct user action (e.g. show update, manual snatch, etc)

and I could go on, but the more requirements put out there, the more it becomes a list of options and ideas for solutions than actual requirements.

TLDR: They all refine down to the single requirement stated in bold above.

From Slack.

Alright. Basically I have a limit of 2.5k I want at least 100 api calls saved for daily so new episodes get snatched and Medusa can use the other 2.4k for whatever it feels like. I'd like this to reset at midnight or something. I don't know when dog resets exactly.

pymedusa / Medusa