Closed amarand closed 4 years ago
Thanks for logging your issue!
You're right, generally a 5xx error indicates the server is misconfigured, so it would be typical to assume retrying 'soon' won't work. However, I can see your use case, and your proposed solution is the right way to do it: something like --always_retry or similar. We already have similar functionality for network errors (i.e. at your end). So we'd need to control two variables: how long to wait between attempts, and how many attempts to make before the script gives up. Would you expect control over the second? I'd probably default to 5 attempts.
If I could control the former (duration between attempts, in minutes), a fixed five attempts would be fantastic. In my use case, I would just set the timeout to, say, ten minutes or something, and that would give me 50 minutes worth of retries, which should be more than enough for the majority of the "maintenance" outages I see with my unfederated instance. (Thank you so much!)
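The behaviour discussed above (a fixed number of attempts with a configurable wait between them) could look roughly like the sketch below. It is not ephemetoot's actual code: `ServerError`, `retry_on_server_error` and its parameters are hypothetical names chosen for illustration, and the `sleep` argument exists only so the wait can be swapped out in tests.

```python
import time


class ServerError(Exception):
    """Stand-in for a 5xx API failure (hypothetical, not a Mastodon.py class)."""


def retry_on_server_error(action, wait_minutes=10, attempts=5, sleep=time.sleep):
    """Call action(); on ServerError wait and try again, up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except ServerError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(wait_minutes * 60)  # sleep() takes seconds
```

Note that five calls in total means four waits, so a ten-minute wait covers 40 minutes from the first attempt to the last; covering a full 50 minutes would need `attempts=6` under this counting.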
Hey @amarand I've started looking at this but I now realise I don't know which Mastodon error this is throwing. When your server is returning a 521 error, do you remember what ephemetoot currently does with that? i.e. is it:
ephemetoot cannot connect to the server - are you online?
or
User and/or access token does not exist or has been deleted
or something else?
Usually it's the "User and/or access token does not exist..." error. With the instance I use, a 500-series error is returned when the admin is doing short maintenance. Usually restarting in a few minutes works. My concern is that if I start it at, say, 2000, and it runs until 2100, then fails, nothing happens overnight. So any back-off (15/30/60 minutes) is better than an outright failure to the command line.
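The 15/30/60-minute back-off mentioned above can be expressed as a small generator. This is only a sketch: `backoff_minutes` and its parameters are made-up names, and the choice to hold at 60 minutes rather than keep doubling is an assumption.

```python
from itertools import islice


def backoff_minutes(first=15, factor=2, cap=60):
    """Yield wait times in minutes: first, first*factor, ... never above cap."""
    wait = first
    while True:
        yield min(wait, cap)
        wait = min(wait * factor, cap)


# First five waits of the default schedule.
schedule = list(islice(backoff_minutes(), 5))
print(schedule)  # [15, 30, 60, 60, 60]
```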
Oh, and I realized the reason why I never see the "ephemetoot cannot connect to the server - are you online?" error is that the instance I use goes through Cloudflare, so the front-end connection never fails ("are you online?") but the authentication/token on the back-end is rejected (because Cloudflare isn't passing anything other than a "failure" message). Hope that clarifies?
Ahh, here we go...found one from today:
ERROR deleting toot - 100923825263144264 - ('Mastodon API returned error', 522, '', None)
Waiting 1 minute before re-trying
Attempting delete again
ERROR deleting toot - 100923825263144264 ('Mastodon API returned error', 500, 'Internal Server Error', None)
Exiting due to error.
User and/or access token does not exist or has been deleted
Perfect thanks!
Mastodon.py provides different error codes for different types of error so I just need to make sure I'm using the right one. This is tricky to test because I need to emulate a 5xx error without, you know, shutting down my own site.
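One way to emulate a 5xx failure without touching a real server is to replace the delete call with a mock that raises first and succeeds second. The sketch below assumes the exception arguments have the shape shown in the log above, with the HTTP status as the second element. `FakeAPIError`, `status_code`, `is_server_error` and `fake_delete` are hypothetical names; in real code the stand-in would be replaced by whichever Mastodon.py exception class covers server errors (check the library's documentation for the exact name).

```python
from unittest import mock


class FakeAPIError(Exception):
    """Stand-in for the library's API exception (hypothetical name)."""


def status_code(error):
    """Return the HTTP status from args shaped like the log above, else None."""
    args = error.args
    if len(args) > 1 and isinstance(args[1], int):
        return args[1]
    return None


def is_server_error(error):
    """True when the error carries a 5xx status code."""
    code = status_code(error)
    return code is not None and 500 <= code <= 599


# Emulate one 522 followed by a success, with no real server involved.
fake_delete = mock.Mock(
    side_effect=[FakeAPIError("Mastodon API returned error", 522, "", None), "ok"]
)
```

The first call to `fake_delete(...)` raises the 522 error and the second returns `"ok"`, which is enough to exercise a retry path in a unit test.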
Does your proposal relate to...
Is your feature request related to a problem? Please describe.
My Mastodon service provider sometimes throws his server into a "maintenance mode" and it throws a 521 error. It's semi-common for him to do this. It looks like ephemetoot gives up (which might be required for the protocol for a 500/521 error?) but is there a way to either A) add a switch that allows you to keep retrying (after a certain safe wait period, maybe a few minutes?) after getting a 500 error or B) just build that in without a switch (possibly with a switch to override that behavior)?
Describe the solution you'd like
Ideally, this would be an opt-in switch, because I think 500 errors might not be retryable by convention. But the switch would, upon receipt of a 500 error, back off, wait a certain amount of time (5 minutes? 10? 15? Possibly set at the command line with a default?) and then retry.
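As a rough illustration of the opt-in switch described above, an argparse-based sketch might look like this. The flag names `--always_retry` and `--retry_mins`, the 5-minute default and the `should_retry` helper are all assumptions made for the example, not options that ephemetoot is known to provide.

```python
import argparse


def build_parser():
    """Build a parser with two hypothetical retry options."""
    parser = argparse.ArgumentParser(prog="ephemetoot-sketch")
    parser.add_argument(
        "--always_retry",
        action="store_true",
        help="keep retrying after a 5xx server error (off by default)",
    )
    parser.add_argument(
        "--retry_mins",
        type=float,
        default=5.0,
        metavar="MINUTES",
        help="minutes to wait before retrying (assumed default: 5)",
    )
    return parser


def should_retry(status, options):
    """Retry only when the user opted in and the status is in the 5xx range."""
    return bool(options.always_retry) and 500 <= status <= 599
```

Because the switch defaults to off, a 5xx response still ends the run unless the user asks for retries, which keeps the conventional behaviour for everyone else.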
Would you like to write the code yourself?
Describe alternatives you've considered
When I see that it throws this error, I just restart the script and it usually runs right away. If I don't get to it for a few hours, I miss a few hours' worth of deletions.
Additional context
Thanks!