redlib-org / redlib

Private front-end for Reddit
GNU Affero General Public License v3.0

🐛 Bug Report: deal with 403 blocked gracefully #217

Open RayBB opened 1 month ago

RayBB commented 1 month ago

Describe the bug

Recently the Hetzner IP I've been using for redlib was blocked. When I SSH into my VPS, I see this:

wget -qO- https://www.reddit.com/
wget: server returned error: HTTP/1.1 403 Blocked

If I open reddit when proxying through that VPS I see this:

```
Blocked

whoa there, pardner!

Your request has been blocked due to a network policy.

Try logging in or creating an account here to get back to browsing.

If you're running a script or application, please register or sign in with your developer credentials here. Additionally make sure your User-Agent is not empty and is something unique and descriptive and try again. if you're supplying an alternate User-Agent string, try changing back to default as that can sometimes result in a block.

You can read Reddit's Terms of Service here.

if you think that we've incorrectly blocked you or you would like to discuss easier ways to get the data you want, please file a ticket here.

when contacting us, please include your ip address which is: 111.111.111.111 and reddit account
```

However, redlib only shows `404 page not found` and I also don't see any logs about this. Logs just say:

```
2024-09-08T16:54:36.989147736Z Starting Redlib...
2024-09-08T16:54:37.813015096Z Running Redlib v0.31.0 on 0.0.0.0:8080!
```

Steps to reproduce the bug

You can't really reproduce it without having a blocked IP :)

What's the expected behavior?

At least log that it's blocked so it's easy to see the issue. Even better if a warning can be shown on the 404 page.
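For illustration only, here is a rough sketch of the kind of check being asked for. It is not Redlib's actual code: the `Upstream` enum and the `classify` and `log_line` functions are hypothetical names, and the marker string is simply copied from the block page quoted above. The sketch assumes the caller already has the upstream status code and response body in hand.

```rust
/// Outcome of inspecting one upstream response.
/// Hypothetical type for this sketch, not part of Redlib.
#[derive(Debug, PartialEq)]
pub enum Upstream {
    Ok,
    Blocked,
    OtherError(u16),
}

/// Marker text taken from the block page quoted in this report.
/// Assumption: Reddit keeps this wording on its 403 block page.
const BLOCK_MARKER: &str = "blocked due to a network policy";

/// Classify a response from its status code and body.
pub fn classify(status: u16, body: &str) -> Upstream {
    match status {
        200..=299 => Upstream::Ok,
        // Only treat a 403 as a block when the body carries the marker text.
        403 if body.to_lowercase().contains(BLOCK_MARKER) => Upstream::Blocked,
        other => Upstream::OtherError(other),
    }
}

/// Build the message that would be logged, or None when nothing is wrong.
pub fn log_line(outcome: &Upstream) -> Option<String> {
    match outcome {
        Upstream::Ok => None,
        Upstream::Blocked => Some(
            "upstream returned 403 Blocked: this IP may be blocked by Reddit".to_string(),
        ),
        Upstream::OtherError(code) => Some(format!("upstream returned HTTP {}", code)),
    }
}
```

The same `Upstream::Blocked` value could also drive a warning on the error page instead of a bare 404. Matching on the body text is fragile by nature, so a 403 without the marker falls through to `OtherError(403)` rather than being reported as a block.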

Thanks for all your hard work on this awesome tool :)

sigaloid commented 1 month ago

Do you happen to have what the specific error was? There should be a "report issue" button which includes the filled-out template that has more specific error messages.

RayBB commented 1 month ago

@sigaloid I only have what I shared. The browser only shows this.

image

Is there another place I can look to get logs that would be helpful?

sigaloid commented 1 month ago

It's very strange for that page specifically to show up, because nothing in Redlib generates that error page. Is there a reverse proxy?

sigaloid commented 1 month ago

Nonetheless, I believe this was coincidentally fixed in 0b15250cc83776d48c6247c553d40adeb79ac9cf. It should render a proper error page mentioning a "rate limit" - this language is a bit vague since technically it could be an IP ban like in your case. But there's nothing I can do to detect that you're specifically IP banned, other than seeing an error page on every single request.
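As a sketch of the "error page on every single request" heuristic mentioned above: one could count consecutive blocked responses and only report a likely IP ban once the streak passes a threshold. This is not what the referenced commit does. `BlockTracker`, its fields, and the threshold value are all hypothetical and chosen purely for illustration.

```rust
/// Hypothetical tracker that counts consecutive blocked responses.
/// Not part of Redlib; a sketch of the heuristic only.
pub struct BlockTracker {
    consecutive: u32,
    threshold: u32,
}

impl BlockTracker {
    /// `threshold` is how many blocked responses in a row count as a likely ban.
    pub fn new(threshold: u32) -> Self {
        Self { consecutive: 0, threshold }
    }

    /// Record one upstream response. Returns true while the current streak
    /// of blocked responses is at or above the threshold.
    pub fn record(&mut self, blocked: bool) -> bool {
        if blocked {
            self.consecutive = self.consecutive.saturating_add(1);
        } else {
            // Any successful response resets the streak.
            self.consecutive = 0;
        }
        self.consecutive >= self.threshold
    }
}
```

With `BlockTracker::new(3)`, the third blocked response in a row makes `record` return true, and a single successful response resets the count. This cannot distinguish an IP ban from a long rate limit; it only makes the "probably banned" message less likely to fire on a one-off error.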

sigaloid commented 1 month ago

Reopening this. What is the behavior with the latest container image? I removed the special case in the commit above since it wasn't correct, but I do want to handle this case better, since I get a lot of bug reports that seem to come down to an IP ban of some sort.

RayBB commented 1 month ago

I realized that when I SSH into the Docker container and curl the redlib service, it fetches posts successfully (on latest and before).

Unfortunately, I'm behind Traefik, and redlib now mostly says Gateway Timeout when I make a request, though sometimes it doesn't. When I check the logs with debug mode on, everything seems fine.

2024-09-30T17:57:54.538562110Z  INFO  redlib::oauth               > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
2024-09-30T17:57:54.538568630Z  INFO  redlib::oauth               > [✅] Successfully created OAuth client
2024-09-30T17:57:54.538574390Z  INFO  redlib::oauth               > [⏳] Waiting for 86279s seconds before refreshing OAuth token...
2024-09-30T17:57:54.538590470Z Running Redlib v0.35.1 on [::]:8080!
2024-09-30T17:59:24.538795827Z  DEBUG rustls::common_state        > Sending warning alert CloseNotify

Unfortunately, there must be some other error, unrelated to 403s, that I haven't quite pinned down. So I don't think I can help much with the 403 case anymore.