buttondown / roadmap

Buttondown's public roadmap

Proxy RSS reads through a static IP proxy like QuotaGuard #3551

Open jmduke opened 5 days ago

jmduke commented 5 days ago

THE PROBLEM: Cloudflare sometimes blocks our RSS fetches because we're a bot (which is fair — we are!).

THE SOLUTION: We register with them as a cool, chill, friendly bot.

THE PROBLEM: To verify our incoming traffic, they need an allowlist of IPs, and Heroku doesn't give us static ones!

This will let us submit Buttondown as a registered crawler to Cloudflare, which should obviate some issues we have with CF blocking our requests.

I think we can just use Fixie or some similar option (the overall bandwidth is pretty low); the HTTP call is in `retrieve_items`.

(We will also get to use this for the Validity stuff, if we need to!)
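
A minimal sketch of what routing the fetch through a static-IP proxy could look like, using only the Python standard library. This is not the actual body of `retrieve_items`; the helper names (`proxy_map`, `build_feed_opener`, `fetch_feed`), the `FIXIE_URL` environment variable, and the `ButtondownBot/1.0` user-agent string are all assumptions for illustration.

```python
import os
import urllib.request

# Assumption: the proxy add-on exposes its URL (including credentials)
# in an environment variable with this name.
PROXY_ENV_VAR = "FIXIE_URL"

# Assumption: placeholder user-agent string for the crawler.
USER_AGENT = "ButtondownBot/1.0"


def proxy_map(env):
    """Return a proxies dict for urllib, or {} when no proxy is configured."""
    proxy_url = env.get(PROXY_ENV_VAR)
    if not proxy_url:
        return {}
    return {"http": proxy_url, "https": proxy_url}


def build_feed_opener(env=os.environ):
    """Build an opener that sends requests through the static-IP proxy, if set."""
    # An explicit (possibly empty) dict stops urllib from auto-detecting
    # proxies from other environment variables.
    handler = urllib.request.ProxyHandler(proxy_map(env))
    opener = urllib.request.build_opener(handler)
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener


def fetch_feed(url, env=os.environ, timeout=10):
    """Fetch the raw bytes of an RSS feed (hypothetical stand-in for the real call)."""
    with build_feed_opener(env).open(url, timeout=timeout) as response:
        return response.read()
```

Keeping the proxy lookup in one small function means the same opener could be reused by any other outbound call that needs the static IP.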

### Tasks
- [x] use a static IP to fetch blocked Cloudflare sites https://github.com/buttondown/monorepo/pull/2005
- [x] respect robots.txt and add our static IP to docs https://github.com/buttondown/monorepo/pull/2017
- [ ] apply to Cloudflare [verified bots](https://developers.cloudflare.com/bots/reference/verified-bots-policy/) and get approved
- [ ] check that it works
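
For the robots.txt task above, a rough sketch of the kind of check involved, using the standard library's `urllib.robotparser`. The helper name `is_allowed`, the sample robots.txt, and the `ButtondownBot/1.0` user-agent are assumptions for illustration, not what the linked PR actually does.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks our bot from one directory only.
SAMPLE_ROBOTS = """\
User-agent: ButtondownBot
Disallow: /private/

User-agent: *
Disallow:
"""

print(is_allowed(SAMPLE_ROBOTS, "ButtondownBot/1.0", "https://example.com/feed.xml"))
print(is_allowed(SAMPLE_ROBOTS, "ButtondownBot/1.0", "https://example.com/private/feed.xml"))
```

In a real fetcher the robots.txt body would be downloaded (and probably cached) per host before the feed request is made.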

catdevnull commented 2 days ago

https://developers.cloudflare.com/bots/reference/verified-bots-policy/

> A bot or proxy must have a minimum amount of traffic for Cloudflare to be able to find it in the sampled data. The minimum traffic should have more than 1000 requests per day across multiple domains.

do we do this much traffic?

user-agent

ButtondownBot/1.0 or are we already using something else?

jmduke commented 2 days ago

> do we do this much traffic?

yup!

> user-agent

we are! and lightweight docs here: https://docs.buttondown.com/rss-to-email#troubleshooting, but can be expanded.

catdevnull commented 15 hours ago

okay, I submitted the bot. now we have to wait. I think they will email you @jmduke