ralexander-phi / rss-blogroll-network

https://alexsci.com/rss-blogroll-network/
Apache License 2.0

Add website: ttntm.me #5

Closed by ttntm 4 months ago

ttntm commented 4 months ago

I'd like to include my website in the RSS Blogroll Network crawler.

My RSS feed is located at: https://ttntm.me/everything.xml

(just the blog: https://ttntm.me/blog/feed.xml)

I have:

ralexander-phi commented 4 months ago

Hi Tom, I'd be happy to add your website.

However, it looks like your robots.txt is blocking the crawler. It starts with:

User-agent: *
Disallow: /everything.xml
Disallow: /blog/feed.xml

Can you update your robots.txt to allow our user-agent? Adding something like this to the bottom should work:

User-agent: Feed2Pages/0.1
Disallow:

I'd like to respect all robots.txt settings, so I won't be able to crawl the site as-is.
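
For anyone checking their own site, here is a minimal sketch of how the two robots.txt variants above play out. It is a simplified, hand-rolled matcher written for illustration, not the crawler's actual code: the names `parse_groups`, `is_allowed`, `BEFORE` and `AFTER` are hypothetical, it only understands `User-agent` and `Disallow` lines, and it assumes a crawler is matched on the product name before the `/`. Real parsers can differ on that last point, so treat the result as a rough check only.

```python
# Simplified robots.txt check (illustrative sketch, not the crawler's real logic).
# Assumption: only "User-agent" and "Disallow" lines matter, and agents are
# compared on the product name before "/" (case-insensitive).

BEFORE = """\
User-agent: *
Disallow: /everything.xml
Disallow: /blog/feed.xml
"""

# The suggested fix: append a group for the crawler with an empty Disallow.
AFTER = BEFORE + """
User-agent: Feed2Pages/0.1
Disallow:
"""


def parse_groups(text):
    """Return a list of (agents, disallowed_prefixes) groups."""
    groups = []
    agents, rules, in_rules = [], [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules, in_rules = [], [], False
            agents.append(value.split("/")[0].lower())
        elif field == "disallow":
            in_rules = True
            if value:  # an empty Disallow blocks nothing
                rules.append(value)
    if agents:
        groups.append((agents, rules))
    return groups


def is_allowed(text, user_agent, path):
    """True if `path` is not disallowed for `user_agent` under this sketch."""
    token = user_agent.split("/")[0].lower()
    groups = parse_groups(text)
    # Prefer a group that names this crawler; otherwise fall back to "*".
    chosen = [rules for agents, rules in groups if token in agents]
    if not chosen:
        chosen = [rules for agents, rules in groups if "*" in agents]
    return not any(path.startswith(prefix) for rules in chosen for prefix in rules)


if __name__ == "__main__":
    for label, text in (("before", BEFORE), ("after", AFTER)):
        for path in ("/everything.xml", "/blog/feed.xml"):
            print(label, path, is_allowed(text, "Feed2Pages/0.1", path))
```

Under these assumptions both feeds are blocked for `Feed2Pages/0.1` with the original file and allowed once the extra group is appended, while other crawlers still fall under the `*` rules.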

ttntm commented 4 months ago

@ralexander-phi oh shoot, sorry, I don't know how I missed that.

I've just whitelisted the crawler, thanks for the hint!

ralexander-phi commented 4 months ago

Great, I've added your site here: http://localhost:1313/discover/feed-10420918feef664f327101723c6ac281/

ttntm commented 4 months ago

Thanks 🙏