ircv3 / ircv3-specifications

IRCv3 specifications | Roadmap: https://git.io/IRCv3-Roadmap | Code of conduct: http://ircv3.net/conduct.html
http://ircv3.net

Add the crawler-preference spec. #560

Open SadieCat opened 2 weeks ago

SadieCat commented 2 weeks ago

Rendered link.


This is a skeleton for an idea I've had recently. I fully expect this to require revisions and expansion before it's production-ready, so please feel free to propose changes.

An alternate solution I was considering was advertising a plain CRAWLER token; bots could then detect it, execute a CRAWLER <name> command, and get back a response saying whether that specific crawler is allowed on the network. I'm not sure if that's overengineering things though.
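To make the alternate flow concrete, here is a minimal sketch of the bot side. It assumes the CRAWLER token would be advertised in RPL_ISUPPORT (005), which the text above does not actually settle, and the helper names `parse_isupport` and `crawler_query` are made up for illustration. The format of the server's reply is not defined yet, so the sketch stops at building the command.

```python
from typing import Dict, Optional


def parse_isupport(line: str) -> Dict[str, Optional[str]]:
    """Parse the tokens of one RPL_ISUPPORT (005) line into a dict.

    Tokens without a value (e.g. a bare CRAWLER) map to None.
    """
    # Drop an optional ":server" prefix.
    if line.startswith(":"):
        _, _, line = line.partition(" ")
    # Drop the trailing human-readable text (":are supported by this server").
    head = line.split(" :", 1)[0]
    parts = head.split()
    # parts[0] is the numeric, parts[1] is our nick; the tokens follow.
    if len(parts) < 2 or parts[0] != "005":
        return {}
    tokens: Dict[str, Optional[str]] = {}
    for token in parts[2:]:
        key, sep, value = token.partition("=")
        tokens[key] = value if sep else None
    return tokens


def crawler_query(isupport_line: str, crawler_name: str) -> Optional[str]:
    """Return the CRAWLER command to send, or None if the token is absent.

    Assumption: the token is named CRAWLER and lives in ISUPPORT.
    """
    if "CRAWLER" not in parse_isupport(isupport_line):
        return None
    return f"CRAWLER {crawler_name}"
```

A bot that gets `None` back would treat the network as not having opted in and disconnect without crawling.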

Problem

It's very hard to find IRC channels because there's no useful, comprehensive database of channels. A few exist (e.g. netsplit) but they rely on admins manually adding them, which isn't great.

It's possible to crawl the entire address space for networks (and IRCStats currently does this) to collect data, but many IRC admins have historically resisted making that information public for privacy reasons.

Solution

This specification adds a way for networks to declare that they are okay with bots crawling them. It also allows them to specify how often they'd like to be crawled. This allows networks with privacy concerns to opt out of scanning.
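On the crawler side, honouring those two preferences could look roughly like the sketch below. It deliberately does not assume any wire format: `allowed` and `min_interval` are hypothetical values a bot would have already extracted from whatever the network advertises, and treating "how often" as a minimum gap between crawls is my assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def may_crawl(allowed: bool,
              min_interval: timedelta,
              last_crawl: Optional[datetime],
              now: datetime) -> bool:
    """Decide whether a crawl is permitted right now.

    allowed      -- whether the network declared that crawling is okay
    min_interval -- assumed meaning of "how often": minimum gap between crawls
    last_crawl   -- when this bot last crawled the network, or None if never
    """
    if not allowed:
        # The network has not opted in, so never crawl it.
        return False
    if last_crawl is None:
        # First visit to an opted-in network.
        return True
    return now - last_crawl >= min_interval


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
yesterday = now - timedelta(days=1)
print(may_crawl(True, timedelta(hours=12), yesterday, now))  # True
print(may_crawl(True, timedelta(days=7), yesterday, now))    # False
```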

I've put a WIP module with support for this on the InspIRCd Testnet (testnet.inspircd.org).