sopel-irc / sopel

:robot::speech_balloon: An easy-to-use and highly extensible IRC Bot framework. Formerly Willie.
https://sopel.chat

url.py anti-malware/anti-phishing #444

Closed elad661 closed 9 years ago

elad661 commented 10 years ago

When encountering a bad URL, the bot should warn everyone in the channel not to click it.

Using the following services would allow implementing this feature: Google Safe Browsing, VirusTotal, MalwareDomains.

Because of the privacy concerns, a per-channel setting should let channel owners disable this feature. It should be on by default, though, especially for public channels, since it can protect users from malware. Auto-banning or auto-kicking users who repeatedly post blacklisted URLs might be useful as well.
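A minimal sketch of how the per-channel switch and the repeat-offender rule could fit together. This is not taken from url.py: `ChannelPolicy`, `disabled_channels`, and `kick_threshold` are hypothetical names, and channel/nick comparison uses plain `lower()` rather than IRC casemapping.

```python
from collections import Counter


class ChannelPolicy:
    """Hypothetical per-channel policy: on by default, opt-out per channel."""

    def __init__(self, disabled_channels=(), kick_threshold=3):
        # Channels whose owners turned the check off (assumed config value).
        self.disabled = {c.lower() for c in disabled_channels}
        # Number of bad URLs from one nick in one channel before escalating.
        self.kick_threshold = kick_threshold
        self.offences = Counter()

    def enabled_for(self, channel):
        # Any channel not explicitly disabled is checked.
        return channel.lower() not in self.disabled

    def record_bad_url(self, channel, nick):
        """Return 'ignore', 'warn', or 'kick' for a blacklisted URL."""
        if not self.enabled_for(channel):
            return "ignore"
        key = (channel.lower(), nick.lower())
        self.offences[key] += 1
        if self.offences[key] >= self.kick_threshold:
            return "kick"
        return "warn"
```

The bot would call `record_bad_url` only after a URL has already been flagged, so the policy stays independent of whichever service did the flagging.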

MalwareDomains and Google both work with a local list of bad domains/URLs, while VirusTotal is query-based only. A local list is obviously better for privacy, since URLs posted in the channel are never sent to a third party. Google also has a query-based API for developers who don't want to implement the more complex local-database API.
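To illustrate the local-list approach, here is a rough sketch of a domain check that never leaves the bot's machine. It assumes the list is plain text with one domain per line and `#` comments, which may not match the real MalwareDomains file format; `load_blocklist` and `is_blocked` are made-up names.

```python
from urllib.parse import urlparse


def load_blocklist(lines):
    """Build a set of lowercase domains from an iterable of text lines.

    Assumed format: one domain per line, blank lines and '#' comments skipped.
    """
    domains = set()
    for line in lines:
        line = line.strip().lower()
        if line and not line.startswith("#"):
            domains.add(line)
    return domains


def is_blocked(url, blocklist):
    """Return True if the URL's host or any parent domain is in the list."""
    host = urlparse(url).hostname
    if host is None:
        # Not a parseable URL with a host part, so nothing to look up.
        return False
    parts = host.split(".")
    # Check "a.b.example.com", then "b.example.com", "example.com", "com".
    return any(".".join(parts[i:]) in blocklist for i in range(len(parts)))
```

Because the lookup is a set membership test, it is cheap enough to run on every URL posted in a channel.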

lramati commented 10 years ago

Should we use Google just because it opens the door to a dual implementation for those who don't have space for the local db?

elad661 commented 10 years ago

I thought of using all 3 services. Google's API tells you exactly how to download and sync the bad-URLs db. The only service out of the 3 that doesn't support querying is MalwareDomains, and that can be fixed by writing a server-side wrapper around it.

On the one hand I lean towards local-db based services due to the privacy concerns. On the other hand, IRC channels are open to the world anyway most of the time, and we will provide a feature to disable this per-channel for private channels. Also, lookup-based services would be easier to implement in Willie.

By this comparison, lookup wins for Google. VirusTotal has to be lookup because there's no other option, and MalwareDomains has to be local-db because there's no other option.
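One way to mix lookup-based and local-db services is to hide each behind the same "URL in, bool out" callable. The sketch below is only an assumption about how that could look; `check_url` and the checker names are hypothetical, and no real Google or VirusTotal API call is shown.

```python
def check_url(url, checkers):
    """Return the names of all checkers that flag the URL.

    `checkers` is a list of (name, callable) pairs; each callable takes a
    URL string and returns True if it considers the URL malicious.
    """
    flagged = []
    for name, check in checkers:
        try:
            if check(url):
                flagged.append(name)
        except Exception:
            # A remote lookup that times out or errors should not stop
            # the remaining checkers from running.
            continue
    return flagged
```

With this shape, a local MalwareDomains check and a remote lookup can sit in the same list, and the bot warns the channel if the returned list is non-empty.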

elad661 commented 9 years ago

Fixed in 1f865709aaea1182d94021eaf1b63188a59950c2

Right now it only uses VirusTotal. I'll probably implement support for the MalwareDomains blacklist soon too.