smicallef / spiderfoot

SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
http://www.spiderfoot.net
MIT License

RIPE module is slow #366

Open bcoles opened 5 years ago

bcoles commented 5 years ago

A huge number of websites are protected by Cloudflare.

As a result, target domains resolve to Cloudflare IP addresses, which belong to an AS with thousands of peers. This causes the sfp_ripe module to send thousands of queries to RIPE while attempting to look up the owner of each AS peer (the for nasn in neighs loop):

        # BGP AS Owner/Member -> BGP AS Peers
        if eventName.startswith("BGP_AS_"):
            neighs = self.asNeighbours(eventData)
            if neighs is None:
                self.sf.debug("No neighbors found to AS " + eventData)
                return None

            for nasn in neighs:
                if self.checkForStop():
                    return None

                ownerinfo = self.asOwnerInfo(nasn)
                ownertext = ''
                if ownerinfo is not None:
                    for k, v in ownerinfo.iteritems():
                        ownertext = ownertext + k + ": " + ', '.join(v) + "\n"

                    if len(ownerinfo) > 0:
                        evt = SpiderFootEvent("BGP_AS_PEER", nasn,
                                              self.__name__, event)
                        self.notifyListeners(evt)

smicallef commented 5 years ago

What would be a good solution or at least work-around to this? Giving the option to skip known hosting/cloud/protection providers, or when the AS peer list is above X, don't bother looking it up?
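
The "peer list above X" idea could be sketched roughly as follows. This is only an illustration: the helper name `peers_to_look_up` and the `max_peers` threshold are hypothetical and are not part of sfp_ripe.

```python
def peers_to_look_up(neighs, max_peers=50):
    """Return the AS peers worth an owner lookup.

    Hypothetical helper: `max_peers` is an assumed, user-tunable threshold.
    A value of 0 disables the limit.
    """
    if neighs is None:
        return []
    if max_peers > 0 and len(neighs) > max_peers:
        # Too many peers (typical for large CDN/hosting ASNs):
        # skip the per-peer owner lookups entirely.
        return []
    return list(neighs)
```

In the loop quoted above, `for nasn in neighs` would then iterate over `peers_to_look_up(neighs)` instead, so a huge peer list costs no additional RIPE queries.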

bcoles commented 5 years ago

> What would be a good solution or at least work-around to this?

I'd need to take a closer look at the functionality used to retrieve neighbors and BGP_AS_OWNER. In the case of the latter, when I last read the code, it looked buggy, and unlikely to identify matches.

In the short term, a lookup_neighbors Boolean module option to switch neighbor lookup off would be a good start. Currently, this module has no module options.
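
A minimal sketch of such an option, assuming the usual pattern of an `opts` dict with matching `optdescs`. The class below is a stand-in rather than the real module, and `as_owner_info` fakes the network call so that the example runs offline.

```python
class RipeModuleSketch:
    # Hypothetical default: keep the current behaviour unless switched off.
    opts = {"lookup_neighbors": True}
    optdescs = {"lookup_neighbors": "Look up the owner of each BGP AS peer?"}

    def __init__(self, user_opts=None):
        # Copy the defaults so per-instance overrides do not leak.
        self.opts = dict(self.opts)
        self.opts.update(user_opts or {})
        self.queries = []  # records which ASNs would have been queried

    def as_owner_info(self, asn):
        # Stand-in for the RIPE query; the real module does a network call.
        self.queries.append(asn)
        return {"owner": ["example"]}

    def handle_peers(self, neighs):
        # With the option off, return before any per-peer query is sent.
        if not self.opts["lookup_neighbors"]:
            return []
        return [nasn for nasn in neighs if self.as_owner_info(nasn)]
```

With `RipeModuleSketch({"lookup_neighbors": False})`, `handle_peers` returns an empty list and no owner queries are issued at all.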

This module has one of the longest runtimes of all modules. I usually research data manually as it's returned by SpiderFoot, especially for long-running scans. It is frustrating to wait an eternity for BGP data, with no other data coming in, especially when the majority of the information retrieved by this module can be retrieved almost instantly with the sfp_bgpview module.

> Giving the option to skip known hosting/cloud/protection providers, or when the AS peer list is above X, don't bother looking it up?

I dislike the idea of blacklisting lookups for specific providers. While Cloudflare is the most prominent example, I run into this issue frequently. Personally, I've found that the information identified is not worth the time it takes.