Closed: LoZio closed this issue 5 years ago
Presumably, scanning "By Required Data" chooses which modules to load based on the selected data types, rather than filtering results. Any results returned from a scan that don't fit the selected criteria are still reported.
In this instance, the sfp_darksearch module also produces SEARCH_ENGINE_WEB_CONTENT, so it will also be loaded. Most of the darknet-related modules produce the same events.
def producedEvents(self):
    return ['DARKNET_MENTION_URL', 'DARKNET_MENTION_CONTENT', 'SEARCH_ENGINE_WEB_CONTENT']
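To make the selection behaviour concrete, here is a minimal sketch of the logic as described above. It is not SpiderFoot's actual implementation: the PRODUCED registry, the sfp_example_dns module and both helper functions are hypothetical, and only the sfp_darksearch event list is taken from the snippet above.

```python
# Hypothetical stand-in for the module registry: module name -> produced event types.
# Only the sfp_darksearch entry mirrors the producedEvents() shown above.
PRODUCED = {
    "sfp_darksearch": [
        "DARKNET_MENTION_URL",
        "DARKNET_MENTION_CONTENT",
        "SEARCH_ENGINE_WEB_CONTENT",
    ],
    "sfp_example_dns": ["IP_ADDRESS"],  # invented module, for contrast only
}


def modules_for_required_data(required, produced=PRODUCED):
    """Return the modules that produce at least one of the required event types."""
    wanted = set(required)
    return sorted(name for name, events in produced.items() if wanted & set(events))


def reported_events(selected, produced=PRODUCED):
    """Return every event type the selected modules can emit (no filtering applied)."""
    return sorted({event for name in selected for event in produced[name]})


selected = modules_for_required_data(["SEARCH_ENGINE_WEB_CONTENT"])
print(selected)                   # ['sfp_darksearch']
print(reported_events(selected))  # includes the DARKNET_MENTION_* types as well
```

Under these assumptions, asking only for SEARCH_ENGINE_WEB_CONTENT still loads sfp_darksearch, and once loaded the module reports all of its event types, including the darknet ones.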
If you're not interested in Darknet data, you may wish to disable modules which return darknet information.
$ grep -rn producedEvent modules/ -A 3 | grep -E 'DARKNET_MENTION_URL|DARKNET_MENTION_CONTENT'
modules/sfp_darksearch.py-50- return ['DARKNET_MENTION_URL', 'DARKNET_MENTION_CONTENT', 'SEARCH_ENGINE_WEB_CONTENT']
modules/sfp_onioncity.py-51- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT",
modules/sfp_onionsearchengine.py-52- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT", "SEARCH_ENGINE_WEB_CONTENT"]
modules/sfp_ahmia.py-52- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT", "SEARCH_ENGINE_WEB_CONTENT"]
modules/sfp_intelx.py-74- return ["LEAKSITE_URL", "DARKNET_MENTION_URL"]
modules/sfp_torch.py-51- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT", "SEARCH_ENGINE_WEB_CONTENT"]
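If you want the list of module names rather than the raw grep lines, the output above can be post-processed. The following is a rough sketch that assumes the line layout shown above (modules/NAME.py-LINE- ...); the GREP_OUTPUT constant, the regular expression and the darknet_modules helper are illustrative names, not part of SpiderFoot.

```python
import re

# The grep output from above, pasted verbatim (the sfp_onioncity line is
# truncated in the original because its list continues on the next source line).
GREP_OUTPUT = """\
modules/sfp_darksearch.py-50- return ['DARKNET_MENTION_URL', 'DARKNET_MENTION_CONTENT', 'SEARCH_ENGINE_WEB_CONTENT']
modules/sfp_onioncity.py-51- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT",
modules/sfp_onionsearchengine.py-52- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT", "SEARCH_ENGINE_WEB_CONTENT"]
modules/sfp_ahmia.py-52- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT", "SEARCH_ENGINE_WEB_CONTENT"]
modules/sfp_intelx.py-74- return ["LEAKSITE_URL", "DARKNET_MENTION_URL"]
modules/sfp_torch.py-51- return ["DARKNET_MENTION_URL", "DARKNET_MENTION_CONTENT", "SEARCH_ENGINE_WEB_CONTENT"]
"""

# grep -n prints "file:line:" for matching lines and "file-line-" for context lines.
LINE_RE = re.compile(r"^modules/(sfp_\w+)\.py[-:]\d+[-:]")


def darknet_modules(grep_output):
    """Extract, in order and without duplicates, the modules that mention a DARKNET_MENTION_* event."""
    names = []
    for line in grep_output.splitlines():
        match = LINE_RE.match(line)
        if match and "DARKNET_MENTION_" in line and match.group(1) not in names:
            names.append(match.group(1))
    return names


for name in darknet_modules(GREP_OUTPUT):
    print(name)
```

The resulting names are the candidates to disable if you do not want darknet data in your scan.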
SpiderFoot uses a modular architecture, which allows granular configuration of which modules to load and of their settings. You can learn more about the module architecture here:
If you're concerned about the retrieval of darknet content over Tor and want to disable it, each of the darknet-related modules exposes a fetchlinks Boolean option which can be disabled.
# Option descriptions
optdescs = {
    'fetchlinks': "Fetch the darknet pages (via TOR, if enabled) to verify they mention your target.",
    'max_pages': "Maximum number of pages of results to fetch."
}
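As an illustration of what disabling the option amounts to, here is a small sketch that builds per-module option overrides with fetchlinks turned off. Only the option names fetchlinks and max_pages come from the snippet above. The default values, the with_fetch_disabled helper and the overrides dictionary are assumptions, and how such settings are actually handed to SpiderFoot (for example through the module settings in the UI) is not shown here.

```python
# Assumed defaults for illustration; the real defaults may differ per module.
DEFAULT_OPTS = {"fetchlinks": True, "max_pages": 20}

# A few module names taken from the grep output above.
DARKNET_MODULES = ["sfp_darksearch", "sfp_ahmia", "sfp_torch"]


def with_fetch_disabled(opts):
    """Return a copy of the options with darknet page fetching turned off."""
    updated = dict(opts)  # copy so the defaults are left untouched
    updated["fetchlinks"] = False
    return updated


# Hypothetical per-module overrides: module name -> options dictionary.
overrides = {name: with_fetch_disabled(DEFAULT_OPTS) for name in DARKNET_MODULES}

for name, opts in overrides.items():
    print(name, opts)
```

With fetchlinks disabled, a module would still query its search engine, but would no longer fetch the linked darknet pages to verify the mention.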
Thank you for the explanation, it is clear now. This makes me vote for #281, since I decided to use the "required data" method so that I would not have to configure the set of modules I need each time.
I'm trying to start a scan without darknet information. I unselected this: [image] But in the log I see this: [image] And a lot of results here: [image]