smicallef / spiderfoot

SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
http://www.spiderfoot.net
MIT License

[suggestion] search for email address using different obfuscations #348

Open jellevos opened 5 years ago

jellevos commented 5 years ago

Hi, I would like to contribute this feature, but first I am interested to hear whether you think it is valuable.

I was looking for an email address (let's say it is testmail@gmail.com) and couldn't find much. After a lot of manual searching, I eventually found an important reference where it was obfuscated: test.mail AT gmail (dot) com, or something similar. Would it be helpful to search Google, for example, for different permutations of the address?

codingo commented 5 years ago

If this were implemented, it would be good to have a setting for it; I could see a ton of junk data coming from this when you have an initial false positive e-mail.

bcoles commented 5 years ago

Searching for email address obfuscation permutations has value. Unfortunately, there's a tonne of ways to obfuscate email addresses, and this could balloon out to a large number of requests very quickly.
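To give a rough sense of how quickly the permutations grow, here is a minimal sketch. The substitution lists and the `obfuscation_variants` helper are hypothetical and not part of SpiderFoot; real-world obfuscations are far more varied than these seven forms.

```python
from itertools import product

# Hypothetical substitution lists; real obfuscations are far more varied.
AT_FORMS = ["@", " at ", " AT ", "(at)", "[at]", " (at) ", " [at] "]
DOT_FORMS = [".", " dot ", " DOT ", "(dot)", "[dot]", " (dot) ", " [dot] "]


def obfuscation_variants(email):
    """Return every combination of '@' and '.' substitutions for one address."""
    local, domain = email.rsplit("@", 1)
    labels = domain.split(".")
    variants = set()
    # Pick one '@' form and, independently, one form for each dot in the domain.
    for at, *dots in product(AT_FORMS, *[DOT_FORMS] * (len(labels) - 1)):
        rebuilt = labels[0]
        for dot, label in zip(dots, labels[1:]):
            rebuilt += dot + label
        variants.add(local + at + rebuilt)
    return variants


variants = obfuscation_variants("testmail@gmail.com")
print(len(variants))  # 7 * 7 = 49 for a single-dot domain
```

With only seven forms per separator, one address already yields 49 candidate queries, and a domain with two dots yields 343. Dots in the local part are left untouched here; substituting those as well would multiply the count again.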

Given that search engines often drop various characters, such as ( and ), a simple search for "alice at example dot com" would likely match many formats. For example, a quick Google search for the aforementioned query returned a result containing alice(at)example(dot)com.
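A single loose query of that kind could be built as in the sketch below. The `loose_query` helper is hypothetical, and it relies on the assumption stated above that the search engine ignores brackets and similar punctuation, which is not guaranteed for every engine.

```python
def loose_query(email):
    """Build one quoted phrase query from a plain email address."""
    local, domain = email.rsplit("@", 1)
    # Spell out the separators; dots in the local part are left as-is.
    return '"{} at {}"'.format(local, domain.replace(".", " dot "))


print(loose_query("alice@example.com"))  # "alice at example dot com"
```

This keeps the cost at one request per address rather than one per permutation.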

However, I'm more interested in the inverse: identifying obfuscated email addresses in event data such as SEARCH_ENGINE_WEB_CONTENT.

This would probably require some form of normalization for email addresses; perhaps as a helper method, or additional regex in the sfp_email module.
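A minimal sketch of such a normalization helper, assuming only the "at"/"dot" word forms with optional brackets. The patterns and the `find_obfuscated_emails` name are illustrative; this is not the existing sfp_email code, and the input is just a string such as the text of a SEARCH_ENGINE_WEB_CONTENT event.

```python
import re

# Hypothetical patterns: "(at)", "[at]", "{at}" or a bare " at " surrounded by
# whitespace, and the same for "dot". Matching is case-insensitive.
AT_RE = re.compile(r"\s*[\(\[\{]\s*at\s*[\)\]\}]\s*|\s+at\s+", re.IGNORECASE)
DOT_RE = re.compile(r"\s*[\(\[\{]\s*dot\s*[\)\]\}]\s*|\s+dot\s+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_obfuscated_emails(text):
    """Normalize common obfuscations, then extract plain email addresses."""
    normalized = DOT_RE.sub(".", AT_RE.sub("@", text))
    return sorted({match.lower() for match in EMAIL_RE.findall(normalized)})


sample = "Contact: test.mail AT gmail (dot) com or alice(at)example(dot)com."
print(find_obfuscated_emails(sample))
# ['alice@example.com', 'test.mail@gmail.com']
```

Because the bare " at " and " dot " forms are rewritten everywhere in the text, ordinary prose such as "look at this dot com" would also yield an address, so a setting to enable or disable this behaviour would be sensible.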

> I could see a ton of junk data coming from this when you have an initial false positive e-mail.

Unfortunately, that is already a reality. Email validation usually requires contacting the target mail server, or utilising a third-party service which also contacts the target mail server, such as EmailRep.io. However, SpiderFoot does not utilise these validation services to eliminate false positives.