smicallef / spiderfoot

SpiderFoot automates OSINT for threat intelligence and mapping your attack surface.
http://www.spiderfoot.net
MIT License

Help on usage of sfp_alienvault module needed #1401

Closed nonefaken closed 3 years ago

nonefaken commented 3 years ago

Hello!

Thank you for the SpiderFoot project! I also really like that it is actively maintained.

I have trouble getting results from the sfp_alienvault module; however, if I request the AlienVault API directly, the data is returned properly.

I'm mostly interested in this code snippet and its execution result: https://github.com/smicallef/spiderfoot/blob/29f311ccd0732709c194fcf08f054775bc24faf2/modules/sfp_alienvault.py#L211-L215

I'm not quite sure where I should configure {params}, but {qry} is scam-domain.org.

When I execute the sfp_alienvault module with scam-domain.org as input, it produces few results:

root@osint-dev-debian-s-1vcpu-2gb-fra1-01:/opt/spiderfoot# docker exec -it spiderfootd ./sf.py -d -m sfp_alienvault -s scam-domain.org
Source                          Type                                            Data
2021-09-24 13:41:09,844 [INFO] Modules enabled (3): sfp_alienvault,sfp__stor_db,sfp__stor_stdout
2021-09-24 13:41:11,954 [INFO] Scan [1A98673E] for 'scam-domain.org' initiated.
2021-09-24 13:41:12,163 [DEBUG] sfscan : Loading {len(self.__moduleList)} modules ...
2021-09-24 13:41:12,203 [INFO] sfp_alienvault module loaded.
2021-09-24 13:41:12,250 [INFO] sfp__stor_db module loaded.
2021-09-24 13:41:12,295 [INFO] sfp__stor_stdout module loaded.
2021-09-24 13:41:12,306 [DEBUG] sfscan : Scan [1A98673E] loaded 3 modules.
2021-09-24 13:41:12,414 [DEBUG] sfscan : waitForThreads() got event, ROOT, from eventQueue.
2021-09-24 13:41:12,421 [DEBUG] sfscan : waitForThreads() got event, INTERNET_NAME, from eventQueue.
2021-09-24 13:41:12,429 [DEBUG] sfscan : waitForThreads() got event, DOMAIN_NAME, from eventQueue.
2021-09-24 13:41:12,708 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got event, ROOT, from incomingEventQueue.
2021-09-24 13:41:12,710 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got event, ROOT, from incomingEventQueue.
SpiderFoot UI                   Internet Name                                   scam-domain.org
2021-09-24 13:41:12,713 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got event, INTERNET_NAME, from incomingEventQueue.
SpiderFoot UI                   Domain Name                                     scam-domain.org
2021-09-24 13:41:12,714 [DEBUG] spiderfoot.plugin : sfp_alienvault.threadWorker() got event, INTERNET_NAME, from incomingEventQueue.
2021-09-24 13:41:12,715 [DEBUG] modules.sfp_alienvault : Received event, INTERNET_NAME, from SpiderFoot UI
2021-09-24 13:41:12,718 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got event, DOMAIN_NAME, from incomingEventQueue.
2021-09-24 13:41:12,720 [DEBUG] modules.sfp__stor_db : Storing an event: ROOT
2021-09-24 13:41:12,721 [INFO] modules.sfp_alienvault : Fetching (GET): https://otx.alienvault.com/api/v1/indicators/hostname/scam-domain.org/url_list?page=1&limit=50 (proxy=None, user-agent=SpiderFoot, timeout=5, cookies=None)
2021-09-24 13:41:12,750 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got event, INTERNET_NAME, from incomingEventQueue.
2021-09-24 13:41:12,751 [DEBUG] modules.sfp__stor_db : Storing an event: INTERNET_NAME
2021-09-24 13:41:12,775 [INFO] modules.sfp_alienvault : Fetched https://otx.alienvault.com/api/v1/indicators/hostname/scam-domain.org/url_list?page=1&limit=50 (1330 bytes in 0.0578150749206543s)
2021-09-24 13:41:12,777 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got event, DOMAIN_NAME, from incomingEventQueue.
2021-09-24 13:41:12,778 [DEBUG] modules.sfp__stor_db : Storing an event: DOMAIN_NAME
2021-09-24 13:41:12,798 [DEBUG] sfscan : waitForThreads() got event, LINKED_URL_INTERNAL, from eventQueue.
2021-09-24 13:41:12,811 [DEBUG] sfscan : waitForThreads() got event, LINKED_URL_INTERNAL, from eventQueue.
2021-09-24 13:41:12,823 [DEBUG] sfscan : waitForThreads() got event, LINKED_URL_INTERNAL, from eventQueue.
2021-09-24 13:41:12,836 [DEBUG] sfscan : waitForThreads() got event, LINKED_URL_INTERNAL, from eventQueue.
sfp_alienvault                  Linked URL - Internal                           https://scam-domain.org/
2021-09-24 13:41:13,024 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
sfp_alienvault                  Linked URL - Internal                           http://scam-domain.org/
2021-09-24 13:41:13,028 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
sfp_alienvault                  Linked URL - Internal                           http://scam-domain.org
2021-09-24 13:41:13,031 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
sfp_alienvault                  Linked URL - Internal                           https://scam-domain.org
2021-09-24 13:41:13,035 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
2021-09-24 13:41:13,084 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
2021-09-24 13:41:13,085 [DEBUG] modules.sfp__stor_db : Storing an event: LINKED_URL_INTERNAL
2021-09-24 13:41:13,088 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
2021-09-24 13:41:13,089 [DEBUG] modules.sfp__stor_db : Storing an event: LINKED_URL_INTERNAL
2021-09-24 13:41:13,101 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
2021-09-24 13:41:13,102 [DEBUG] modules.sfp__stor_db : Storing an event: LINKED_URL_INTERNAL
2021-09-24 13:41:13,115 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got event, LINKED_URL_INTERNAL, from incomingEventQueue.
2021-09-24 13:41:13,117 [DEBUG] modules.sfp__stor_db : Storing an event: LINKED_URL_INTERNAL
2021-09-24 13:41:13,339 [DEBUG] spiderfoot.plugin : sfp__stor_stdout.threadWorker() got "FINISHED" from incomingEventQueue.
2021-09-24 13:41:13,386 [DEBUG] spiderfoot.plugin : sfp_alienvault.threadWorker() got "FINISHED" from incomingEventQueue.
2021-09-24 13:41:13,424 [DEBUG] spiderfoot.plugin : sfp__stor_db.threadWorker() got "FINISHED" from incomingEventQueue.
2021-09-24 13:41:13,531 [INFO] Scan [1A98673E] completed.
2021-09-24 13:41:14,463 [INFO] Scan completed with status FINISHED

I do not see api/v1/indicators/domain executed. Is it implemented?

If I request the api/v1/indicators/domain API endpoint directly:

JJ@PC testing % CALL="api/v1/indicators/domain/scam-domain.org/url_list?scam-domain.org"
JJ@PC testing % curl https://otx.alienvault.com/$CALL -H "X-OTX-API-KEY: some-api-key" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3333  100  3333    0     0   8634      0 --:--:-- --:--:-- --:--:--  8612
{
  "url_list": [
    {
      "url": "https://inpost-oreder.scam-domain.org",
      "date": "2021-09-17T16:37:13",
      "domain": "scam-domain.org",
      "hostname": "inpost-oreder.scam-domain.org",
      "result": {
        "urlworker": {
          "http_code": 0
        },
        "safebrowsing": {
          "matches": []
        }
      },
      "httpcode": 0,
      "gsb": [],
      "encoded": "https%3A//inpost-oreder.scam-domain.org"
    },
    {
      "url": "http://inpost-oreder.scam-domain.org/",
      "date": "2021-09-10T00:02:08",

Please suggest what I might be doing wrong here that keeps it from working as expected.

Thank you!

bcoles commented 3 years ago

> I'm not quite sure where I should configure {params}, but {qry} is scam-domain.org.

params is not configurable. This is set and used internally by the module.

> I do not see api/v1/indicators/domain executed. Is it implemented?

This module is intended to find links for a host name, not subdomains of a domain.

There's a queryDomainUrlList function for /api/v1/indicators/domain but this function is not used. The module does not watch for DOMAIN_NAME events.

https://github.com/smicallef/spiderfoot/blob/29f311ccd0732709c194fcf08f054775bc24faf2/modules/sfp_alienvault.py#L201-L217

The module instead uses /api/v1/indicators/hostname/{qry}/url_list in the queryHostnameUrlList function. This function is run for every INTERNET_NAME event.
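To make the difference between the two endpoints concrete, here is a minimal sketch that only builds the request URLs. The helper name `url_list_endpoint` is hypothetical and not part of the module; the path layout and the `page`/`limit` parameters are taken from the URL shown in the scan log above, and applying the same parameters to the domain endpoint is an assumption.

```python
from urllib.parse import quote, urlencode

# Base path as it appears in the scan log of this issue.
OTX_BASE = "https://otx.alienvault.com/api/v1/indicators"


def url_list_endpoint(kind, qry, page=1, limit=50):
    """Build a url_list URL for a 'hostname' or 'domain' indicator.

    Hypothetical helper for illustration only; the real module builds
    its URLs internally.
    """
    if kind not in ("hostname", "domain"):
        raise ValueError(f"unsupported indicator type: {kind!r}")
    params = urlencode({"page": page, "limit": limit})
    return f"{OTX_BASE}/{kind}/{quote(qry, safe='')}/url_list?{params}"


# Endpoint queried for INTERNET_NAME events.
hostname_url = url_list_endpoint("hostname", "scam-domain.org")
# Endpoint the reporter queried by hand with curl.
domain_url = url_list_endpoint("domain", "scam-domain.org")
```

The two URLs differ only in the indicator type segment, which is why the hostname query returns links for the exact host while the domain query can also return links on subdomains.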

Each DOMAIN_NAME event should also create an INTERNET_NAME event, as seen in your logs:

SpiderFoot UI                   Internet Name                                   scam-domain.org
SpiderFoot UI                   Domain Name                                     scam-domain.org

The INTERNET_NAME event will be sent to this module.

I took a guess at what your input domain name was and was able to replicate your results.

{'url_list': [{'url': 'https://spam-domain.org/', 'date': '2021-09-22T04:18:17', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'http_code': 0}, 'safebrowsing': {'matches': []}}, 'httpcode': 0, 'gsb': [], 'encoded': 'https%3A//spam-domain.org/'}, {'url': 'http://spam-domain.org/', 'date': '2021-09-22T04:18:17', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'http_code': 0}, 'safebrowsing': {'matches': []}}, 'httpcode': 0, 'gsb': [], 'encoded': 'http%3A//spam-domain.org/'}, {'url': 'http://spam-domain.org', 'date': '2021-09-21T05:20:52', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'ip': '172.217.14.196', 'http_code': 200}, 'safebrowsing': {'matches': []}}, 'httpcode': 200, 'gsb': [], 'encoded': 'http%3A//spam-domain.org'}, {'url': 'https://spam-domain.org', 'date': '2021-09-21T05:14:30', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'ip': '142.250.217.68', 'http_code': 200}, 'safebrowsing': {'matches': []}}, 'httpcode': 200, 'gsb': [], 'encoded': 'https%3A//spam-domain.org'}], 'page_num': 1, 'limit': 50, 'paged': True, 'has_next': False, 'full_size': 4, 'actual_size': 4}

The following change allows retrieving links for subdomains:

-                data = self.queryHostnameUrlList(eventData, page=page)
+                data = self.queryDomainUrlList(eventData, page=page)

The results:

{'url_list': [{'url': 'http://booking.spam-domain.org/', 'date': '2021-09-22T05:13:48', 'domain': 'spam-domain.org', 'hostname': 'booking.spam-domain.org', 'result': {'urlworker': {'http_code': 0}, 'safebrowsing': {'matches': []}}, 'httpcode': 0, 'gsb': [], 'encoded': 'http%3A//booking.spam-domain.org/'}, {'url': 'http://inpost-order.spam-domain.org/', 'date': '2021-09-22T05:10:06', 'domain': 'spam-domain.org', 'hostname': 'inpost-order.spam-domain.org', 'result': {'urlworker': {'http_code': 0}, 'safebrowsing': {'matches': []}}, 'httpcode': 0, 'gsb': [], 'encoded': 'http%3A//inpost-order.spam-domain.org/'}, {'url': 'https://spam-domain.org/', 'date': '2021-09-22T04:18:17', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'http_code': 0}, 'safebrowsing': {'matches': []}}, 'httpcode': 0, 'gsb': [], 'encoded': 'https%3A//spam-domain.org/'}, {'url': 'http://spam-domain.org/', 'date': '2021-09-22T04:18:17', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'http_code': 0}, 'safebrowsing': {'matches': []}}, 'httpcode': 0, 'gsb': [], 'encoded': 'http%3A//spam-domain.org/'}, {'url': 'http://allegro-order.spam-domain.org/', 'date': '2021-09-21T23:12:10', 'domain': 'spam-domain.org', 'hostname': 'allegro-order.spam-domain.org', 'result': {'urlworker': {'http_code': 0}, 'safebrowsing': {'matches': []}}, 'httpcode': 0, 'gsb': [], 'encoded': 'http%3A//allegro-order.spam-domain.org/'}, {'url': 'https://olx-order.spam-domain.org/', 'date': '2021-09-21T06:32:18', 'domain': 'spam-domain.org', 'hostname': 'olx-order.spam-domain.org', 'result': {'urlworker': {'ip': '142.251.33.68', 'http_code': 200}, 'safebrowsing': {'matches': []}}, 'httpcode': 200, 'gsb': [], 'encoded': 'https%3A//olx-order.spam-domain.org/'}, {'url': 'http://spam-domain.org', 'date': '2021-09-21T05:20:52', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'ip': '172.217.14.196', 'http_code': 200}, 
'safebrowsing': {'matches': []}}, 'httpcode': 200, 'gsb': [], 'encoded': 'http%3A//spam-domain.org'}, {'url': 'https://spam-domain.org', 'date': '2021-09-21T05:14:30', 'domain': 'spam-domain.org', 'hostname': 'spam-domain.org', 'result': {'urlworker': {'ip': '142.250.217.68', 'http_code': 200}, 'safebrowsing': {'matches': []}}, 'httpcode': 200, 'gsb': [], 'encoded': 'https%3A//spam-domain.org'}], 'page_num': 1, 'limit': 50, 'paged': True, 'has_next': False, 'full_size': 8, 'actual_size': 8}

However, I wouldn't recommend using this simple patch. It won't create new INTERNET_NAME events (and obviously breaks the existing hostname check).

You're correct that this module should also check for subdomains. I'll add it to my TODO list.
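As a rough illustration of what fuller support would have to do beyond the one-line patch, the sketch below pulls the distinct subdomain hostnames out of a domain url_list response. Those are the hosts for which new INTERNET_NAME events would be needed. The function name `subdomains_from_url_list` is hypothetical, the sample data is a trimmed copy of entries from the response above, and this is not the change that was eventually merged.

```python
def subdomains_from_url_list(data, domain):
    """Return sorted, de-duplicated hostnames under `domain`.

    `data` is a parsed url_list response (a dict with a 'url_list' key).
    The bare domain itself is excluded; only subdomains are returned.
    """
    suffix = "." + domain.lower()
    found = set()
    for entry in data.get("url_list", []):
        host = (entry.get("hostname") or "").lower().rstrip(".")
        if host.endswith(suffix):
            found.add(host)
    return sorted(found)


# Trimmed sample based on the domain endpoint response shown above.
sample = {
    "url_list": [
        {"url": "http://booking.spam-domain.org/", "hostname": "booking.spam-domain.org"},
        {"url": "http://inpost-order.spam-domain.org/", "hostname": "inpost-order.spam-domain.org"},
        {"url": "https://spam-domain.org/", "hostname": "spam-domain.org"},
        {"url": "http://spam-domain.org/", "hostname": "spam-domain.org"},
    ]
}

subdomains = subdomains_from_url_list(sample, "spam-domain.org")
```

Matching on the `"." + domain` suffix avoids treating an unrelated host such as `notspam-domain.org` as a subdomain.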

nonefaken commented 3 years ago

Understood. Thank you for reply!

bcoles commented 3 years ago

> You're correct that this module should also check for subdomains. I'll add it to my TODO list.

This has been implemented on master in #1472.