SamsungLabs / TwiTi

This is a project of "#Twiti: Social Listening for Threat Intelligence" (TheWebConf 2021)
Apache License 2.0

Does the current TwiTi version support extracting IOCs from links mentioned in a tweet? #8

Closed (a2t2 closed this issue 1 year ago)

a2t2 commented 1 year ago

I tried the following test, in which I took the sample tweet JSON data from https://developer.twitter.com/en/docs/twitter-api/v1/data-dictionary/overview and replaced the URLs so that they point to this recent article: https://www.threatfabric.com/blogs/spynote-rat-targeting-financial-institutions.html.

[
{
  "created_at": "Thu Apr 06 15:24:15 +0000 2017",
  "id_str": "850006245121695744",
  "text": "Spynote malware is on the rise. Read more at https:\/\/www.threatfabric.com\/blogs\/spynote-rat-targeting-financial-institutions.html",
  "user": {
    "id": 2244994945,
    "name": "Twitter Dev",
    "screen_name": "TwitterDev",
    "location": "Internet",
    "url": "https:\/\/dev.twitter.com\/",
    "description": "Your official source for Twitter Platform news, updates & events. Need technical help? Visit https:\/\/twittercommunity.com\/ \u2328\ufe0f #TapIntoTwitter"
  },
  "place": {   
  },
  "entities": {
    "hashtags": [      
    ],
    "urls": [
      {
        "url": "https:\/\/bit.ly\/3ZudGpZ",
        "unwound": {
          "url": "https:\/\/www.threatfabric.com\/blogs\/spynote-rat-targeting-financial-institutions.html",
          "title": "Building the Future of the Twitter API Platform"
        }
      }
    ],
    "user_mentions": [     
    ]
  }
}
]

I tried using the -E flag when running the IOC extractor, but I don't see any hashes extracted, even though the article has a list of hashes at the end.

$ python3 -m ioc_extractor -E ../twitter_data/test-4.json
$ cat extracted_iocs.json 
[{"iocs": {"hashes": {"sha1": [], "sha256": [], "md5": []}, "ips": [], "urls": {"urls": ["https://www.threatfabric.com/blogs/spynote-rat-targeting-financial-institutions.html"], "domain": []}}, "entities": [], "context": [], "externals": []}]

Is the above test the correct way to run the extractor to get IOCs from links mentioned in tweets?
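For anyone reproducing this, a quick way to confirm which IOC categories came back empty is to count the values in each field of the output. The sketch below is only an illustration: it embeds the record printed above as a string rather than reading extracted_iocs.json, and `count_iocs` is a hypothetical helper, not part of TwiTi. Reading an empty `externals` list as "no linked page was parsed" is an assumption about the output format.

```python
import json

# Output record copied verbatim from the extracted_iocs.json shown above.
SAMPLE = (
    '[{"iocs": {"hashes": {"sha1": [], "sha256": [], "md5": []}, "ips": [], '
    '"urls": {"urls": ["https://www.threatfabric.com/blogs/'
    'spynote-rat-targeting-financial-institutions.html"], "domain": []}}, '
    '"entities": [], "context": [], "externals": []}]'
)


def count_iocs(record):
    """Return the number of extracted values per IOC category (hypothetical helper)."""
    iocs = record["iocs"]
    counts = {f"hashes.{kind}": len(values) for kind, values in iocs["hashes"].items()}
    counts["ips"] = len(iocs["ips"])
    counts.update({f"urls.{kind}": len(values) for kind, values in iocs["urls"].items()})
    counts["externals"] = len(record["externals"])
    return counts


records = json.loads(SAMPLE)
for record in records:
    print(count_iocs(record))
```

In this record only `urls.urls` is non-zero, which matches the observation that the tweet's own URL was captured but nothing was extracted from the linked article.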

sole2 commented 1 year ago

Yes. The -E flag enables extracting IOCs from links mentioned in tweets, but TwiTi does not handle all links: https://github.com/SamsungLabs/TwiTi/blob/f17d5ca083f6f9f6e166dd4348b057bfd8df7d33/ioc_extractor/__init__.py#L9-L27 TwiTi handles only a selected list of external sources, both to increase the accuracy of the extracted IOCs and to avoid violating the sources' data policies. If you want to extend the list of sources, please check ioc_extractor/external_resource_parser.py.
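The general idea of restricting extraction to a selected list of sources can be sketched as a host allowlist check. This is not TwiTi's code: `SUPPORTED_HOSTS` and `is_supported_source` are hypothetical names, and the hosts in the set are placeholders rather than the sources TwiTi actually supports (see the linked lines in ioc_extractor/__init__.py for the real list).

```python
from urllib.parse import urlparse

# Hypothetical allowlist; placeholder hosts, not TwiTi's actual sources.
SUPPORTED_HOSTS = {"example-paste.test", "example-blog.test"}


def is_supported_source(url, supported_hosts=SUPPORTED_HOSTS):
    """Return True if the URL's host (ignoring a leading 'www.') is allowlisted."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host in supported_hosts


url = "https://www.threatfabric.com/blogs/spynote-rat-targeting-financial-institutions.html"

# With the placeholder allowlist the article would be skipped.
print(is_supported_source(url))

# After adding the host to the allowlist, the link would be eligible for parsing.
print(is_supported_source(url, SUPPORTED_HOSTS | {"threatfabric.com"}))
```

Under this kind of design, a link whose host is not on the list is skipped entirely, which would explain an empty hash list for an unsupported article. Supporting a new site would also need a parser for that site's page layout, not just an allowlist entry.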