Open dspruell-i01 opened 4 weeks ago
Clarifying notes from external discussion:
ThreatIngestor logs suggest that no extraction was performed for that source at all after the post was published. In other words, the logs point to extraction never running, not to indicators being extracted and then failing to be ingested.
@pedramamini asked this:
The IOCs are not IN the feed correct? threatingestor never went out to the site to fetch the content. this isn't a bug, but rather a feature need.
Response:
The feed carries the post content in the <content:encoded> tag, and that includes the IOCs.
@pedramamini asked this:
How are you loading this site? https://feeds.feedburner.com/feedburner/Talos it has a bunk SSL cert.
Response:
This issue is marked as blocking because automated OSINT collection is partially broken, potentially to a significant degree. It has been for some time (first encountered roughly two years ago, possibly longer).
We notice that some OSINT-collected indicators are not currently showing up in our intelligence data stores. We have observed this historically as well: on numerous occasions, indicators we know are present in the RSS or Sitemap feeds we collect did not show up in our collections. A previous Engineering resource, Trevor, started diagnosing this and reported seeing indicators being extracted but never reaching outputs. In this case, the indicators do not appear to be extracted at all, and therefore never reach the outputs where they should appear.
Example
Source blog post:
2024-10-22 https://blog.talosintelligence.com/gophish-powerrat-dcrat/
Feed is configured in ThreatIngestor:
Feed is verified to be functional, and the target post is found in the feed content:
This sample indicator is listed in the post:
94[.]103[.]85[.]47 (94.103.85.47)
We would expect ThreatIngestor to extract this indicator, refang it, and send it to the configured output(s).
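To double-check that the indicator is recoverable from the feed content itself, a minimal sketch along these lines can be run against a saved copy of the feed. It uses only the Python standard library and a simple regex rather than ThreatIngestor's own extraction code. The helper names `extract_ipv4` and `ips_in_feed` and the sample XML are illustrative assumptions, not part of ThreatIngestor or the real feed.

```python
import re
import xml.etree.ElementTree as ET

# Namespaced name of the RSS <content:encoded> element.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}encoded"

# Matches IPv4 addresses written with either "." or the defanged "[.]".
IPV4_RE = re.compile(r"\b\d{1,3}(?:(?:\[\.\]|\.)\d{1,3}){3}\b")


def extract_ipv4(text):
    """Return refanged IPv4 addresses found in text (hypothetical helper)."""
    found = []
    for match in IPV4_RE.finditer(text):
        ip = match.group(0).replace("[.]", ".")
        # Keep only addresses whose octets are all in the 0-255 range.
        if all(int(octet) <= 255 for octet in ip.split(".")) and ip not in found:
            found.append(ip)
    return found


def ips_in_feed(feed_xml):
    """Map each item's link to the IPs found in its <content:encoded> body."""
    root = ET.fromstring(feed_xml)
    results = {}
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        body = item.findtext(CONTENT_NS, default="")
        results[link] = extract_ipv4(body)
    return results


# Made-up sample feed for illustration; substitute a saved copy of the real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <link>https://blog.talosintelligence.com/gophish-powerrat-dcrat/</link>
      <content:encoded><![CDATA[<p>94[.]103[.]85[.]47</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

print(ips_in_feed(SAMPLE_FEED))
```

If a check like this finds the indicator in the saved feed while ThreatIngestor's outputs do not contain it, that supports the view that the problem is on the collection side rather than in the feed content.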
We have verified that extraction from the configured feed is historically functional:
However, note that the above logs show the most recent extractions were performed on 2024-10-09 and 2024-10-10. The post was published on 2024-10-22. No extraction has been performed for this configured source since publication (neither on 2024-10-22 nor on 2024-10-23, the day this issue is being reported).
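The timing argument can be sketched as a small check. The function name `extracted_since` and the idea of feeding it dates pulled from the logs are assumptions for illustration; the dates are the ones cited in this issue.

```python
from datetime import date


def extracted_since(extraction_dates, published):
    """Return True if any extraction ran on or after the publication date."""
    return any(d >= published for d in extraction_dates)


# Dates cited in this issue: extractions seen in the logs, and the post date.
last_extractions = [date(2024, 10, 9), date(2024, 10, 10)]
post_published = date(2024, 10, 22)

print(extracted_since(last_extractions, post_published))
```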
Looking at the configured outputs, it appears that the _rsstalos source is configured to output to ThreatKB (C2 Feed):
However, we can confirm that when we checked ThreatKB for the target indicator (94.103.85.47), it was not found. We added it manually.
We can also confirm that the indicator was not ingested from this source and routed to a different indicator store instead. When queried on 2024-10-23, the indicator had only been ingested from another source (Recorded Future) and routed to TIDB. It was not collected from the Talos feed.