ZeroDot1 closed 9 months ago
See discussion in previous issue: Extract domain names without URI scheme
I'd say the earlier issue (https://github.com/InQuest/iocextract/issues/25) still applies to this request. For now, I think the custom regex route is the best way to achieve this. We do have URL extraction, which may catch some domains depending on how they're structured, but it would most likely miss a lot of valuable ones. I would recommend experimenting with different expressions to match the type of data you're extracting and plugging them into iocextract through the custom regex option: https://inquest.readthedocs.io/projects/iocextract/en/latest/#custom-regex
There is actually an example in the documentation that shows a way to extract domains from ingested URLs. There will still be some trial and error, but it should be enough to get you started.
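As a starting point for that trial and error, here is a minimal sketch using only Python's standard `re` module. The pattern and the `extract_domains` helper are my own illustration, not something shipped with iocextract. It uses a single capture group, since as far as I can tell from the linked docs that is what the custom regex option expects, and it allows TLDs of 2 to 63 letters.

```python
import re

# Hypothetical pattern, not taken from iocextract. One capture group holds a
# bare hostname such as "sub.example.com". Each label is 1-63 characters and
# the TLD is 2-63 letters, so longer extensions like ".photography" match too.
DOMAIN_PATTERN = (
    r"(?<![\w.-])"                                  # not in the middle of a word or host
    r"((?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+"  # one or more labels, each ending in a dot
    r"[a-z]{2,63})"                                 # alphabetic TLD
    r"(?![\w-])"                                    # TLD must end here
)
DOMAIN_RE = re.compile(DOMAIN_PATTERN, re.IGNORECASE)


def extract_domains(text):
    """Return unique lowercase hostnames in order of first appearance."""
    seen = []
    for match in DOMAIN_RE.finditer(text):
        host = match.group(1).lower()
        if host not in seen:
            seen.append(host)
    return seen


sample = (
    "Contact evil.example.com or visit "
    "http://cdn.bad-site.photography/path and mail admin@corp.example.org."
)
print(extract_domains(sample))
```

Expect false positives: anything shaped like `name.ext` (for example `report.pdf`) will match, so the pattern needs tuning for your data. If it works for you, the same `DOMAIN_PATTERN` string should be usable through the custom regex option. I believe the entry point is `iocextract.extract_custom_iocs(data, regex_list)`, but check the linked docs for the exact signature.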
Sometimes it is necessary to extract just the domains, or the domains together with their subdomains.
And a question: are the newer, longer domain extensions included?
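For the domain versus subdomain split, a rough sketch of one way to separate the two once you have a hostname. The `split_host` helper and the `MULTI_LABEL_SUFFIXES` set are hypothetical and not part of iocextract. The set is only a tiny sample. A real tool would need the full Public Suffix List to get cases like `.co.uk` right. Because the split works on dots, the length of the extension does not matter here.

```python
# Tiny illustrative sample of multi-label public suffixes (assumption: a real
# tool would load the full Public Suffix List instead of this set).
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def split_host(host):
    """Split a hostname into (subdomain, registered_domain)."""
    labels = host.lower().rstrip(".").split(".")
    # Keep three labels when the last two form a known multi-label suffix,
    # otherwise keep the last two (name + TLD).
    keep = 3 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 2
    if len(labels) < keep:
        return "", ".".join(labels)
    return ".".join(labels[:-keep]), ".".join(labels[-keep:])


print(split_host("cdn.bad-site.photography"))
print(split_host("mail.corp.example.co.uk"))
```

The first element is empty when the hostname has no subdomain, so you can keep either the registered domain alone or the full hostname depending on what you need.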