liquidsec closed this 1 year ago
We can fix this by overriding the _data_id() method on URL_UNVERIFIED events to include spider-danger. The _data_id() method returns the event data that's used to calculate the id hash.
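A rough sketch of the idea, using toy stand-in classes rather than the real event classes. The class names, the SHA-1 hash, and the way the tag is prefixed to the data are all assumptions made for illustration; only _data_id(), URL_UNVERIFIED, and spider-danger come from the discussion above.

```python
import hashlib


class Event:
    """Toy stand-in for a base event (hypothetical, not the project's class)."""

    def __init__(self, data, tags=None):
        self.data = data
        self.tags = set(tags or [])

    def _data_id(self):
        # Data that feeds the id hash; by default just the event data.
        return self.data

    @property
    def id(self):
        # Assumption: the id is a hash of whatever _data_id() returns.
        return hashlib.sha1(self._data_id().encode("utf-8")).hexdigest()


class URLUnverified(Event):
    """Toy stand-in for a URL_UNVERIFIED event."""

    def _data_id(self):
        data = super()._data_id()
        # Fold the tag into the hashed data so a tagged URL and an
        # untagged copy of the same URL no longer share an id.
        if "spider-danger" in self.tags:
            data = "spider-danger" + data
        return data


tagged = URLUnverified("http://example.com/page", tags=["spider-danger"])
untagged = URLUnverified("http://example.com/page")
print(tagged.id != untagged.id)
```

With this kind of override, the tagged and untagged versions of the same URL hash to different ids, so one would not be dropped as a duplicate of the other. The trade-off is that the same URL can now appear twice downstream.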
We would need to notify @SpamFaux because this might create some duplicate URLs in his Neo4j data.
If excavate finds a URL_UNVERIFIED and adds the spider-danger tag, and FFUF (or another module) finds the same URL_UNVERIFIED, it will be marked as a dupe and will not be visited by HTTPX.