By default, ignore all local and LAN addresses: localhost, localhost-ip6, 127.0.0.1, 192.168.5.25, 10.13.37.96, etc. Disable the filter either by removing them from the global ignores, or perhaps with a flag, --no-ignore-lan (reason for the default: possible attacks, data leaks).
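A rough sketch of what that default filter could look like, using only Python's standard library. The names `is_lan_host`, `should_skip`, and the `ignore_lan` parameter are made up for illustration (the latter standing in for the proposed --no-ignore-lan flag), and the hostname set is an assumption, not an agreed-upon global ignore list. It only looks at the literal host in the URL; a public name that resolves to a LAN address would need a check after DNS resolution.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed set of local-only hostnames; the real global ignores may differ.
LOCAL_HOSTNAMES = {"localhost", "localhost-ip6", "ip6-localhost"}


def is_lan_host(host: str) -> bool:
    """Return True if host is a local name or a loopback/private/link-local IP literal."""
    host = host.strip("[]").lower().rstrip(".")
    if host in LOCAL_HOSTNAMES or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: an ordinary hostname. DNS is not consulted here.
        return False
    return addr.is_loopback or addr.is_private or addr.is_link_local


def should_skip(url: str, ignore_lan: bool = True) -> bool:
    """Hypothetical hook: ignore_lan=False mimics passing --no-ignore-lan."""
    host = urlsplit(url).hostname or ""
    return ignore_lan and is_lan_host(host)
```

With the default setting, `should_skip("http://192.168.5.25/admin")` is true, while `should_skip("http://127.0.0.1/", ignore_lan=False)` lets the URL through.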
Probably not by default? Get a list of valid TLDs, and ignore everything else. (knock knock, it's your ISP's DNS)
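The TLD idea could be sketched like this. `VALID_TLDS` is a tiny stand-in set chosen for the example; a real run would presumably load the full list that IANA publishes (lowercased), and `has_valid_tld` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Stand-in for the real TLD list; an assumption for this example only.
VALID_TLDS = {"com", "org", "net", "dev"}


def has_valid_tld(url: str, valid_tlds=VALID_TLDS) -> bool:
    """Return True only if the URL's host ends in a known TLD."""
    # urlsplit lowercases the hostname; drop a trailing root dot if present.
    host = (urlsplit(url).hostname or "").rstrip(".")
    if "." not in host:
        # Single-label names such as "intranet" have no TLD at all.
        return False
    tld = host.rsplit(".", 1)[1]
    # IPv4 literals fall out here too, since "25" or "96" is not a TLD.
    return tld in valid_tlds
```

Here `has_valid_tld("https://example.com/")` passes, while `http://router.lan/` and `http://intranet/` are ignored.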
hmmm
time for an exploit-a-crawler CTF, where entity dislikes scraper, but scraper go brrr; scraped content goes upload, entity goes time-for-court, and once you're in American courts, you have already lost