ArchiveTeam / grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Ignore local/lan-only hosts (and invalid domains). #184

Open jtagcat opened 3 years ago

jtagcat commented 3 years ago

Requests to localhost, hmmm.

Time for an exploit-a-crawler CTF, where an entity dislikes the scraper, but scraper go brrr; scraped content goes upload, entity goes time-for-court, and once you're in American courts, you have already lost.
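
To make the request in the title concrete, here is a minimal sketch of a host check that a crawler could run before fetching a URL. It is not grab-site's code: the function name `is_local_or_invalid_host` and the `LOCAL_SUFFIXES` list are assumptions made up for illustration, and it only inspects the hostname string, so a public-looking name that resolves to a private address would still get through.

```python
import ipaddress

# Hypothetical suffix list (an assumption, not taken from grab-site):
# names commonly used for local or LAN-only hosts.
LOCAL_SUFFIXES = (".local", ".localhost", ".lan", ".internal", ".home.arpa")


def is_local_or_invalid_host(host: str) -> bool:
    """Return True if the host looks local, LAN-only, or is not a valid domain.

    Hypothetical helper for illustration only. It does no DNS lookup.
    """
    host = host.strip().rstrip(".").lower()
    if not host:
        return True

    # Literal IP addresses (IPv6 literals appear in brackets inside URLs).
    try:
        ip = ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        ip = None
    if ip is not None:
        return (
            ip.is_loopback
            or ip.is_private
            or ip.is_link_local
            or ip.is_unspecified
        )

    # Well-known local names and suffixes.
    if host == "localhost" or host.endswith(LOCAL_SUFFIXES):
        return True

    # Single-label names (no dot) are treated as intranet names here.
    if "." not in host:
        return True

    # Basic label validation: no empty labels, length limit, no edge hyphens.
    labels = host.split(".")
    if any(
        not label
        or len(label) > 63
        or label.startswith("-")
        or label.endswith("-")
        for label in labels
    ):
        return True

    # A final label made only of digits is not a valid top-level domain
    # (this also catches malformed IPv4 literals such as 999.1.1.1).
    return labels[-1].isdigit()
```

A crawler would presumably call something like this on each extracted link's hostname and skip the URL when it returns True. Catching names that resolve to loopback or private addresses would need a second check on the resolved IP at connect time.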