ArchiveTeam / grab-site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

grab-site users please upgrade for important dashboard fix #101

Closed ivan closed 7 years ago

ivan commented 7 years ago

By default, Chrome (and probably Firefox) performs DNS prefetching on all links mentioned on a page, and the grab-site dashboard is a page with a lot of links. grab-site 1.2.1 and earlier did not opt out of this browser mechanism; grab-site 1.2.2 and newer opt out of it with <meta http-equiv="x-dns-prefetch-control" content="off">. Upgrading should avoid flooding DNS resolvers with useless lookups, improve your privacy, and reduce power consumption.
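If you want to confirm that the dashboard HTML you are serving actually contains the opt-out tag, a check along these lines may help. This is only a sketch using Python's standard library; the names `PrefetchControlFinder` and `opts_out_of_dns_prefetch` are made up for illustration and are not part of grab-site, and how you obtain the dashboard HTML (saved file, HTTP request) is left to you.

```python
from html.parser import HTMLParser


class PrefetchControlFinder(HTMLParser):
    """Collects the content value of every x-dns-prefetch-control meta tag."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("http-equiv", "").lower() == "x-dns-prefetch-control":
            self.values.append(attr_map.get("content", "").strip().lower())


def opts_out_of_dns_prefetch(html):
    """Return True if the page carries the DNS prefetch opt-out meta tag."""
    finder = PrefetchControlFinder()
    finder.feed(html)
    return "off" in finder.values


if __name__ == "__main__":
    # Hypothetical sample page, not the real dashboard markup.
    sample = (
        '<html><head>'
        '<meta http-equiv="x-dns-prefetch-control" content="off">'
        '</head><body><a href="http://example.com/">link</a></body></html>'
    )
    print(opts_out_of_dns_prefetch(sample))
```

This only inspects the markup; it does not prove that a given browser honors the tag, which is what the chrome://net-internals/#dns check is for.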

I tested 0c43dc11f3d6cec97800e5eaf9d3dcefeb2eec1a in Chrome 59 on Linux by looking at chrome://net-internals/#dns before and after the fix. Before: the host in every URL in the dashboard would get a DNS lookup; after: no DNS lookups.

Thanks to @HarryC145 for reporting this behavior, who in turn thanks @madyoda for helping diagnose it.