A project to collect, archive, and publish robots.txt files from across the internet - with a focus on government websites
Display / link to Internet Archive for URL prefix that has robots.txt entry #5
Closed
nrjones8 closed 4 years ago
For example, if there's a
Disallow: /foia/quarterly/*
entry, then link to that prefix in the Wayback Machine, e.g. https://web.archive.org/web/*/https://turbotax.intuit.com/lp/*. See also https://github.com/internetarchive/wayback/tree/master/wayback-cdx-server
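
A rough sketch of how the link could be built from a robots.txt file, in case it helps. It is not taken from this repo: the function names `disallow_paths` and `wayback_prefix_url` and the site `https://example.gov` are made up for illustration. The only thing it relies on is the `https://web.archive.org/web/*/<prefix>*` URL pattern shown above.

```python
# Hypothetical helper names; only the Wayback URL pattern comes from the issue text.
WAYBACK_PREFIX = "https://web.archive.org/web/*/"


def disallow_paths(robots_txt):
    """Return the path of every non-empty Disallow rule, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow means "allow everything"
                paths.append(value)
    return paths


def wayback_prefix_url(site, path):
    """Build a Wayback Machine prefix-search link for one Disallow path."""
    # Keep only the literal part of the rule, before the first wildcard,
    # and drop a trailing "$" end-of-URL anchor if present.
    prefix = path.split("*", 1)[0].rstrip("$")
    return f"{WAYBACK_PREFIX}{site.rstrip('/')}{prefix}*"


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /foia/quarterly/*\nDisallow:\n"
    for path in disallow_paths(robots):
        # "https://example.gov" is a placeholder site, not a real target.
        print(wayback_prefix_url("https://example.gov", path))
```

This only handles the simple case where the literal prefix comes before the first `*`; rules with a wildcard in the middle are truncated at that wildcard, so the resulting link is broader than the rule itself.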