tomnomnom / waybackurls

Fetch all the URLs that the Wayback Machine knows about for a domain
3.41k stars, 455 forks

Update Common Crawl index #30

Closed (dsaxton closed 2 years ago)

dsaxton commented 2 years ago

Thanks @tomnomnom for the excellent tool. I noticed that the Common Crawl index currently in use is a bit out of date, so this updates it to the most recent one (the newer index contains 3.15 billion URLs, the older 2.75 billion):

https://commoncrawl.org/2018/06/may-2018-crawl-archive-now-available/
https://commoncrawl.org/2021/08/july-august-2021-crawl-archive-available/
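
For anyone following along, here is a minimal sketch of what switching indexes could look like in Go. It is not the tool's actual code: the function name `commonCrawlQueryURL` is hypothetical, and the crawl IDs `CC-MAIN-2018-22` and `CC-MAIN-2021-31` are my assumption of the identifiers matching the two announcements above, so check them against the Common Crawl index list before relying on them.

```go
package main

import "fmt"

// commonCrawlQueryURL is a hypothetical helper that builds a query URL for
// the public Common Crawl index server. crawlID selects which crawl's index
// is searched; changing it is the whole of the "update" discussed above.
func commonCrawlQueryURL(crawlID, domain string, noSubs bool) string {
	// Match subdomains as well unless the caller opts out.
	subsWildcard := "*."
	if noSubs {
		subsWildcard = ""
	}
	return fmt.Sprintf(
		"https://index.commoncrawl.org/%s-index?url=%s%s/*&output=json",
		crawlID, subsWildcard, domain,
	)
}

func main() {
	// Assumed crawl IDs for the May 2018 and July/August 2021 crawls.
	const oldCrawl = "CC-MAIN-2018-22"
	const newCrawl = "CC-MAIN-2021-31"

	fmt.Println(commonCrawlQueryURL(oldCrawl, "example.com", false))
	fmt.Println(commonCrawlQueryURL(newCrawl, "example.com", false))
}
```

Keeping the crawl ID as a parameter rather than a literal inside the URL string would make future index bumps a one-line change.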