JustinBeckwith / linkinator

🐿 Scurry around your site and find all those broken links.
MIT License

Crawl entire site from sitemap when available (faster than recursive crawling) #346

Open antoniancu opened 2 years ago

antoniancu commented 2 years ago

I suspect that checking an entire site would be faster if the link checker could run against the set of pages listed in the sitemap, when one is available, instead of discovering pages through recursive crawling.

One anticipated complication is very large sites with paginated sitemaps: sitemap.xml?page=1, sitemap.xml?page=2, etc.
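For illustration, here is a rough sketch of how paginated sitemaps might be flattened into a single list of pages. It assumes the pagination is exposed as a sitemap index file (`<sitemapindex>` in the sitemaps.org protocol) whose `<loc>` entries point at the individual pages such as `sitemap.xml?page=1`. The names `parseSitemap`, `collectPageUrls`, and `fetchText` are hypothetical and not part of linkinator. The regex-based parsing is a shortcut for well-formed sitemaps, not a general XML parser (it ignores CDATA, for example).

```typescript
type ParsedSitemap = { kind: "index" | "urlset"; locs: string[] };

// Extract every <loc> value and report whether the document is a sitemap index.
function parseSitemap(xml: string): ParsedSitemap {
  const kind = /<sitemapindex[\s>]/i.test(xml) ? "index" : "urlset";
  const locs: string[] = [];
  const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/gi;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    // Sitemap URLs are XML-escaped, so "&amp;" has to become "&" again.
    locs.push(match[1].replace(/&amp;/g, "&"));
  }
  return { kind, locs };
}

// Walk a sitemap (or sitemap index) breadth-first and return the page URLs.
// `fetchText` is injected by the caller, e.g. a thin wrapper around fetch().
async function collectPageUrls(
  rootUrl: string,
  fetchText: (url: string) => Promise<string>,
): Promise<string[]> {
  const seen = new Set<string>(); // sitemap files already fetched
  const pages = new Set<string>(); // de-duplicated page URLs
  const queue: string[] = [rootUrl];
  while (queue.length > 0) {
    const url = queue.shift() as string;
    if (seen.has(url)) continue;
    seen.add(url);
    const { kind, locs } = parseSitemap(await fetchText(url));
    if (kind === "index") {
      queue.push(...locs); // child sitemaps still need to be fetched
    } else {
      locs.forEach((loc) => pages.add(loc));
    }
  }
  return Array.from(pages);
}
```

Tracking the fetched sitemap URLs in `seen` guards against an index that references itself, which would otherwise loop forever.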

antoniancu commented 2 years ago

I solved this by passing an array of pages from sitemapper. Still, it would be neat to have this logic built in.
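For anyone landing here, a minimal sketch of that workaround might look like the following. It is not the exact code I used. The helper `checkSitemapPages` and the `CheckerLike` and `LinkLike` types are hypothetical stand-ins, written structurally so the sketch does not depend on exact package typings. The wiring shown in the trailing comment assumes that sitemapper's `fetch()` resolves to an object with a `sites` array and that linkinator's `check()` accepts an array for `path`.

```typescript
// Structural stand-ins for the parts of the link checker this sketch relies on.
interface LinkLike {
  url: string;
  state: string; // assumed to be "OK", "BROKEN" or "SKIPPED"
}
interface CheckerLike {
  check(options: { path: string | string[]; recurse?: boolean }): Promise<{
    passed: boolean;
    links: LinkLike[];
  }>;
}

// Hypothetical helper: check exactly the pages the sitemap lists and return
// the URLs of the links reported as broken.
async function checkSitemapPages(pages: string[], checker: CheckerLike): Promise<string[]> {
  if (pages.length === 0) return [];
  // recurse: false, because the sitemap already enumerates the pages to scan.
  const result = await checker.check({ path: pages, recurse: false });
  return result.links.filter((link) => link.state === "BROKEN").map((link) => link.url);
}

// Assumed wiring with the real packages (not executed here):
//   import Sitemapper from "sitemapper";
//   import { LinkChecker } from "linkinator";
//   const { sites } = await new Sitemapper().fetch("https://example.com/sitemap.xml");
//   const broken = await checkSitemapPages(sites, new LinkChecker());
```

Turning recursion off is the point of the exercise: each page is fetched once and only the links on it are checked, so the crawl does not have to rediscover the site structure.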

JustinBeckwith commented 2 years ago

Interesting! I have to go do some learnin' about sitemaps again.