Closed danreeves closed 7 years ago
simplecrawler has a very lenient URL discovery mechanism designed to find as many URLs as possible. As a result, it ends up queueing and crawling strings that aren't actually URLs, which can be really slow on large sites.
See here for how to fix: https://github.com/cgiffard/node-simplecrawler#link-discovery
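For illustration, here is a rough sketch of a stricter discovery function. It assumes the `discoverResources(buffer, queueItem)` override described in the linked README section, and it only returns `href` values from `<a>` tags. The helper name `discoverAnchorLinks` is made up for this example, and the regex stands in for a proper HTML parser just to keep the snippet dependency-free. It is not necessarily what the actual fix does.

```javascript
// Hypothetical helper: return only the href values of <a> tags,
// instead of every URL-like string found in the page.
function discoverAnchorLinks(buffer) {
  const html = buffer.toString("utf8");
  // Matches <a ... href="..."> or <a ... href='...'>; requiring whitespace
  // before href skips attributes such as data-href.
  const hrefPattern = /<a\b[^>]*?\shref\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  const links = [];
  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    links.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return links;
}

// Wiring it up (assumes an existing simplecrawler instance named `crawler`):
// crawler.discoverResources = discoverAnchorLinks;
```

Strings inside script tags or HTML comments are ignored unless they happen to contain a literal `<a href=...>`, so the queue should stay much closer to the real link graph of the site.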
/cc @urlsangel
Fix is in bc92f0994d42d38049cdc1f98571d5a4cb7ed20f and 5da6299deb0a6ef585acfc66d1877d85010734ad