ChrisWren / grunt-link-checker

Run node-simple-crawler to discover broken links on your website
MIT License

Fatal error: Maximum call stack size exceeded #31

Open nvaken opened 8 years ago

nvaken commented 8 years ago

Probably because the site that I'm crawling has a pretty high number of resources. Still, I wonder whether this is preventable. Am I overlooking an option here?

jejernig commented 8 years ago

Bumping for the same issue. Crawling a SharePoint site with tons of links.

infomongo commented 8 years ago

Same issue for me. This happened when the site being tested added a large (9 MB) video. So for me, I don't think it is the number of resources, but the size.

If there is no fix/workaround, I'm gonna have to stop using the link checker.

infomongo commented 8 years ago

The config options are listed at https://github.com/cgiffard/node-simplecrawler#configuration but none of them allows me to fix or work around my issue.

Seems like one of these should do it, but I can't get them to work:

crawler.maxResourceSize=16777216 - The maximum resource size that will be downloaded, in bytes. Defaults to 16MB.

I tried maxResourceSize values from 2MB to 32MB and got no difference in behavior. Similarly, downloadUnsupported: false has no effect.
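Written out as a JavaScript Gruntfile fragment, the settings tried above would look roughly like this. This is only a sketch: it assumes grunt-link-checker passes these options through to simplecrawler, and the names `MB` and `attemptedOptions` are illustrative, not part of either library.

```javascript
// Bytes per megabyte, as used by the 16MB default quoted above
// (16 * 1024 * 1024 = 16777216).
var MB = 1024 * 1024;

// Illustrative name; only the two option keys come from the comment above.
var attemptedOptions = {
  // Values from 2 MB up to 32 MB were tried.
  maxResourceSize: 2 * MB,
  // Ask the crawler not to download resource types it does not support.
  downloadUnsupported: false
};
```

As reported in this comment, neither setting changed the behavior.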

Doesn't seem to be a config option to ignore some file types, unless there is a way to use Fetch Conditions. Not clear this is possible.

Probably going to stop using this :(

infomongo commented 8 years ago

I was able to fix this by using fetch conditions to ignore the movie that was causing the problem. My Gruntfile (CoffeeScript) looks like this:

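linkChecker:
  build:
    site: 'localhost'
    options:
      initialPath: '/site-dir.html'
      maxConcurrency: 20
      initialPort: 8000
      # MIME types the crawler will scan for links
      supportedMimeTypes: [/text\/html/, /text\/css/]
      # The callback receives the crawler instance
      callback: (crawler) =>
        # Skip any URL whose path ends in .mp4 (case-insensitive)
        crawler.addFetchCondition((url) =>
          return !url.path.match(/\.mp4$/i)
        )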
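For Gruntfiles written in plain JavaScript, the same workaround can be sketched as below, generalized to several file extensions. The helper `makeExtensionFilter`, the variable names, and the extension list are illustrative assumptions, not part of grunt-link-checker or simplecrawler. The only library call relied on is `crawler.addFetchCondition`, and the sketch assumes the condition receives an object with a `path` property, as in the CoffeeScript config above.

```javascript
// Hypothetical helper: builds a fetch condition that rejects URLs whose
// path ends in one of the given extensions (case-insensitive).
function makeExtensionFilter(extensions) {
  // Escape regex metacharacters so each extension is matched literally.
  var escaped = extensions.map(function (ext) {
    return ext.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  });
  var pattern = new RegExp('\\.(' + escaped.join('|') + ')$', 'i');

  // Return true to fetch the resource, false to skip it.
  return function (url) {
    return !pattern.test(url.path);
  };
}

// The extension list is an assumption; adjust it to the media your site serves.
var skipLargeMedia = makeExtensionFilter(['mp4', 'webm', 'mov']);

// Task config mirroring the CoffeeScript example above.
var linkCheckerConfig = {
  build: {
    site: 'localhost',
    options: {
      initialPath: '/site-dir.html',
      initialPort: 8000,
      callback: function (crawler) {
        crawler.addFetchCondition(skipLargeMedia);
      }
    }
  }
};
```

In a Gruntfile, `linkCheckerConfig` would be passed as the `linkChecker` key of `grunt.initConfig`. Like the original pattern, the match is anchored at the end of the path, so a path with anything after the extension is not filtered.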

nvaken commented 8 years ago

I can't check this right now, but I'm pretty sure my original error isn't caused by one big resource. The projects I'm checking shouldn't have resources bigger than, say, ~5 MB, and even that would be an anomaly. Your fix seems to do the job for specific big resources (which is good to have in here! 👍), but it probably won't fix my original issue.

So, that being said, I'm still looking for answers. 😊