Veejay / krolla

Mixed Content detection using Node.js and Puppeteer

Crawler seems slow #3

Open ashfame opened 6 years ago

ashfame commented 6 years ago

Is it just me, or does the crawler seem slow even with 16 workers?

I imagine it's slow because the browser renders each whole page before doing anything with it, rather than just parsing the HTML, but I thought I would check in about it anyway.

Veejay commented 6 years ago

Hi,

Yeah it's definitely not fast. There are two main reasons for that:

  1. I built this as an exercise, mostly for fun, with no thought given to performance, usability, or anything else. It's highly likely that it doesn't handle all sorts of corner cases I didn't think of.
  2. This is not really a crawler in the classical sense, because it uses a full Chromium instance under the hood.
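For anyone who wants to experiment with the rendering cost mentioned in the second point, here is a minimal sketch of one possible approach. It is not taken from krolla's source. It assumes Puppeteer's request interception (`page.setRequestInterception`, `request.resourceType`, `request.abort`, `request.continue`) and records each insecure request before aborting heavy resource types so Chromium never downloads them. The names `isMixedContent`, `watchPage`, and `HEAVY_TYPES` are hypothetical, and whether this actually speeds things up, or changes what the browser reports, would need to be measured.

```javascript
// Hypothetical sketch, not krolla's actual code.
// Resource types assumed to be expensive and unnecessary once their URL is known.
const HEAVY_TYPES = new Set(['image', 'media', 'font']);

// True when a page served over HTTPS requests a plain-HTTP resource.
function isMixedContent(pageUrl, requestUrl) {
  try {
    return new URL(pageUrl).protocol === 'https:' &&
           new URL(requestUrl).protocol === 'http:';
  } catch (err) {
    return false; // unparsable URL: treat as not mixed content
  }
}

// `page` is expected to be a Puppeteer Page; `onHit` receives each insecure URL.
async function watchPage(page, pageUrl, onHit) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    const url = request.url();
    if (isMixedContent(pageUrl, url)) {
      onHit(url); // record the URL before deciding what to do with the request
    }
    if (HEAVY_TYPES.has(request.resourceType())) {
      request.abort(); // skip the download; the URL has already been inspected
    } else {
      request.continue();
    }
  });
}
```

With a real browser, `watchPage` would be called on a page from `browser.newPage()` before `page.goto(url)`. Aborting requests can change how some pages behave, so treat this as a starting point rather than a drop-in fix.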

Feel free to clone this and do what you want with it. Bumping the version of Puppeteer should just be a matter of modifying the package.json file.