ruipgil / scraperjs

A complete and versatile web scraper.
MIT License

Requiring scraper prevents capturing process signals #47

Closed (erikhazzard closed this 8 years ago)

erikhazzard commented 8 years ago

Not sure if this should be considered a bug or not, but it should be documented somewhere. After scraperjs is required, callbacks registered with process.on(SIGNAL, callback) are never called. This makes it impossible to hook into process termination to, for example, send logs somewhere or run any cleanup before the process exits.

It is easily reproducible. Here are four examples (just run each script and kill the process): https://gist.github.com/enoex/84dca1dae510c671a537
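
For readers who cannot open the gist, here is a minimal sketch of the kind of signal hook being described. It is not the gist's contents: the `onSignals` helper name is made up for illustration, and the `require('scraperjs')` line is left commented out so the snippet runs without the package installed.

```javascript
// Uncomment to try reproducing the issue (assumes scraperjs is installed):
// const scraperjs = require('scraperjs');

// Hypothetical helper (not part of scraperjs or Node): registers one
// callback for several signals and returns the listeners it attached.
function onSignals(signals, callback) {
  const listeners = [];
  for (const signal of signals) {
    // Close over the signal name so the callback always knows which one fired.
    const listener = () => callback(signal);
    process.on(signal, listener);
    listeners.push({ signal, listener });
  }
  return listeners;
}
```

In a long-running script, something like `onSignals(['SIGINT', 'SIGTERM'], cleanup)` followed by killing the process would normally invoke `cleanup`. The report above is that, once scraperjs has been required, such callbacks never run. Note that registering a handler for these signals removes Node's default exit behavior, so `cleanup` would need to call `process.exit()` itself.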

Not sure that this needs to be solved by scraperjs itself, since one workaround is to just create a script and run it as a child process from a server, but it's a side effect that should be documented. Thanks!

ruipgil commented 8 years ago

This is really interesting. Something tells me it may have to do with phantom-node, since they do some crazy stuff over there.

I'll investigate.

ruipgil commented 8 years ago

This is resolved on phantom v0.8.1.