taganaka / polipus

Polipus: distributed and scalable web-crawler framework
MIT License
92 stars 32 forks

Support for headless crawling #66

Open sandeepravi opened 9 years ago

sandeepravi commented 9 years ago

Does it make sense to have support for headless crawling built into the framework? Many websites these days are single-page apps, and the current framework can't crawl them because their content is rendered client-side with JavaScript.

We could try to do this using PhantomJS or capybara-webkit. I've been able to do a headless crawl using capybara-webkit and Poltergeist before.

What are your thoughts on this?