nrabinowitz / pjscrape

A web-scraping framework written in Javascript, using PhantomJS and jQuery
http://nrabinowitz.github.io/pjscrape/
MIT License
996 stars, 159 forks

Infinite Scroll scraping #18

Closed andrewdeandrade closed 12 years ago

andrewdeandrade commented 12 years ago

I'm interested in using pjscrape to scrape pages with infinite scroll, and I'm trying to figure out the best way to do this. I'm happy to try to implement this feature myself, but I'm curious whether you've given it any thought and, if so, how you would approach it.

nrabinowitz commented 12 years ago

You should be able to do this without too much trouble using the async scraper - see https://github.com/nrabinowitz/pjscrape/blob/master/tests/test_async.js for an example. Setting the async option on the scraper lets you keep doing asynchronous work (e.g. calling window.scrollTo(), waiting for new content to load, then scraping it) until you set _pjs.items, at which point the scraper returns control to the Pjscrape runner.
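
To make the scroll-and-wait idea concrete, here is a rough sketch of a helper that keeps scrolling until the item count stops growing. It is not taken from pjscrape itself: the helper name scrollUntilStable, its options (interval, maxRounds) and the '.item' selector in the comments are all made up for illustration, and the commented wiring assumes the async option is used as in the linked test.

```javascript
// Hypothetical helper (not part of pjscrape): scroll to the bottom repeatedly
// until no new items appear, then call done().
//   win      - a window-like object (scrollTo, setTimeout, document.body)
//   getCount - function returning how many items are currently on the page
//   opts     - { interval: ms to wait after each scroll, maxRounds: hard cap }
//   done     - callback invoked once the page has stopped growing
function scrollUntilStable(win, getCount, opts, done) {
    opts = opts || {};
    var interval = opts.interval || 500;
    var maxRounds = opts.maxRounds || 20; // guarantees the scraper terminates
    var rounds = 0;
    var lastCount = getCount();

    function step() {
        // Jump to the current bottom of the page to trigger the next load.
        win.scrollTo(0, win.document.body.scrollHeight);
        win.setTimeout(function () {
            var count = getCount();
            rounds += 1;
            if (count === lastCount || rounds >= maxRounds) {
                done(); // nothing new arrived, or we hit the cap
            } else {
                lastCount = count;
                step();
            }
        }, interval);
    }

    step();
}

// Assumed wiring inside a pjscrape async scraper (not executed here;
// '.item' is a placeholder selector, and the helper would have to be
// defined inside the scraper function, since scrapers run in the page):
//
//   scrollUntilStable(window, function () { return $('.item').length; }, {},
//       function () {
//           // Setting _pjs.items is what hands control back to the runner.
//           _pjs.items = $('.item').map(function () {
//               return $(this).text();
//           }).toArray();
//       });
```

The fixed wait after each scroll is the simplest thing that could work; a page that loads slowly may need a longer interval, and the maxRounds cap is there so a truly endless feed does not keep the scraper running forever.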

I hope that helps. (Closing this, as it's not really an issue per se.)

mackski commented 7 years ago

I'm interested to know how to obtain the data in _pjs.items, which seems to be undefined when I try: links = _pjs.items.map(function(index, elem) {