Closed andrewdeandrade closed 12 years ago
You should be able to do this without too much trouble using the async scraper - see https://github.com/nrabinowitz/pjscrape/blob/master/tests/test_async.js for an example. Setting the async option on the scraper lets you keep doing asynchronous work - e.g. calling window.scrollTo(), waiting for new content to load, then scraping it - until you set _pjs.items, at which point the scraper returns control to the Pjscrape runner.
I hope that helps. (Closing this, as it's not really an issue per se.)
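For anyone landing here later, the approach described above might look roughly like the sketch below. It is not taken from the Pjscrape repository. The '.item' selector, the 500 ms wait, the 20-round cap and the example.com URL are all placeholders I made up, and the nested scraper: { async: true, scraper: ... } shape is my reading of how the async option is set, so check it against the linked test_async.js. The idea: scroll to the bottom, wait, and repeat while the number of matched elements keeps growing; once it stops growing, assign _pjs.items to hand control back to the runner.

```javascript
// Sketch only. Everything is kept inside the function on the assumption that
// Pjscrape runs the scraper in the page context, where outer variables
// from the config file would not be visible.
function infiniteScrollScraper() {
    var ITEM_SELECTOR = '.item'; // hypothetical selector for one scraped entry
    var WAIT_MS = 500;           // assumed time for newly loaded content to appear
    var MAX_ROUNDS = 20;         // safety cap so an endless feed cannot loop forever
    var rounds = 0;

    function countItems() {
        return document.querySelectorAll(ITEM_SELECTOR).length;
    }

    function collectItems() {
        var nodes = document.querySelectorAll(ITEM_SELECTOR);
        var items = [];
        for (var i = 0; i < nodes.length; i++) {
            items.push(nodes[i].textContent);
        }
        return items;
    }

    function step(previousCount) {
        // Scrolling to the bottom is what triggers the next batch to load.
        window.scrollTo(0, document.body.scrollHeight);
        setTimeout(function () {
            var currentCount = countItems();
            rounds += 1;
            if (currentCount > previousCount && rounds < MAX_ROUNDS) {
                step(currentCount); // new content arrived, so scroll again
            } else {
                // Nothing new (or cap reached): setting _pjs.items ends the scrape.
                _pjs.items = collectItems();
            }
        }, WAIT_MS);
    }

    step(countItems());
}

// Only register the suite when running under Pjscrape, where pjs is defined.
if (typeof pjs !== 'undefined') {
    pjs.addSuite({
        url: 'http://example.com/feed', // placeholder URL
        scraper: {
            async: true,
            scraper: infiniteScrollScraper
        }
    });
}
```

A fixed wait is the simplest thing that could work; if the page exposes a loading indicator, polling for that instead of sleeping would probably be more reliable.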
I'm interested to know how to obtain data returned by _pjs.items, which seems undefined when trying: links = _pjs.items.map(function(index, elem) {
I'm interested in using pjscrape to scrape pages with an infinite scroll. Trying to figure out the best way to do this. I'm happy to try and implement this feature myself, but I'm curious if you've given this any thought and if so, how you would approach doing this.