nrabinowitz / pjscrape

A web-scraping framework written in Javascript, using PhantomJS and jQuery
http://nrabinowitz.github.io/pjscrape/
MIT License
996 stars, 159 forks

Add persistent data support #5

Open nrabinowitz opened 12 years ago

nrabinowitz commented 12 years ago

Add a client-side data persistence mechanism. localStorage would work, but only if all the pages are on the same domain; otherwise you'd need PhantomJS-level persistence, which would require a separate function. Use case: scraping the sub-pages of a category page and persisting the category across them.

More thinking on this: I could support cross-domain data using a _pjs.state object on the client side (default {}). The PhantomJS side would grab it in page.open() at the end of each page's scrape, then write it back into the client-side object as JSON at the beginning of the next one.
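
A rough sketch of that round trip, runnable in plain Node without PhantomJS. Everything here is an assumption made for illustration: FakePage stands in for a PhantomJS page object, and scrapePage and persisted are hypothetical names, not pjscrape code. Only _pjs.state comes from the comment above.

```javascript
// Hypothetical sketch: carry a state object across page loads via JSON.
// FakePage is a stand-in for a PhantomJS page; it is not part of pjscrape.
function FakePage(domain) {
    this.domain = domain;
    // Each page load starts with a fresh client-side context (default {}).
    this.client = { _pjs: { state: {} } };
}

// Runs fn against the page's client context, loosely like page.evaluate().
FakePage.prototype.evaluate = function (fn, arg) {
    return fn(this.client, arg);
};

// Runner-level store that outlives any single page, kept as a JSON string.
var persisted = '{}';

function scrapePage(page, scraper) {
    // Beginning of the scrape: write the saved JSON into _pjs.state.
    page.evaluate(function (client, json) {
        client._pjs.state = JSON.parse(json);
    }, persisted);

    var result = page.evaluate(scraper);

    // End of the scrape: grab _pjs.state back out as JSON.
    persisted = page.evaluate(function (client) {
        return JSON.stringify(client._pjs.state);
    });
    return result;
}

// The category page records the category...
var categoryPage = new FakePage('example.com');
scrapePage(categoryPage, function (client) {
    client._pjs.state.category = 'Books';
    return null;
});

// ...and a sub-page on a different domain can still read it.
var subPage = new FakePage('other.example.org');
var item = scrapePage(subPage, function (client) {
    return { title: 'Some item', category: client._pjs.state.category };
});
```

Because the state only ever crosses the page boundary as a JSON string, it is limited to JSON-serializable values, but it does not depend on the two pages sharing an origin the way localStorage would.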

nrabinowitz commented 12 years ago

This is working in the data-persistence branch, but I'm thinking I might want it to use a private _state variable and then offer a _pjs.data() function modeled on $.data().
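
One way a _pjs.data() accessor over a private _state variable might look, as a minimal sketch. The get/set/merge behaviour loosely follows how $.data() is called; this is an assumption about the intended API, not the code in the data-persistence branch.

```javascript
// Hypothetical sketch of _pjs.data() backed by a private _state variable.
var _pjs = (function () {
    var _state = {}; // private: reachable only through data()

    function data(key, value) {
        var k, copy;
        if (arguments.length === 0) {
            // No arguments: return a shallow copy of everything stored.
            copy = {};
            for (k in _state) {
                if (Object.prototype.hasOwnProperty.call(_state, k)) {
                    copy[k] = _state[k];
                }
            }
            return copy;
        }
        if (typeof key === 'object' && key !== null) {
            // Object argument: merge its own properties into the state.
            for (k in key) {
                if (Object.prototype.hasOwnProperty.call(key, k)) {
                    _state[k] = key[k];
                }
            }
            return undefined;
        }
        if (arguments.length === 1) {
            // Key only: read a single value.
            return _state[key];
        }
        // Key and value: write a single value.
        _state[key] = value;
        return undefined;
    }

    return { data: data };
}());

_pjs.data('category', 'Books'); // set one key
_pjs.data({ page: 2 });         // merge several keys
var category = _pjs.data('category');
var everything = _pjs.data();
```

Returning a copy from the no-argument call is a design choice made here to keep _state private; $.data() itself hands back the underlying object.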