Closed ksloan closed 10 years ago
Could you provide an example?
For example, on this page there is a 'load more' button that has to be clicked until all the events are in the DOM before I can scrape them. So I've set up an interval that checks for the load more button and clicks it if needed, like so:
if ($('.load_more_link').length > 0) {
  var clicker = setInterval(function () {
    $('.load_more_link').click();
    if ($('.load_more_link').length === 0) {
      clearInterval(clicker); // done here
    }
  }, 1000);
}
Then once it's done, I want to return the total number of events along with some other info... but I don't see how I can return any info once the function becomes asynchronous.
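One possible way to get a value out of the interval loop above is to pass in a callback and call it once the button is gone. The sketch below is plain JavaScript and does not use scraperjs; `loadAll`, `hasMore`, `clickMore`, `collect` and the `page` object are hypothetical names invented for illustration. In the real page, `hasMore` would be the `$('.load_more_link').length > 0` check and `clickMore` the `.click()` call.

```javascript
// Hypothetical helper: keeps calling clickMore() until hasMore() is false,
// then reports the collected result through a Node-style callback.
function loadAll(hasMore, clickMore, collect, intervalMs, done) {
  if (!hasMore()) {
    done(null, collect()); // nothing to load, report right away
    return;
  }
  var clicker = setInterval(function () {
    clickMore();
    if (!hasMore()) {
      clearInterval(clicker);
      done(null, collect()); // loading finished, hand the result back
    }
  }, intervalMs);
}

// Stand-in for the page (assumption): three clicks reveal ten events each.
var page = { events: 10, clicksLeft: 3 };

loadAll(
  function () { return page.clicksLeft > 0; },
  function () { page.clicksLeft -= 1; page.events += 10; },
  function () { return { total: page.events }; },
  10,
  function (err, info) { console.log(info.total); } // prints 40
);
```

How the `done` callback would be wired into a scraper step depends on the library version, so that part is left out here.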
This is a common pattern.
PhantomJS does have the functionality to trigger events after a page has been loaded. Maybe we can add it to ScraperPromise.js, so that we have something like:
var scraperjs = require('scraperjs');
scraperjs.DynamicScraper.create('https://news.ycombinator.com/')
  .triggerEvent(function () {
    // trigger events
  });
But since PhantomJS is sandboxed, we need to think of a way to tell the promise that the event has finished. Setting a fixed waiting time is one way, though a poor one.
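As a rough illustration of the alternative to a fixed waiting time, the sketch below polls a condition and settles a promise as soon as the condition holds, giving up after a timeout. It is plain JavaScript, not tied to PhantomJS or scraperjs; `waitFor` and the `finished` flag are hypothetical names, and in a sandboxed setup the condition would still have to be evaluated inside the page.

```javascript
// Hypothetical helper: resolves once condition() returns true,
// rejects if timeoutMs elapses first.
function waitFor(condition, intervalMs, timeoutMs) {
  return new Promise(function (resolve, reject) {
    var started = Date.now();
    var timer = setInterval(function () {
      if (condition()) {
        clearInterval(timer);
        resolve(Date.now() - started); // elapsed time in ms
      } else if (Date.now() - started >= timeoutMs) {
        clearInterval(timer);
        reject(new Error('condition not met within ' + timeoutMs + ' ms'));
      }
    }, intervalMs);
  });
}

// Stand-in for a triggered event (assumption): it finishes after about 50 ms.
var finished = false;
setTimeout(function () { finished = true; }, 50);

waitFor(function () { return finished; }, 10, 1000)
  .then(function (elapsed) {
    console.log('event finished after about ' + elapsed + ' ms');
  })
  .catch(function (err) {
    console.error(err.message);
  });
```

Compared with a fixed delay, this continues as soon as the event is done and fails loudly if it never completes.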
@ksloan, for that kind of site I find that it is better to just inspect the AJAX calls and work from there. You could also try to use the delay promise between two scrape promises.
@cjackie, I might look into that sometime in the future.
v0.3.0 has an async promise, which allows you to check for events and then trigger them.
Amazing!! Thank you!
Can you provide an example using async?
Is there any way to use an asynchronous function as the ScrapeFn? I have a URL where I need to set an interval to load extra data into the DOM before I actually do any scraping, and then, when a certain condition is met, I do the actual scrape.
The examples show a return statement, but is there any way to do this with a callback? Thanks!