jambonnade closed this issue 4 years ago
@jambonnade You may refer to this sample script, which waits for a specific element to load.
function wait_for_element(splash, css, maxwait)
  -- Wait until a selector matches an element
  -- in the page. Return an error if waited more
  -- than maxwait seconds.
  if maxwait == nil then
    maxwait = 10
  end
  return splash:wait_for_resume(string.format([[
    function main(splash) {
      var selector = '%s';
      var maxwait = %s;
      var end = Date.now() + maxwait*1000;

      function check() {
        if(document.querySelector(selector)) {
          splash.resume('Element found');
        } else if(Date.now() >= end) {
          var err = 'Timeout waiting for element';
          splash.error(err + " " + selector);
        } else {
          setTimeout(check, 200);
        }
      }
      check();
    }
  ]], css, maxwait))
end
function main(splash)
  splash:go("http://scrapinghub.com")
  wait_for_element(splash, "#foo")
  return {png=splash:png()}
end
The above example works, but it has a problem: if the page reloads, the script execution is interrupted. So I wrote a function purely in Lua to handle this case.
function wait_for_element(splash, css, maxwait)
  if maxwait == nil then
    maxwait = 10
  end
  local exit = false
  local time_chunk = 0.2
  local time_passed = 0
  while (exit == false)
  do
    local element = splash:select(css)
    if element then
      exit = true
    elseif time_passed >= maxwait then
      exit = true
      error('Timed out waiting for -' .. css)
    else
      splash:wait(time_chunk)
      time_passed = time_passed + time_chunk
    end
  end
end
Apparently the above function has high CPU usage; I'm not sure why. I don't recommend using it.
Hi,
First, I'd like to know exactly when the splash:go() call returns:
Then, how should we deal with scripts that go through multiple pages without additional go() calls (e.g. clicking links, submitting forms)? Using wait() with the cancel_on_redirect flag is a good start, but again I don't know exactly when wait() returns in this case.
I don't think adding wait() calls with arbitrary timings is a sound way to let the page finish loading. If there is no built-in mechanism for this, I may do something like checking at a regular interval whether the page contains a specific HTML element, or whether a JavaScript variable indicates that the new page has loaded.
Thanks
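For what it's worth, the interval-polling idea from the question could be sketched roughly like this. It is only an illustration, not an official Splash recipe: wait_until is a hypothetical helper name, and window.pageReady is an assumed flag that the target page would have to set itself. The helper only relies on splash:wait(), and the example main uses splash:go(), splash:evaljs() and splash:png().

```lua
-- Hypothetical helper (not part of Splash): call `predicate` repeatedly
-- until it returns a truthy value or `maxwait` seconds have been waited.
-- Returns true on success, or false plus a message on timeout.
function wait_until(splash, predicate, maxwait, interval)
  maxwait = maxwait or 10
  interval = interval or 0.2
  local waited = 0
  while true do
    if predicate() then
      return true
    end
    if waited >= maxwait then
      return false, "timed out after " .. maxwait .. " seconds"
    end
    splash:wait(interval)
    waited = waited + interval
  end
end

function main(splash)
  assert(splash:go(splash.args.url))
  -- Assumption: the page sets window.pageReady = true once it has loaded.
  local ok, reason = wait_until(splash, function()
    return splash:evaljs("window.pageReady === true")
  end, 10)
  if not ok then
    error(reason)
  end
  return {png = splash:png()}
end
```

Passing the check as a function keeps the loop independent of what is being tested, so the same helper could also wrap a splash:select(css) check. It polls in the same way as the pure Lua function earlier in the thread, so the CPU usage reported there may apply here too.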