Open mxaddict opened 8 months ago
I have a workaround that I've implemented on my end:
It uses an event listener that logs the timestamp of the last request.
Then, in a new thread, I have a loop that checks whether the last request is older than the timeout.
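    use std::sync::{Arc, Mutex};
    use std::time::Instant;

    use chromiumoxide::cdp::browser_protocol::fetch::{ContinueRequestParams, EventRequestPaused};
    use futures::StreamExt;

    let page = Arc::new(browser.new_page("about:blank").await?);

    // Shared timestamp of the most recent request, updated by the listener task.
    let last_request = Arc::new(Mutex::new(Instant::now()));
    let xlast_request = last_request.clone();

    let mut request_paused = page.event_listener::<EventRequestPaused>().await.unwrap();
    let xpage = page.clone();
    let interceptor_handle = tokio::spawn(async move {
        while let Some(event) = request_paused.next().await {
            // Record the time of this request, then let it continue unchanged.
            *xlast_request.lock().unwrap() = Instant::now();
            info!(event.request.url);
            if let Err(e) = xpage.execute(ContinueRequestParams::new(event.request_id.clone())).await {
                error!("Failed to continue request: {e}");
            }
        }
    });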
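The listener above records only when a request starts, so one slow request can look the same as a quiet network. A minimal sketch of one possible alternative bookkeeping follows: it counts in-flight requests as well as the time of the last activity. `InFlightTracker` and its methods are hypothetical names, not part of chromiumoxide, and the sketch assumes that both start and finish notifications are available (for example from the CDP events `Network.requestWillBeSent`, `Network.loadingFinished`, and `Network.loadingFailed`); wiring it to the browser is not shown.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Internal state: number of requests still running and time of the last change.
struct State {
    in_flight: usize,
    last_activity: Instant,
}

/// Hypothetical helper (not a chromiumoxide type) that tracks network activity.
pub struct InFlightTracker {
    state: Mutex<State>,
}

impl InFlightTracker {
    pub fn new() -> Self {
        Self {
            state: Mutex::new(State {
                in_flight: 0,
                last_activity: Instant::now(),
            }),
        }
    }

    /// Call when a request starts.
    pub fn request_started(&self) {
        let mut s = self.state.lock().unwrap();
        s.in_flight += 1;
        s.last_activity = Instant::now();
    }

    /// Call when a request finishes or fails.
    pub fn request_finished(&self) {
        let mut s = self.state.lock().unwrap();
        // saturating_sub guards against a finish event without a matching start.
        s.in_flight = s.in_flight.saturating_sub(1);
        s.last_activity = Instant::now();
    }

    /// True when nothing is in flight and nothing has changed for at least `quiet`.
    pub fn is_idle(&self, quiet: Duration) -> bool {
        let s = self.state.lock().unwrap();
        s.in_flight == 0 && s.last_activity.elapsed() >= quiet
    }
}

impl Default for InFlightTracker {
    fn default() -> Self {
        Self::new()
    }
}
```

In the interceptor task, `request_started()` would take the place of the timestamp update, and the wait loop would poll `is_idle(timeout)` instead of comparing timestamps.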
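    /// Returns once no request has been recorded for longer than `timeout`.
    /// `last` is the shared timestamp updated by the request listener.
    pub async fn wait_for_page(last: Arc<Mutex<Instant>>, timeout: Duration) {
        loop {
            tokio::time::sleep(timeout).await;
            if last.lock().unwrap().elapsed() > timeout {
                return;
            }
        }
    }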
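As written, the loop above checks only once per `timeout`, so it can take up to about twice the timeout to notice that requests have stopped, and it never returns if the page keeps sending requests. A minimal sketch of a variation is below, assuming a shorter poll interval and a hard upper bound are acceptable. `wait_until_idle`, `poll`, and `max_wait` are hypothetical names. It uses blocking `std::thread::sleep` so that it runs without a browser or async runtime; in the async code above the sleep would be `tokio::time::sleep(...).await` instead.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Blocks until no activity has been recorded in `last` for at least `idle`.
/// Gives up after `max_wait` in total.
/// Returns `true` if the idle period was reached, `false` if it gave up.
pub fn wait_until_idle(
    last: &Arc<Mutex<Instant>>,
    idle: Duration,
    poll: Duration,
    max_wait: Duration,
) -> bool {
    let started = Instant::now();
    loop {
        // The lock guard is a temporary and is released at the end of this statement.
        let quiet_for = last.lock().unwrap().elapsed();
        if quiet_for >= idle {
            return true;
        }
        if started.elapsed() >= max_wait {
            return false;
        }
        // Sleep for the poll interval, or less if the idle period would end sooner.
        thread::sleep(poll.min(idle.saturating_sub(quiet_for)));
    }
}
```

The `bool` return value lets the caller tell a page that became quiet apart from one that was still busy when the wait was abandoned.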
I guess duplicate of #36
I believe so, I did not see that issue before posting 😄
Here is my approach:
    page
        .evaluate(
            r#"() =>
                new Promise((resolve) => {
                    if (document.readyState === 'complete') {
                        resolve('completed-no-event')
                    } else {
                        addEventListener('load', () => {
                            resolve('complete-event')
                        })
                    }
                })
            "#,
        )
        .await?;
This will even enable single-page applications to be scraped, so web pages do not need to be server-side rendered.
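One caveat with the promise above: if the `load` event never fires, it never resolves and the `evaluate` call keeps waiting. A minimal sketch of a variation is below, assuming a timeout fallback is wanted. `load_wait_script` and the `'timeout'` result string are hypothetical; the function only builds the JavaScript source and does not talk to the browser.

```rust
/// Builds a JS function expression that resolves when the page has finished
/// loading, or with 'timeout' after `timeout_ms` milliseconds, whichever comes first.
pub fn load_wait_script(timeout_ms: u64) -> String {
    // Literal JS braces are doubled because `format!` treats single braces as placeholders.
    format!(
        r#"() =>
  new Promise((resolve) => {{
    if (document.readyState === 'complete') {{
      resolve('completed-no-event');
      return;
    }}
    const timer = setTimeout(() => resolve('timeout'), {timeout_ms});
    addEventListener('load', () => {{
      clearTimeout(timer);
      resolve('complete-event');
    }}, {{ once: true }});
  }})"#,
        timeout_ms = timeout_ms
    )
}
```

The returned string would be passed to `page.evaluate` in place of the string literal above, for example with `load_wait_script(5_000)`.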
    page.wait_for_navigation().await?;
seems to return before all of the page's assets (images, JS, CSS) are fully loaded.
I'm trying to load a page that has some images that are added to the page via JS logic after an API request.
I was under the impression that the
    page.wait_for_navigation().await?;
call would wait for these to load, but it seems it does not.
Is there a way to get this to behave the way I expected it to?