mattsse / chromiumoxide

Chrome Devtools Protocol rust API
Apache License 2.0

iframes sometimes stop page from loading (again) #228

Open tgrushka opened 2 months ago

tgrushka commented 2 months ago

I'm still experiencing issue #163, even though I know it was fixed in #174. I tried both the published version of this crate and the main branch directly, on both Chrome and Chromium.

  1. The configured timeout is not respected: it still waits 30 seconds, no matter what launch_timeout or request_timeout is set to;
  2. After 30 seconds, "failed to navigate: Timeout" is printed;
  3. Interestingly, in both Chrome and Chromium, if I run headed and then scroll about 2/3 of the way down the page, the page loads;
  4. EDIT: Oddly enough, this only seems to happen on certain pages (e.g. MDN); other pages load fine, e.g.:

The example code in iframe-workaround.rs seems to require launching chromium manually and has a hard-coded DevTools URL. Might that have something to do with why the example worked?

macOS Sonoma 14.5
rustc 1.79.0
Google Chrome 126.0.6478.183
Chromium 129.0.6620.0

main.rs:

use std::time::Duration;

use chromiumoxide::{Browser, BrowserConfig};
use futures::StreamExt;

#[tokio::main]
async fn main() {
    tracing_subscriber::fmt::init();

    let chromium = false;
    let headed = false;

    let mut builder = BrowserConfig::builder()
        .launch_timeout(Duration::from_secs(5))
        .request_timeout(Duration::from_secs(5));

    if chromium {
        println!("using Chromium");
        builder = builder.chrome_executable("/Applications/Chromium.app/Contents/MacOS/Chromium");
    } else {
        println!("using Google Chrome");
        builder = builder
            .chrome_executable("/Applications/Google Chrome.app/Contents/MacOS/Google Chrome");
    }

    if headed {
        builder = builder.with_head();
    }

    let config = builder.build().unwrap();

    let (mut browser, mut handler) = Browser::launch(config)
        .await
        .expect("failed to connect to browser");

    let handle = tokio::task::spawn(async move {
        while let Some(event) = handler.next().await {
            tracing::debug!(event = ?event);
        }
    });

    let page = browser
        .new_page("about:blank")
        .await
        .expect("failed to create page");

    println!("Loading iframe page...");
    let _ = page
        .goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
        .await
        .expect("failed to navigate");

    println!("Page loaded!");

    browser.close().await.expect("close browser");
    handle.await.expect("await handle");
}
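
Since the configured timeouts do not seem to take effect in the reproduction above, a caller-side deadline is one way to bound the wait. The sketch below shows the idea using only the standard library. `run_with_deadline` and the closures standing in for a navigation are hypothetical names for illustration, not part of chromiumoxide.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on a helper thread and waits at most `limit` for its result.
/// Returns `None` if the deadline passes first. The helper thread is not
/// cancelled; it is simply abandoned.
fn run_with_deadline<T, F>(limit: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore a send error: the receiver is gone if the caller already timed out.
        let _ = tx.send(work());
    });
    rx.recv_timeout(limit).ok()
}

fn main() {
    // Hypothetical stand-in for a navigation that completes quickly.
    let fast = run_with_deadline(Duration::from_secs(5), || "loaded");
    println!("fast: {:?}", fast);

    // Hypothetical stand-in for a navigation that hangs on an unloaded iframe.
    let slow = run_with_deadline(Duration::from_millis(100), || {
        thread::sleep(Duration::from_secs(2));
        "loaded"
    });
    println!("slow: {:?}", slow);
}
```

In the async program above, wrapping the `page.goto(...)` future in `tokio::time::timeout(Duration::from_secs(5), ...)` would play the same role. This only limits how long the caller waits; it does not explain why the crate's own timeout settings are ignored.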

Cargo.toml:

[package]
name = "chrome_test"
version = "0.1.0"
edition = "2021"

[dependencies]
chromiumoxide = { version = "0.6", features = ["tokio-runtime"], default-features = false }
futures = "0.3.30"
tokio = { version = "1", features = [
    "fs",
    "macros",
    "rt-multi-thread",
    "io-util",
    "sync",
    "time",
] }
tracing = "0.1.40"
tracing-subscriber = "0.3.18"

tgrushka commented 1 week ago

I found a workaround for this on the MDN IFRAME page (hopefully works on most similar pages) by injecting the following JavaScript, which calls scrollIntoView() on each IFRAME element on the page. Not ideal or good practice, but it shows that the frame must be scrolled into view in order to finish loading.

Maybe this is an issue with lazily loaded frames?

(This injection must occur BEFORE navigation to the page.)

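use chromiumoxide::cdp::browser_protocol::page::AddScriptToEvaluateOnNewDocumentParams;

// Open a blank page first so the script is registered before the real navigation.
let page = browser.new_page("about:blank").await?;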

page.execute(
    AddScriptToEvaluateOnNewDocumentParams::builder()
        .source(
            "document.addEventListener('DOMContentLoaded', function() {
        const iframes = [...document.querySelectorAll('iframe')]
        for (const iframe of iframes) {
            iframe.scrollIntoView()
        }
    })",
        )
        .build()?,
)
.await?;

page.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe").await?;
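
If the same workaround is needed for elements other than plain `iframe` tags, the injected source could be generated from a selector. This is a minimal sketch using only the standard library. `scroll_into_view_script` is a hypothetical helper, not something chromiumoxide provides, and it only escapes backslashes and single quotes.

```rust
/// Builds a JavaScript snippet that scrolls every element matching
/// `selector` into view once the DOM content has loaded.
fn scroll_into_view_script(selector: &str) -> String {
    // Escape backslashes and single quotes so the selector stays inside
    // the single-quoted JavaScript string literal.
    let escaped = selector.replace('\\', "\\\\").replace('\'', "\\'");
    format!(
        "document.addEventListener('DOMContentLoaded', function() {{ \
         for (const el of document.querySelectorAll('{}')) {{ el.scrollIntoView(); }} \
         }});",
        escaped
    )
}

fn main() {
    // Same behavior as the hand-written script above, for all iframes.
    println!("{}", scroll_into_view_script("iframe"));
}
```

The returned string can be passed to `.source(...)` in the snippet above in place of the literal. As before, it has to be registered before `page.goto(...)` is called.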