jonhoo / fantoccini

A high-level API for programmatically interacting with web pages through WebDriver.
Apache License 2.0

Blocking tasks on fantoccini seems to prevent parallel execution #69

Closed by bitemyapp 4 years ago

bitemyapp commented 4 years ago

First: thank you for making fantoccini, it made getting this little app together a breeze!

I guess the million dollar question is: is it meant to work in parallel with multiple clients spawned?

More context here: https://github.com/bikeshedder/deadpool/issues/31

In summary: I'm using fantoccini, tokio, and deadpool. deadpool seemed to do the right thing when I wrote weird code to tickle it, and a mock client that @bikeshedder made exhibited the correct behavior.

My profile shows a lot of blocking on the tokio thread pool, so I'm assuming I'm blocking on something somewhere that is effectively preventing parallel operation even when I have oodles of threads available.

jonhoo commented 4 years ago

Yes, you should be able to create separate Client instances, and as long as the driver you are using supports multiple concurrent sessions/windows, they should be able to operate independently. From memory, chromedriver does, but geckodriver does not.

bitemyapp commented 4 years ago

@jonhoo I'm using chromedriver 79. Do you have any suggestions for how I might debug this and determine where I'm blocking and perhaps narrow down what's making my tasks not execute simultaneously?

bitemyapp commented 4 years ago

I tried spawning four chromedriver processes on different ports and then using an unmanaged pool:

    let pool = Pool::from(vec![
        create_client("9515").await?,
        create_client("9516").await?,
        create_client("9517").await?,
        create_client("9518").await?,
    ]);

It evinces the same behavior.

jonhoo commented 4 years ago

Can you try not using Pool for this, and instead do something like:

    for _ in 0..4 {
        tokio::spawn(async {
            let c = create_client().await.unwrap();
            c.goto("https://rust-lang.org").await.unwrap();
        });
    }

bitemyapp commented 4 years ago

@jonhoo https://gist.github.com/bitemyapp/de7826db9ef83404103124da8dfd4625

This only uses one browser window. This may be an error in my code, I'm not exactly sure yet. I think it's because the other workers abort when they find the queue empty. I'll try to work around it.

bitemyapp commented 4 years ago

@jonhoo that was it! It works perfectly now! I'll try to get a recording for @bikeshedder

bitemyapp commented 4 years ago

Here's the version that makes simultaneous progress: https://gist.github.com/bitemyapp/f32e1ddac4cd2327dab8cee926eb7aec

jonhoo commented 4 years ago

What was the change you had to make?

bitemyapp commented 4 years ago

@jonhoo I took the pool out of the equation entirely. Clients are bound to a single task. This isn't totally unreasonable because 99% of the time is spent playing pattycake with the browser anyway. I believe deadpool is the issue here, given the fix and the fact that the managed mode was only ever instantiating a single browser.

bitemyapp commented 4 years ago

@jonhoo it wasn't deadpool or fantoccini!

https://github.com/bikeshedder/deadpool/issues/31#issuecomment-575770999

My worker loop was bad :)

Thanks for your help!
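
The faulty loop itself is not shown in this thread, so the following is only a sketch of the failure mode mentioned earlier (workers quitting as soon as they see an empty queue) and one way to avoid it. It uses standard-library threads and channels as a stand-in for tokio tasks. `FakeClient`, `crawl`, and the `example.com` URLs are hypothetical names made up for illustration and are not part of fantoccini or deadpool. Each worker owns its own client and blocks until a job arrives, stopping only when the job channel is closed.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a per-worker browser session (not fantoccini's `Client`).
struct FakeClient {
    id: usize,
}

impl FakeClient {
    /// Pretend to visit a URL and report which worker handled it.
    fn visit(&self, url: &str) -> String {
        format!("worker {} visited {}", self.id, url)
    }
}

/// Run `workers` threads over `urls`, each thread owning its own client.
/// A worker blocks in `recv()` until a job arrives and exits only once the
/// channel is closed, so a momentarily empty queue does not end it.
fn crawl(workers: usize, urls: Vec<String>) -> Vec<String> {
    let (job_tx, job_rx) = mpsc::channel::<String>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (done_tx, done_rx) = mpsc::channel::<String>();

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let job_rx = Arc::clone(&job_rx);
            let done_tx = done_tx.clone();
            thread::spawn(move || {
                let client = FakeClient { id };
                loop {
                    // The lock guard is a temporary dropped at the end of this
                    // statement, so the lock is not held while the job runs.
                    let job = job_rx.lock().unwrap().recv();
                    match job {
                        Ok(url) => done_tx.send(client.visit(&url)).unwrap(),
                        // Every sender is gone: no more work can ever arrive.
                        Err(_) => break,
                    }
                }
            })
        })
        .collect();
    drop(done_tx); // only the workers' clones remain

    for url in urls {
        job_tx.send(url).unwrap();
    }
    drop(job_tx); // closing the channel is the shutdown signal

    for handle in handles {
        handle.join().unwrap();
    }
    // All senders are dropped by now, so this iterator terminates.
    done_rx.iter().collect()
}

fn main() {
    let urls: Vec<String> = (1..=8)
        .map(|n| format!("https://example.com/{}", n))
        .collect();
    for line in crawl(4, urls) {
        println!("{}", line);
    }
}
```

Which worker handles which URL varies from run to run. The point of the design is that "no job right now" and "no more jobs ever" are distinct signals, so a worker that starts before the queue is filled does not exit early.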