ropensci / RSelenium

An R client for Selenium Remote WebDriver
https://docs.ropensci.org/RSelenium

How to set pageLoadStrategy #246

Open ruimgbarros opened 3 years ago

ruimgbarros commented 3 years ago

Hello everyone!

I'm sorry for opening an issue for this, since it is more of a question, but I don't know where else to ask. I need to stop the driver from waiting for the whole page to load before I start doing things on the page (basically, JS loading is broken on some archived pages).

I've seen this solution but, to be honest, I have no idea how I can set the pageLoadStrategy with RSelenium...

Is there a way to do this?

deathmaster9 commented 3 years ago

Yes, below is an example that is working for me.

driver <- rsDriver(port = 4568L, browser = "firefox", extraCapabilities = ffprof)
remote_driver <- driver[["client"]]
remote_driver$extraCapabilities$pageLoadStrategy <- "eager"
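The example above sets the strategy on the client after rsDriver() has returned. Since capabilities are normally negotiated when a session is opened, a variant that may be worth trying is to pass pageLoadStrategy inside extraCapabilities at launch. The sketch below is only that, a sketch: make_caps() and start_driver() are hypothetical helper names, the port is copied from the comment above, and whether the strategy is honoured depends on the browser and driver versions in use.

```r
# Build the capabilities list. "normal", "eager" and "none" are the
# page load strategies defined by the W3C WebDriver specification.
make_caps <- function(strategy = c("normal", "eager", "none")) {
  strategy <- match.arg(strategy)  # rejects anything outside the three values
  list(pageLoadStrategy = strategy)
}

# Hypothetical helper: start Firefox with the strategy requested up front.
# Assumes the RSelenium package and a working Firefox/geckodriver setup.
start_driver <- function(strategy = "eager", port = 4568L) {
  driver <- RSelenium::rsDriver(
    port = port,
    browser = "firefox",
    extraCapabilities = make_caps(strategy)
  )
  driver[["client"]]
}
```

Calling remote_driver <- start_driver("eager") would then return the client. If a Firefox profile (ffprof above) is also needed, it would have to be merged into the same list, which this sketch does not attempt.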

mlane3 commented 1 year ago

@ruimgbarros @deathmaster9 Depending on the website and the task, pageLoadStrategy does not always work. In Firefox it does work if you want to do something before the page finishes loading, though what you can do at that point is often limited.

Part of it has to do with the fact that RSelenium needs to be moved to Selenium 3.0/4.0. I suspect this will be fixed as the package is updated.

The other major issue is that some websites act up if you try to change the page load strategy, as a way of preventing web scraping. This is why our RSelenium scripts include Sys.sleep() after every click or key entry. These are my workplace's standard web scraping guidelines.
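A minimal sketch of that pause-after-every-action habit, assuming a small wrapper is acceptable. with_pause() is a hypothetical name, and the one-second default is arbitrary rather than taken from any guideline.

```r
# Hypothetical wrapper: run one browser action, then pause before the next.
with_pause <- function(action, seconds = 1) {
  result <- action()   # e.g. a click or a key entry
  Sys.sleep(seconds)   # give the page time to react
  invisible(result)
}

# Usage, assuming `button` and `field` are web elements already located
# with remote_driver$findElement():
# with_pause(function() button$clickElement())
# with_pause(function() field$sendKeysToElement(list("query", key = "enter")), seconds = 2)
```

Wrapping the action in a function keeps the pause next to the call, so it is harder to forget than a separate Sys.sleep() line.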