Closed milahu closed 5 months ago
uhh have you tried
iframes = await driver.find_elements(By.TAG_NAME, "iframe")
await asyncio.sleep(0.5)
iframe_document = await iframes[0].content_document
# iframe_document.find_elements(...)
? General rule here: only CORS (cross-origin) iframes have their own target, due to OOPIF (out-of-process iframes). See https://www.chromium.org/developers/design-documents/site-isolation/#project-tasks
driver.find_elements(By.TAG_NAME, "iframe")
By.TAG_NAME is the only driver.find_element variant that fails in my code
only cors iframes have a target due to OOPIF
yep, i kind-of expected that
ideally, the interface should be the same for all iframes, so i don't need code like
if is_cors_iframe(iframe):
    # switch target
    old_target = ...
    await driver.switch_to.frame(iframe)
    elem = await driver.find_element(...)
    # switch back
    await driver.switch_to.frame(old_target)
else:
    # content_document has to be awaited before calling find_element on it
    iframe_doc = await iframe.content_document
    elem = await iframe_doc.find_element(...)
these target switches are probably expensive, so a context manager would be nice
with driver.context_of.frame(iframe) as iframe_driver:
    elem = await iframe_driver.find_element(...)
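A minimal sketch of how such a context manager could wrap the branch above. It is not part of selenium-driverless: `frame_context`, the `is_cors_iframe` predicate, and `driver.current_target` are assumed names, and the `Fake*` classes are stand-ins so the sketch runs without a browser. It uses `async with` rather than `with`, because the target switches themselves are awaited.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def frame_context(driver, iframe, is_cors_iframe):
    """Yield an object whose find_element() searches inside `iframe`."""
    if not is_cors_iframe(iframe):
        # Same-origin iframe: no separate target, use its content document.
        yield await iframe.content_document
        return
    # Cross-origin iframe: switch target and always switch back on exit.
    old_target = driver.current_target  # assumed attribute name
    await driver.switch_to.frame(iframe)
    try:
        yield driver
    finally:
        await driver.switch_to.frame(old_target)


# --- stand-ins so the sketch runs without a browser -------------------
class FakeDoc:
    async def find_element(self, by, value):
        return f"doc:{value}"


class FakeIframe:
    def __init__(self, cors):
        self.cors = cors

    @property
    def content_document(self):
        async def _get():
            return FakeDoc()
        return _get()  # awaitable, like `await iframe.content_document`


class FakeSwitchTo:
    def __init__(self, driver):
        self._driver = driver

    async def frame(self, target):
        self._driver.current_target = target


class FakeDriver:
    def __init__(self):
        self.current_target = "main"
        self.switch_to = FakeSwitchTo(self)

    async def find_element(self, by, value):
        return f"driver:{value}"


async def demo():
    driver = FakeDriver()
    results = []
    for iframe in (FakeIframe(cors=False), FakeIframe(cors=True)):
        async with frame_context(driver, iframe, lambda f: f.cors) as ctx:
            results.append(await ctx.find_element("css selector", "h2"))
    return results, driver.current_target


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The calling code is the same for both kinds of iframe, and the `finally` clause restores the previous target even if the lookup raises.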
driver.find_elements(By.TAG_NAME, "iframe")
By.TAG_NAME is the only driver.find_element variant that fails in my code
please specify "fails"?
Does iframe.content_element work now btw?
ideally, the interface should be the same for all iframes, so i don't need code like
if is_cors_iframe(iframe):
    # switch target
    old_target = ...
    await driver.switch_to.frame(iframe)
    elem = await driver.find_element(...)
    # switch back
    await driver.switch_to.frame(old_target)
else:
    # content_document has to be awaited before calling find_element on it
    iframe_doc = await iframe.content_document
    elem = await iframe_doc.find_element(...)
these target switches are probably expensive, so a context manager would be nice
with driver.context_of.frame(iframe) as iframe_driver:
    elem = await iframe_driver.find_element(...)
well, I don't like that switching thingy anyway. My long-term plan is to deprecate it and move away from selenium.
Ideally, in my opinion, there should be a type class Frame, which a target can contain multiple of.
That's gonna require a lot of time & refactoring to develop tho. Also, driver.switch_to.frame only supports Target as an argument. Forgot to remove/deprecate it after introducing .content_element
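For illustration only, the Frame idea could be modelled roughly like this. Every class, field, and method name here is hypothetical and none of it is existing selenium-driverless API. It only shows the relationship described above: one target owning several frames.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """Hypothetical: one document frame living inside a target."""
    frame_id: str
    url: str
    parent_id: Optional[str] = None  # None for the target's top-level frame


@dataclass
class Target:
    """Hypothetical: a debugging target that owns one or more frames."""
    target_id: str
    type: str
    frames: List[Frame] = field(default_factory=list)

    def child_frames(self, parent_id: str) -> List[Frame]:
        """Frames nested directly under the frame with id `parent_id`."""
        return [f for f in self.frames if f.parent_id == parent_id]


# A page target whose same-origin iframe shares the target with the page.
# The ids and paths are made up for this example.
page = Target("T1", "page", [
    Frame("F1", "file:///tmp/test.html"),
    Frame("F2", "file:///tmp/iframe.html", parent_id="F1"),
])
print([f.url for f in page.child_frames("F1")])
```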
yes : )
iframe.content_document works for my simple test case with file urls, and it also works for captcha iframes
iframe = await driver.find_element(By.CSS_SELECTOR, "iframe")
iframe_doc = await iframe.content_document
elem = await iframe_doc.find_element(By.CSS_SELECTOR, "h2")
driver.find_element(By.TAG_NAME, "iframe") fails because it returns an unresolved WebElement:
- WebElement("None", obj_id=None, node_id="None", backend_node_id=10, context_id=None)
+ WebElement("HTMLIFrameElement", obj_id=-975510912079378788.4.3, node_id="None", backend_node_id=8, context_id=4)
low priority, because this affects only file urls
get_target_for_iframe works on http urls, where the iframe source is https://www.recaptcha.net/recaptcha/api2/anchor?...
but get_target_for_iframe fails on file urls
driver.targets has no target with target.type == "iframe" (or "frame")
maybe because cross-origin policy?
also driver.find_element(By.TAG_NAME, "iframe") fails to find the iframe element, and all other versions of driver.find_element are working
Selenium-Driverless version 1.7.1
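The target check described above can be sketched as a small filter. This assumes you already have an iterable of target objects with a `.type` attribute. How `driver.targets` actually exposes them (list, mapping, awaitable) is not confirmed here, so `SimpleNamespace` stand-ins are used and `iframe_targets` is a hypothetical helper name.

```python
from types import SimpleNamespace


def iframe_targets(targets):
    """Return the targets whose type marks them as an iframe.

    `targets` is any iterable of objects with a `.type` attribute.
    """
    return [t for t in targets if getattr(t, "type", None) in ("iframe", "frame")]


# Stand-in data: one list as on an http page with a cross-origin iframe,
# one as described for the file:// test page.
http_targets = [SimpleNamespace(type="page"), SimpleNamespace(type="iframe")]
file_targets = [SimpleNamespace(type="page")]

print(len(iframe_targets(http_targets)))
print(len(iframe_targets(file_targets)))
```

An empty result for the file url case matches the report: without an iframe target there is nothing for get_target_for_iframe to return, so iframe.content_document is the way in.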
test-selenium-driverless.switch-to-iframe.py
```py
#!/usr/bin/env python3

import asyncio
import base64
import sys
import os
import time
import datetime
import traceback
import shutil

from selenium_driverless import webdriver
from selenium_driverless.types.by import By
from cdp_socket.exceptions import CDPError

# TODO use data urls instead of tempfiles

# use tmpfs in RAM to avoid disk writes
tempdir = f"/run/user/{os.getuid()}"
if not os.path.exists(tempdir):
    raise ValueError(f"tempdir does not exist: {tempdir}")

def datetime_str():
    # https://stackoverflow.com/questions/2150739/iso-time-iso-8601-in-python#28147286
    return datetime.datetime.utcnow().strftime("%Y%m%dT%H%M%S.%fZ")

tempdir += "/test-selenium-driverless"
print(f"TODO: rm -rf {tempdir}.*")
tempdir += f".{datetime_str()}"
print(f"tempdir: {tempdir}")
os.makedirs(tempdir)

async def main():
    options = webdriver.ChromeOptions()

    iframe_html_path = f"{tempdir}/iframe.html"
    with open(iframe_html_path, "w") as f:
        f.write(
            "iframe"
        )

    test_html_path = f"{tempdir}/test.html"
    with open(test_html_path, "w") as f:
        f.write(
            "test"
            f""
        )

    async with webdriver.Chrome(options=options, max_ws_size=2 ** 30) as driver:
        url = "file://" + test_html_path
        print(f"url: {url}")
        await driver.get(url)
        # wait for page load
        await asyncio.sleep(1)
        # find
```
example output
```
$ ./test-selenium-driverless.switch-to-iframe.py
TODO: rm -rf /run/user/1000/test-selenium-driverless.*
tempdir: /run/user/1000/test-selenium-driverless.20240110T095725.677589Z
url: file:///run/user/1000/test-selenium-driverless.20240110T095725.677589Z/test.html
found iframe by tag name lowercase: WebElement("None", obj_id=None, node_id="None", backend_node_id=8, context_id=None)
found iframe by tag name uppercase: WebElement("None", obj_id=None, node_id="None", backend_node_id=8, context_id=None)
found iframe by css selector: WebElement("HTMLIFrameElement", obj_id=2979415939007923755.4.5, node_id="None", backend_node_id=8, context_id=4)
found iframe by xpath //iframe: WebElement("HTMLIFrameElement", obj_id=2979415939007923755.4.7, node_id="None", backend_node_id=8, context_id=4)
found iframe by xpath /html/body/iframe: WebElement("HTMLIFrameElement", obj_id=2979415939007923755.4.9, node_id="None", backend_node_id=8, context_id=4)
found iframe by xpath //*[@id="iframe_1"]: WebElement("HTMLIFrameElement", obj_id=2979415939007923755.4.11, node_id="None", backend_node_id=8, context_id=4)
found iframe by id: WebElement("HTMLIFrameElement", obj_id=2979415939007923755.4.13, node_id="None", backend_node_id=8, context_id=4)
driver_url: file:///run/user/1000/test-selenium-driverless.20240110T095725.677589Z/test.html
page_source:test
searching h2 in iframe.html with iframe.content_document.find_element + By.CSS_SELECTOR
found h2: WebElement("HTMLHeadingElement", obj_id=2979415939007923755.5.3, node_id="None", backend_node_id=10, context_id=5)
searching h2 in iframe.html with iframe.content_document.find_element + By.TAG_NAME
found h2: WebElement("None", obj_id=None, node_id="None", backend_node_id=10, context_id=None)
searching iframe target with driver.get_target_for_iframe
driver.get_target_for_iframe failed:test
searching h2 in iframe.html
driver.find_element failed:
```
not fixed by #68 #7 #9