mozilla / geckodriver

WebDriver for Firefox
https://firefox-source-docs.mozilla.org/testing/geckodriver/
Mozilla Public License 2.0

Error when parsing with Firefox-driver on specific site? #2056

Closed Rapid1898-code closed 1 year ago

Rapid1898-code commented 1 year ago

Hello, I have the following code for loading a website into BeautifulSoup (bs4):

import time
import os, sys
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from webdriver_manager.firefox import GeckoDriverManager

if __name__ == '__main__':
  srv = Service(GeckoDriverManager().install())  # download geckodriver and wrap it in a Service
  driver = webdriver.Firefox(service=srv)
  waitWD = WebDriverWait(driver, 10)  # wait up to 10 seconds

  link = "https://www.google.com/maps/search/http%3A%2F%2Fwww.biorochelou.com"  # fails with Firefox
  # link = "https://www.google.com/maps/search/http%3A%2F%2Fwww.microsoft.com"  # works
  print(link)
  driver.get(link)
  try:
    waitWD.until(EC.element_to_be_clickable((By.XPATH, "(//button)[2]"))).click()  # click the second button once it is clickable
  except Exception:  # no such button within the timeout: continue anyway
    pass
  time.sleep(3)
  soup = BeautifulSoup(driver.page_source, 'html.parser')  # the traceback below is raised here
  driver.quit()

This works fine for many websites, e.g. for https://www.google.com/maps/search/http%3A%2F%2Fwww.microsoft.com

But when I run the code for https://www.google.com/maps/search/http%3A%2F%2Fwww.biorochelou.com it fails with the error shown below.

With the Chrome driver everything works fine; I only get this error with the Firefox driver.

$ python test1.py
[WDM] - Downloading: 19.0kB [00:00, 19.5MB/s]
https://www.google.com/maps/search/http%3A%2F%2Fwww.biorochelou.com
Traceback (most recent call last):
  File "G:\DEV\Fiverr\ORDER\robalf\test1.py", line 25, in <module>
    soup = BeautifulSoup (driver.page_source, 'html.parser')
  File "G:\DEV\.venv\selenium\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 540, in page_source
    return self.execute(Command.GET_PAGE_SOURCE)['value']
  File "G:\DEV\.venv\selenium\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 429, in execute
    self.error_handler.check_response(response)
  File "G:\DEV\.venv\selenium\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 243, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: unexpected end of hex escape at line 1 column 870822

Operating System: Windows 10

Selenium version: 4.1.3

Browser and version: Firefox 102.2.0esr (64-bit)

Browser driver and version: geckodriver 0.32.0

whimboo commented 1 year ago

Could you please attach a trace-level log from geckodriver? Is this issue new for you, or has it existed for a while?
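
For readers who want to produce such a log from a script like the one above, here is a minimal sketch. It assumes the Selenium 4.1.x Firefox Service signature, which accepts service_args and log_path (newer Selenium releases use log_output instead of log_path), and it relies on geckodriver's --log command-line flag. The helper names trace_service_kwargs and make_trace_driver are hypothetical and exist only for this illustration.

```python
def trace_service_kwargs(log_file="geckodriver.log"):
    """Build keyword arguments for selenium's Firefox Service.

    "--log trace" is geckodriver's own command-line flag. log_path is
    assumed to be the Service parameter name (Selenium 4.1.x); adjust
    it to log_output on newer Selenium releases.
    """
    return {
        "service_args": ["--log", "trace"],
        "log_path": log_file,
    }


def make_trace_driver(executable_path, log_file="geckodriver.log"):
    """Start Firefox with geckodriver writing a trace-level log to log_file."""
    # Imported lazily so trace_service_kwargs() works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    service = Service(executable_path, **trace_service_kwargs(log_file))
    return webdriver.Firefox(service=service)
```

In the script from the report, this would amount to replacing the Service(...) line with Service(GeckoDriverManager().install(), **trace_service_kwargs()), re-running the failing URL, and attaching the resulting geckodriver.log to the issue.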

whimboo commented 1 year ago

No response from reporter. Closing the issue.