wkeeling / selenium-wire

Extends Selenium's Python bindings to give you the ability to inspect requests made by the browser.
MIT License

Requests are not intercepted when connecting remotely #291

Closed kshnkvn closed 3 years ago

kshnkvn commented 3 years ago

I am using Selenoid on a dedicated computer to run browsers. The connection is as follows:

from seleniumwire import webdriver

chrome_options = webdriver.ChromeOptions()

chrome_options.add_argument('disable-infobars')
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_experimental_option('prefs', prefs)
capabilities = {
    "browserName": "chrome",
    "selenoid:options": {
        "enableVNC": True
    }
}
capabilities.update(chrome_options.to_capabilities())

driver = webdriver.Remote(
    command_executor='http://<remote_ip>:4444/wd/hub',
    desired_capabilities=capabilities,
    seleniumwire_options={
        'auto_config': False,
        'addr': '0.0.0.0'
    }
)

The connection is ok, browser control works too, but when I want to get the list of requests it is empty:

driver.get('https://google.com')
print(driver.requests)

# []

wkeeling commented 3 years ago

Thanks for raising this.

The remote Chrome instance needs to be able to communicate with Selenium Wire in order to send back its requests and responses. So you need to specify the external IP of the host/container running Selenium Wire in the addr option. You also need to set auto_config to True (or omit it):

driver = webdriver.Remote(
    command_executor='http://<remote_ip>:4444/wd/hub',
    desired_capabilities=capabilities,
    seleniumwire_options={
        'auto_config': True,
        'addr': '<ip_of_selenium_wire>'  # This must be accessible by the remote browser
    }
)
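
One way to find a candidate value for addr is to ask the operating system which local address it would use for outbound traffic. The sketch below is not part of Selenium Wire: `outbound_ip` is a hypothetical helper, and the probe address 192.0.2.1 is only a placeholder (substitute the IP of the remote host). Behind a NAT router this returns a private address, which the remote browser may not be able to reach directly.

```python
import socket


def outbound_ip(probe_host="192.0.2.1", probe_port=80):
    """Return the local IPv4 address the OS would use to reach probe_host.

    Hypothetical helper, not part of Selenium Wire. Connecting a UDP socket
    only selects a route; no packet is sent.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect((probe_host, probe_port))
            return sock.getsockname()[0]
        except OSError:
            # No usable route (for example, no network): fall back to loopback.
            return "127.0.0.1"
```

The returned string could then be passed as the addr value in seleniumwire_options, provided the remote browser can actually route to it.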

kshnkvn commented 3 years ago

@wkeeling my local computer has a static IP address, I specify it as follows:

driver = webdriver.Remote(
    command_executor='http://<remote_ip>:4444/wd/hub',
    desired_capabilities=capabilities,
    seleniumwire_options={
        'auto_config': True,
        'addr': '<local_ip>'
    }
)

Now I have an error:

---------------------------------------------------------------------------
gaierror                                  Traceback (most recent call last)
C:\Python\lib\site-packages\seleniumwire\thirdparty\mitmproxy\server\server.py in __init__(self, config)
     40         try:
---> 41             super().__init__(
     42                 (config.options.listen_host, config.options.listen_port)

C:\Python\lib\site-packages\seleniumwire\thirdparty\mitmproxy\net\tcp.py in __init__(self, address)
    593             self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
--> 594             self.socket.bind(self.address)
    595 

gaierror: [Errno 11001] getaddrinfo failed

The above exception was the direct cause of the following exception:

ServerException                           Traceback (most recent call last)
<ipython-input-2-b21bf493d43e> in <module>
     47 capabilities.update(chrome_options.to_capabilities())
     48 
---> 49 driver = webdriver.Remote(
     50     command_executor='http://<remote_ip>:4444/wd/hub',
     51     desired_capabilities=capabilities,

C:\Python\lib\site-packages\seleniumwire\webdriver.py in __init__(self, seleniumwire_options, *args, **kwargs)
    180             seleniumwire_options = {}
    181 
--> 182         self.proxy = backend.create(
    183             addr=seleniumwire_options.pop('addr'),
    184             port=seleniumwire_options.get('port', 0),

C:\Python\lib\site-packages\seleniumwire\backend.py in create(addr, port, options)
     33     if backend == DEFAULT_BACKEND:
     34         # Use the default backend
---> 35         proxy = MitmProxy(addr, port, options)
     36     elif backend == 'mitmproxy':
     37         # Use mitmproxy if installed

C:\Python\lib\site-packages\seleniumwire\server.py in __init__(self, host, port, options)
     49         # Create an instance of the mitmproxy server
     50         self._master = Master(self._event_loop, mitmproxy_opts)
---> 51         self._master.server = ProxyServer(ProxyConfig(mitmproxy_opts))
     52         self._master.addons.add(*addons.default_addons())
     53         self._master.addons.add(SendToLogger())

C:\Python\lib\site-packages\seleniumwire\thirdparty\mitmproxy\server\server.py in __init__(self, config)
     47             if self.socket:
     48                 self.socket.close()
---> 49             raise exceptions.ServerException(
     50                 "Error starting proxy server: " + repr(e)
     51             ) from e

ServerException: Error starting proxy server: gaierror(11001, 'getaddrinfo failed')

kshnkvn commented 3 years ago

@wkeeling I also tried using a localtunnel tunnel:

driver = webdriver.Remote(
    command_executor='http://<remote_ip>:4444/wd/hub',
    desired_capabilities=capabilities,
    seleniumwire_options={
        'auto_config': True,
        'addr': 'https://selenium.loca.lt',
        'port': 3000
    }
)

PS C:\Users\kshnk> lt -p 3000 --subdomain selenium
your url is: https://selenium.loca.lt

But the error is exactly the same.

wkeeling commented 3 years ago

It seems that the IP address isn't physically bound to an interface on your machine, which is why you see the getaddrinfo failed error when Selenium Wire tries to bind to it. What does ipconfig /all return if you run it from the terminal?
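
A quick way to check this before starting the driver is to try binding a plain socket to the same address. This is a minimal sketch using only the standard library; `can_bind` is a hypothetical helper and not something Selenium Wire provides. An address that is not assigned to a local interface, or a value that is not a resolvable host at all (such as a full URL), would be expected to fail this check.

```python
import socket


def can_bind(addr, port=0):
    """Return True if a TCP socket can be bound to addr on this machine.

    Hypothetical helper for illustration, not part of Selenium Wire.
    Port 0 asks the OS for any free port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((addr, port))
            return True
        except (OSError, UnicodeError):
            # Covers socket.gaierror (unresolvable name) and
            # "address not available" (IP not assigned to a local interface).
            return False
```

If `can_bind` returns False for the value you intend to pass as addr, Selenium Wire's proxy will most likely fail to start on it as well.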

kshnkvn commented 3 years ago

@wkeeling

Ethernet adapter Ethernet:

   Connection-specific DNS Suffix  . :
   Description . . . . . . . . . . . : Realtek PCIe GBE Family Controller
   Physical Address. . . . . . . . . : 30-5A-3A-0C-5D-CC
   DHCP Enabled. . . . . . . . . . . : Yes
   Autoconfiguration Enabled . . . . : Yes
   Link-local IPv6 Address . . . . . : fe80::8543:5ee0:850f:794f%9 (Preferred)
   IPv4 Address. . . . . . . . . . . : 192.168.0.100 (Preferred)
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Lease Obtained. . . . . . . . . . : Saturday, May 8, 2021 09:13:28
   Lease Expires . . . . . . . . . . : Saturday, May 8, 2021 15:13:28
   Default Gateway . . . . . . . . . : 192.168.0.1
   DHCP Server . . . . . . . . . . . : 192.168.0.1
   DHCPv6 IAID . . . . . . . . . . . : 439376442
   DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-26-E2-A7-0C-30-5A-3A-0C-5D-CC
   DNS Servers . . . . . . . . . . . : 192.168.0.1
                                       0.0.0.0
   NetBIOS over Tcpip. . . . . . . . : Enabled

Sorry if some of the output looks strange. My system is in Russian and I have translated the output of the command.

kshnkvn commented 3 years ago

@wkeeling my computer is connected to the Internet via a tp-link router, maybe this could be a problem?

wkeeling commented 3 years ago

Thanks for sharing that - I can see the issue. Your computer has a local IP address of 192.168.0.100 which isn't externally visible. That complicates things a little.

The first thing to do is to set up a forwarding rule on your router. You'll need to forward all inbound traffic on port 12345 to IP address 192.168.0.100.

Once that's done, the Selenium Wire config needs to be tweaked. Pass an extra argument to Chrome called --proxy-server that points at your static IP and port 12345. You'll also need to set the Selenium Wire addr to your local address 192.168.0.100, set the port to 12345 and set auto_config to False (as you had at the start). See below (important lines marked with +).

from seleniumwire import webdriver

chrome_options = webdriver.ChromeOptions()

chrome_options.add_argument('disable-infobars')
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_experimental_option('prefs', prefs)
+chrome_options.add_argument('--proxy-server=<static_ip>:12345')  # Use your static IP here
capabilities = {
    "browserName": "chrome",
    "selenoid:options": {
        "enableVNC": True
    }
}
capabilities.update(chrome_options.to_capabilities())

driver = webdriver.Remote(
    command_executor='http://<remote_ip>:4444/wd/hub',
    desired_capabilities=capabilities,
    seleniumwire_options={
+        'auto_config': False,
+        'addr': '192.168.0.100',  # This is your local address
+        'port': 12345  # Make sure this matches the port in --proxy-server argument
    }
)

The important thing, of course, is that you can configure your router to forward inbound traffic on port 12345. Most routers provide this functionality, so hopefully it is possible. You may also have to disable the Windows firewall on your machine so that traffic can reach Selenium Wire.
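
To confirm that the forwarding rule and firewall settings work, a plain TCP connection test from the remote machine can help. The following is a rough sketch rather than anything Selenium Wire ships: `port_is_reachable` is a hypothetical helper, and it assumes Selenium Wire is already running and listening on the forwarded port when the check is made.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of Selenium Wire.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unroutable or unresolvable: treat as unreachable.
        return False
```

Run from the machine hosting the remote browser, a call such as `port_is_reachable('<static_ip>', 12345)` should return True once the router forwards the port and the firewall lets the traffic through.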

Sorry that the configuration is complicated, but unfortunately it is necessary given your current network setup.

kshnkvn commented 3 years ago

Thank you very much for your help!