wkeeling / selenium-wire

Extends Selenium's Python bindings to give you the ability to inspect requests made by the browser.
MIT License

Firefox bypasses Selenium Wire for localhost addresses #326

Closed zailaib closed 3 years ago

zailaib commented 3 years ago

client:

from selenium.webdriver import FirefoxProfile
from seleniumwire import webdriver

GECKO_PATH = './drivers/geckodriver'

def interceptor(request):
    del request.headers['User-Agent']
    request.headers['User-Agent'] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36"

def init_driver():
    options = dict()
    profile = FirefoxProfile()
    driver = webdriver.Firefox(
        executable_path=GECKO_PATH,
        seleniumwire_options=options,
        firefox_profile=profile
    )

    return driver

def main():
    driver = init_driver()
    driver.request_interceptor = interceptor
    driver.get('http://127.0.0.1:8000/')
    driver.quit()

if __name__ == '__main__':
    main()

service to print headers:

from sanic import Sanic
from sanic.response import json

app = Sanic("My Hello, world app")

@app.route('/')
async def test(request):
    print('-'*30)
    for k, v in dict(request.headers).items():
        print(f'"{k}": "{v}"')
    print('-'*30)

    return json({'hello': 'world'})

if __name__ == '__main__':
    app.run()

printed headers here:

------------------------------
"host": "127.0.0.1:8000"
"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:88.0) Gecko/20100101 Firefox/88.0"
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
"accept-language": "en-US,en;q=0.5"
"accept-encoding": "gzip, deflate"
"connection": "keep-alive"
"upgrade-insecure-requests": "1"
------------------------------

wkeeling commented 3 years ago

Thanks for raising this.

This is happening because Firefox bypasses Selenium Wire's internal proxy server for localhost (or 127.0.0.1) addresses. Selenium Wire uses this proxy to intercept and modify requests. So when bypassed, the header replacement doesn't work.

To tell Firefox not to do this, you need to set a Firefox-specific preference when you create the webdriver. See below; the changed lines are marked with +.

+ from selenium.webdriver import FirefoxProfile, FirefoxOptions

def init_driver():
+   firefox_options = FirefoxOptions()
+   firefox_options.set_preference('network.proxy.allow_hijacking_localhost', True)
    options = dict()
    profile = FirefoxProfile()
    driver = webdriver.Firefox(
        executable_path=GECKO_PATH,
+       firefox_options=firefox_options,
        seleniumwire_options=options,
        firefox_profile=profile
    )

    return driver

Chrome actually does the same thing and we've disabled it by default when you use the ChromeDriver, but we haven't done it for Firefox. I'll look at adding that in.

Thanks again!
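
For anyone wanting to check up front whether a URL is likely to be affected by the bypass described above, here is a minimal sketch. It is an approximation, not the exact rule either browser applies: it treats `localhost`, `*.localhost` and loopback IP literals such as 127.0.0.1 or ::1 as affected. The function name `targets_loopback` is hypothetical and not part of Selenium Wire.

```python
import ipaddress
from urllib.parse import urlsplit


def targets_loopback(url):
    """Return True if the URL's host is 'localhost' or a loopback IP literal.

    Hypothetical helper: an approximation of the hosts that browsers tend to
    send directly instead of through a configured proxy.
    """
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == 'localhost' or host.endswith('.localhost'):
        return True
    try:
        # Covers 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so an ordinary hostname.
        return False


for url in ('http://127.0.0.1:8000/', 'https://mywebsite.com/'):
    print(url, targets_loopback(url))
```

If this returns True for the page under test, the preference shown above is worth setting before concluding that an interceptor is not working.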

BinaryCanon commented 3 years ago

Greetings, I think I have the same problem with Chrome. I'm trying the example from the docs, but the Referer never changes:


from seleniumwire import webdriver
PATH = "/home/chromedriver"

def interceptor(request):
    del request.headers['Referer']  # Remember to delete the header first
    request.headers['Referer'] = 'https://referer.com'  # Spoof the referer

driver = webdriver.Chrome(executable_path=PATH)
driver.request_interceptor = interceptor
driver.get('https://mywebsite.com/')

wkeeling commented 3 years ago

@irfan315 how are you verifying that the header has not changed? Also, have you tried adding a brand new header?

BinaryCanon commented 3 years ago

@wkeeling Yes, I've tried this example too, no luck. I'm verifying using Chrome's developer tools, examining the headers under the 'network' tab.

wkeeling commented 3 years ago

@irfan315 Chrome developer tools won't show the changes because Selenium Wire modifies the requests after they leave the browser. You'll need to use http://httpbin.org/headers to verify the changes or, alternatively, inspect the requests captured by driver.requests.
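
A minimal sketch of the second option, assuming each captured request exposes `.url` and a `.headers` mapping with a `.get()` method. The helper name `header_values` is hypothetical, and the stand-in objects are there only so the sketch runs without a browser.

```python
from types import SimpleNamespace


def header_values(requests, header, url_contains=''):
    """Collect (url, value) pairs for one header across captured requests."""
    results = []
    for request in requests:
        if url_contains in request.url:
            # .get() returns None when the header is absent.
            results.append((request.url, request.headers.get(header)))
    return results


# Stand-in objects for illustration; with Selenium Wire you would pass
# driver.requests instead. Plain dicts are case-sensitive, unlike real
# HTTP header lookups.
captured = [
    SimpleNamespace(url='https://mywebsite.com/',
                    headers={'Referer': 'https://referer.com'}),
    SimpleNamespace(url='https://mywebsite.com/logo.png', headers={}),
]

for url, value in header_values(captured, 'Referer'):
    print(url, value)
```

With a live session, calling `header_values(driver.requests, 'Referer')` after `driver.get(...)` should list what Selenium Wire recorded for each request.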

BinaryCanon commented 3 years ago

@wkeeling Thanks, I understand now. Using http://httpbin.org/headers does indeed display it as you mentioned.

jaymc-arg commented 2 years ago

Chrome actually does the same thing and we've disabled it by default when you use the ChromeDriver, but we haven't done it for Firefox. I'll look at adding that in.

Thanks again!

Hi @wkeeling, I'm trying to intercept localhost requests for a local file from a Chrome/Chromium browser, but selenium-wire ignores localhost requests. Can you explain a bit more, please? This is what I'm trying:

import time
from seleniumwire import webdriver

chrome_options = webdriver.ChromeOptions()
try:
    driver = webdriver.Chrome(
        chrome_options=chrome_options
    )
except Exception as e:
    print(e)

driver.get('http://0.0.0.0:3000/screen/a')  # webpage I want to monitor

while True:
    for request in driver.requests:
        if request.response:
            if request.response.status_code == 200:
                print(
                    request.url,
                    request.response.status_code,
                    'Error')
                driver.refresh()
                time.sleep(5)

The requests made are something like this: http://localhost:3000/media/filename.mp4. Here is a thread on Stack Overflow that I've created.

Thanks in advance.