microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0

[Bug]: proxy not working for any browser #32902

Open jnganzh opened 5 days ago

jnganzh commented 5 days ago

Version

1.47

Steps to reproduce

from playwright.sync_api import sync_playwright
import json
import random

def test_proxy_ip():
    with sync_playwright() as p:
        proxy = {"server": "<my_proxy_server>:10001"}
        browser = p.firefox.launch(headless=True, proxy=proxy)

        try:
            context = browser.new_context()
            page = context.new_page()

            # Navigate to an IP checking website
            page.goto("https://api.ipify.org?format=json", wait_until="networkidle")

            # Get the IP address
            content = page.content()
            ip_json = page.evaluate("() => document.body.textContent")
            ip_address = json.loads(ip_json)['ip']

            print(f"Your IP address as seen by the website: {ip_address}")

        except Exception as e:
            print(f"An error occurred while fetching the IP: {e}")

        finally:
            browser.close()

def main():
    try:
        test_proxy_ip()
    except Exception as e:
        print(f"An error occurred: {e}")

if __name__ == "__main__":
    main()

Expected behavior

The script should return different IPs depending on whether I run it with or without the proxy server.

Actual behavior

The script returns the same IP no matter which proxy I use.

Additional context

I am sure the proxies work because the IP address changes when I use them with the requests library. I tried both the Chromium and Firefox browsers. This looks similar to https://github.com/microsoft/playwright-python/issues/2517
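For anyone wanting to repeat the "the proxy works outside Playwright" check without installing requests, here is a minimal sketch using only the Python standard library. The helper names `make_opener` and `public_ip` are made up for illustration, and the `http://` scheme on the proxy URL is an assumption about how the proxy is exposed; the ipify URL is the one from the script above.

```python
import json
import urllib.request


def make_opener(proxy_url=None):
    """Build a urllib opener that uses proxy_url, or no proxy at all.

    proxy_url is assumed to look like "http://<my_proxy_server>:10001".
    An empty dict tells ProxyHandler to ignore proxies from the environment.
    """
    if proxy_url:
        proxies = {"http": proxy_url, "https": proxy_url}
    else:
        proxies = {}
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


def public_ip(proxy_url=None, timeout=15):
    """Return the IP address that api.ipify.org reports for this request."""
    opener = make_opener(proxy_url)
    with opener.open("https://api.ipify.org?format=json", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["ip"]
```

Calling `public_ip()` and `public_ip("http://<my_proxy_server>:10001")` and comparing the two results gives a baseline that does not involve a browser. If the two values differ here but not in the Playwright script, the proxy itself is probably fine.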

Environment

- Operating System: [Windows 11, Linux]
- CPU: [x64]
- Browser: [Chromium]
- Python Version: [3.12.6]
- Other info:

Skn0tt commented 4 days ago

Hi @jnganzh! I'm having trouble reproducing what you describe. Could you provide a minimal reproduction based on a transparent Proxy?

To reproduce, I used your snippet and put in localhost:5555 as the proxy server. Then I started an HTTP server on that port using nc -l 5555. Running your script, this is what I see on the HTTP server:

❯ nc -l 5555
CONNECT api.ipify.org:443 HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:128.0) Gecko/20100101 Firefox/128.0
Proxy-Connection: keep-alive
Connection: keep-alive
Host: api.ipify.org:443
...

So from what I'm seeing on my machine, Playwright is definitely connecting via the HTTP proxy. Could you provide me with a reproduction case that uses nc -l 5555 as a mock proxy? That way, we can rule out that your proxy config is at fault.

jnganzh commented 4 days ago

Thanks for looking into this. I ran the code on an AWS EC2 instance and it actually works, so it's not an issue with the proxy or with playwright-python. I also ran this using Node.js. The issue seems to be caused by something in my environment: I am using Windows 11 with Python 3.12. Do you know what else could cause this?

Skn0tt commented 4 days ago

Sorry, no idea what this might be caused by. I'd recommend you try to debug it similarly to me with nc -l on Windows. https://nmap.org/ncat/ seems to be a good Windows-compatible implementation.

Skn0tt commented 4 days ago

You mention Python - this also lists how to run a debugging HTTP server on Windows with the Python stdlib: https://ryanblunden.com/create-a-http-server-with-one-command-thanks-to-python-29fcfdcd240e
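If installing ncat is inconvenient, the `nc -l 5555` check can be roughly approximated with Python's socket module, which also runs on Windows. This is only a sketch: the function names `open_listener` and `capture_first_request` are invented for illustration, and it listens for a single connection and returns whatever bytes arrive first rather than acting as a real proxy.

```python
import socket


def open_listener(host="127.0.0.1", port=5555):
    """Bind a plain TCP listener; pass port=0 to let the OS pick a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen(1)
    return server


def capture_first_request(server, timeout=30.0):
    """Accept one connection and return the first chunk of bytes it sends."""
    server.settimeout(timeout)
    try:
        conn, _addr = server.accept()
        with conn:
            conn.settimeout(timeout)
            return conn.recv(65536)
    finally:
        # The listener is single-use: close it whether or not a client arrived.
        server.close()
```

To use it, run `print(capture_first_request(open_listener()).decode("latin-1"))` in one terminal, set the proxy server in the reproduction script to `localhost:5555`, and run the script in another terminal. If the browser goes through the proxy, the output should begin with a `CONNECT api.ipify.org:443` line like the one in the nc output above. If the call times out instead, the browser most likely never contacted the proxy.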

jnganzh commented 4 days ago

Thanks for the tips Simon!