abhinavsingh / proxy.py

💫 Ngrok FRP Alternative • ⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework
https://abhinavsingh.com/proxy-py-a-lightweight-single-file-http-proxy-server-in-python/
BSD 3-Clause "New" or "Revised" License
2.92k stars 569 forks

proxy acceptors in embed non-block mode terminated unexpectedly #1274

Open leoleelili opened 1 year ago

leoleelili commented 1 year ago

I am trying to use proxy.py in embedded non-blocking mode to help collect data from the web. However, I found that it works only in non-embedded mode and fails in embedded mode.

Below is my test code (run on Windows 10, Python 3.7.9, proxy.py v2.4.3; the code is adapted from the article https://webelement.click/en/four_simple_steps_to_add_custom_http_headers_in_selenium_webdriver_tests_in_python):

import time

import proxy
import header_modifier
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options

selenium_proxy = webdriver.Proxy()

from proxy.common import utils
proxy_port = 8899
proxy_url = '127.0.0.1:' + str(proxy_port)
with proxy.Proxy(
        ['--host', '127.0.0.1',
         '--port', str(proxy_port),
         '--num-workers', '1',
         '--log-level','d',
         '--ca-cert-file', '/test/ws-ca.pem',
         '--ca-key-file', '/test/ws-ca.key',
         '--ca-signing-key-file', '/test/ws-signing.key'],
        plugins=
        [b'header_modifier.BasicAuthorizationPlugin',
         header_modifier.BasicAuthorizationPlugin]):
    from selenium.webdriver.common.proxy import ProxyType
    selenium_proxy.proxyType = ProxyType.AUTODETECT

from selenium.webdriver import DesiredCapabilities
capabilities = DesiredCapabilities.FIREFOX
selenium_proxy.add_to_capabilities(capabilities)

options = Options()
options.headless = True
options.add_argument('--proxy-server=%s' % proxy_url)

driver = webdriver.Firefox(options=options,capabilities=capabilities)
driver.get('https://www.webelement.click/stand/basic?lang=en')
time.sleep(5)
assert driver.find_element(By.TAG_NAME, 'h2').text == 'You have authorized successfully!'
driver.quit()

When I run the above code, it produces the following output:

2022-10-01 23:29:28,478 - pid:69024 [I] plugins.load:85 - Loaded plugin proxy.http.proxy.HttpProxyPlugin
2022-10-01 23:29:28,478 - pid:69024 [I] plugins.load:85 - Loaded plugin header_modifier.BasicAuthorizationPlugin
2022-10-01 23:29:28,478 - pid:69024 [I] plugins.load:85 - Loaded plugin main.BasicAuthorizationPlugin
2022-10-01 23:29:28,484 - pid:69024 [I] tcp.listen:82 - Listening on 127.0.0.1:8899
2022-10-01 23:29:28,713 - pid:69024 [D] pool._start:151 - Started acceptor#0 process 62732
2022-10-01 23:29:28,713 - pid:69024 [I] pool.setup:108 - Started 1 acceptors in threaded mode
2022-10-01 23:29:28,715 - pid:69024 [I] pool.shutdown:125 - Shutting down 1 acceptors
2022-10-01 23:29:29,869 - pid:62732 [D] acceptor.run:182 - Acceptor#0 shutdown
2022-10-01 23:29:29,949 - pid:69024 [D] pool.shutdown:130 - Acceptors shutdown
...

Please note that the acceptor was shut down almost immediately after it started.
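One thing that may be worth checking, assuming the indentation shown above matches the script that was actually run: a Python `with` block tears down its resource as soon as the block ends, so any code placed after the block runs against a server that has already stopped. The sketch below illustrates only that lifetime rule. `FakeProxy`, `run_outside` and `run_inside` are hypothetical names invented for this illustration and are not part of proxy.py.

```python
class FakeProxy:
    """Hypothetical stand-in for an embedded server; not part of proxy.py."""

    def __init__(self):
        self.running = False
        self.events = []

    def __enter__(self):
        # Called when the with block is entered.
        self.running = True
        self.events.append('started')
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called as soon as the with block is left.
        self.running = False
        self.events.append('shutdown')
        return False  # do not swallow exceptions


def run_outside():
    """Client work placed after the with block sees a stopped server."""
    server = FakeProxy()
    with server:
        pass  # only setup code lives inside the block
    return server.running  # evaluated after __exit__ has run


def run_inside():
    """Client work placed inside the with block sees a running server."""
    server = FakeProxy()
    with server:
        return server.running  # evaluated before __exit__ runs
```

If this is what is happening, moving the browser calls inside the `with` block would keep the proxy alive for the duration of the test.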

To verify the settings, I split the process into two steps. The first step is to start the proxy in standard (non-embedded) mode from the command line (the path of header_modifier.py needs to be added to PYTHONPATH first):

proxy --host 127.0.0.1 --port 8899 --num-workers 1 --log-level d --ca-cert-file /test/ws-ca.pem --ca-key-file /test/ws-ca.key --ca-signing-key-file /test/ws-signing.key --plugins header_modifier.BasicAuthorizationPlugin

The proxy appears to work well and does not terminate early:

2022-10-01 23:21:17,402 - pid:70548 [I] plugins.load:85 - Loaded plugin proxy.http.proxy.HttpProxyPlugin
2022-10-01 23:21:17,403 - pid:70548 [I] plugins.load:85 - Loaded plugin header_modifier.BasicAuthorizationPlugin
2022-10-01 23:21:17,410 - pid:70548 [I] tcp.listen:82 - Listening on 127.0.0.1:8899
2022-10-01 23:21:17,802 - pid:70548 [D] pool._start:151 - Started acceptor#0 process 44324
2022-10-01 23:21:17,802 - pid:70548 [I] pool.setup:108 - Started 1 acceptors in threaded mode

The second step is to test the proxy with the code below:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.proxy import ProxyType

selenium_proxy = webdriver.Proxy()

from selenium.webdriver.common.proxy import ProxyType
selenium_proxy.proxyType = ProxyType.AUTODETECT

from selenium.webdriver import DesiredCapabilities
capabilities = DesiredCapabilities.FIREFOX
selenium_proxy.add_to_capabilities(capabilities)

options = Options()
options.headless = True

proxy_port = 8899
proxy_url = '127.0.0.1:' + str(proxy_port)

#options.add_argument('--proxy-server=%s' % proxy) 
options.add_argument("--proxy-server=http://{0}".format(proxy_url))

driver = webdriver.Firefox(options=options,capabilities=capabilities)
#driver.maximize_window()

driver.get('https://www.webelement.click/stand/basic?lang=en')
assert driver.find_element(By.CSS_SELECTOR, '.post-body h2').text == 'You have authorized successfully!'

The test results show that it works as expected.

So it seems that standard mode works but embedded non-blocking mode doesn't. Can anyone help with this issue?

Thanks!
Leo

P.S. Below is the sample plugin code, copied from the reference site and saved as header_modifier.py:

import base64
from typing import Optional

from proxy.http.parser import HttpParser
from proxy.http.proxy import HttpProxyBasePlugin


class BasicAuthorizationPlugin(HttpProxyBasePlugin):
    """Modifies request headers."""

    def before_upstream_connection(
            self, request: HttpParser) -> Optional[HttpParser]:
        return request

    def handle_client_request(
            self, request: HttpParser) -> Optional[HttpParser]:
        # Build the "Basic <base64(user:password)>" value and attach it to the request.
        basic_auth_header = 'Basic ' + base64.b64encode('webelement:click'.encode('utf-8')).decode('utf-8')
        request.add_header('Authorization'.encode('utf-8'), basic_auth_header.encode('utf-8'))
        return request

    def on_upstream_connection_close(self) -> None:
        pass

    def handle_upstream_chunk(self, chunk: memoryview) -> memoryview:
        return chunk
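
For reference, the header value the plugin adds can be reproduced with the standard library alone. This is a minimal sketch of the Basic authentication encoding used in `handle_client_request`. The helper name `basic_auth_header` is hypothetical and only illustrates the encoding; it is not part of proxy.py or the referenced article.

```python
import base64


def basic_auth_header(username: str, password: str) -> bytes:
    """Return the value of a Basic Authorization header as bytes."""
    # Credentials are joined with a colon, then base64-encoded.
    credentials = f'{username}:{password}'.encode('utf-8')
    return b'Basic ' + base64.b64encode(credentials)


# Same credentials as in the plugin above.
header = basic_auth_header('webelement', 'click')
print(header.decode('utf-8'))  # Basic d2ViZWxlbWVudDpjbGljaw==
```

Decoding the part after `Basic ` with `base64.b64decode` gives back `webelement:click`, which is a quick way to check what the proxy is sending.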