abhinavsingh / proxy.py

💫 Ngrok FRP Alternative • ⚡ Fast • 🪶 Lightweight • 0️⃣ Dependency • 🔌 Pluggable • 😈 TLS interception • 🔒 DNS-over-HTTPS • 🔥 Poor Man's VPN • ⏪ Reverse & ⏩ Forward • 👮🏿 "Proxy Server" framework • 🌐 "Web Server" framework • ➵ ➶ ➷ ➠ "PubSub" framework • 👷 "Work" acceptor & executor framework
https://abhinavsingh.com/proxy-py-a-lightweight-single-file-http-proxy-server-in-python/
BSD 3-Clause "New" or "Revised" License

Support clients sending CONNECT requests without a HOST header field #5

Closed chijiao closed 5 years ago

chijiao commented 9 years ago

curl --proxy 127.0.0.1:1080 -v https://www.baidu.com

2015-07-22 23:13:53,431 - ERROR - pid:366 - Exception while handling connection <socket._socketobject object at 0x257abb0> with reason ValueError('need more than 1 value to unpack',)
Traceback (most recent call last):
  File "/usr/bin/proxy.py", line 494, in run
    self._process()
  File "/usr/bin/proxy.py", line 479, in _process
    if self._process_rlist(r):
  File "/usr/bin/proxy.py", line 447, in _process_rlist
    self._process_request(data)
  File "/usr/bin/proxy.py", line 367, in _process_request
    host, port = self.request.url.path.split(COLON)
ValueError: need more than 1 value to unpack
2015-07-22 23:13:53,433 - INFO - pid:366 - 127.0.0.1:38824 - CONNECT None:None
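
For readers following the traceback: the ValueError is what Python raises when a split yields fewer parts than there are names on the left-hand side. The sketch below reproduces that class of failure under the assumption that the parsed path contained no colon (the log does not show the actual value of self.request.url.path). The function names and the default port are hypothetical and are not part of proxy.py.

```python
COLON = b":"


def split_host_port(path):
    """Mimic the failing line: unpack whatever split() returns into two names."""
    host, port = path.split(COLON)  # raises ValueError unless exactly one colon
    return host, port


def safe_split_host_port(path, default_port=b"443"):
    """Split on the last colon and fall back to a default port if none is present.

    Assumption: 443 is a sensible default for CONNECT; a bracketed IPv6
    literal without a port is not handled by this sketch.
    """
    host, sep, port = path.rpartition(COLON)
    if not sep:
        # No colon at all: treat the whole value as the host.
        return path, default_port
    return host, port
```

With this shape, a value such as b"www.baidu.com" no longer aborts the connection handler; whether a default port is the right fallback is a design choice for the maintainer.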

maxvyaznikov commented 9 years ago

It works fine from my laptop:

$ python proxy.py --hostname 0.0.0.0 --port 3128 --log-level DEBUG
$ curl --proxy 127.0.0.1:3128 -v https://baidu.com

$ curl --version
curl 7.38.0 (x86_64-pc-linux-gnu) libcurl/7.38.0 OpenSSL/1.0.1f zlib/1.2.8 libidn/1.28 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API SPNEGO NTLM NTLM_WB SSL libz TLS-SRP

tuanchauict commented 6 years ago

It doesn't work when I use it with the Android emulator.

abhinavsingh commented 5 years ago

I'm unable to reproduce this at my end:

$ curl -p localhost:8899 -v https://www.baidu.com
* Rebuilt URL to: localhost:8899/
*   Trying ::1...
* TCP_NODELAY set
* Connection failed
* connect to ::1 port 8899 failed: Connection refused
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8899 (#0)
> GET / HTTP/1.1
> Host: localhost:8899
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 502 Bad Gateway
< Proxy-agent: proxy.py v0.3
< Content-Length: 11
< Connection: close
< 
* Closing connection 0
Bad Gateway* Rebuilt URL to: https://www.baidu.com/
*   Trying 45.113.192.102...
* TCP_NODELAY set
* Connected to www.baidu.com (45.113.192.102) port 443 (#1)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: C=CN; ST=beijing; L=beijing; OU=service operation department; O=Beijing Baidu Netcom Science Technology Co., Ltd; CN=baidu.com
*  start date: May  3 01:48:02 2018 GMT
*  expire date: May 26 05:31:02 2019 GMT
*  subjectAltName: host "www.baidu.com" matched cert's "*.baidu.com"
*  issuer: C=BE; O=GlobalSign nv-sa; CN=GlobalSign Organization Validation CA - SHA256 - G2
*  SSL certificate verify ok.
> GET / HTTP/1.1
> Host: www.baidu.com
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Cache-Control: private, no-cache, no-store, proxy-revalidate, no-transform
< Connection: Keep-Alive
< Content-Length: 2443
< Content-Type: text/html
< Date: Wed, 17 Oct 2018 09:49:42 GMT
< Etag: "58860411-98b"
< Last-Modified: Mon, 23 Jan 2017 13:24:33 GMT
< Pragma: no-cache
< Server: bfe/1.0.8.18
< Set-Cookie: BDORZ=27315; max-age=86400; domain=.baidu.com; path=/
< 
<!DOCTYPE html>
<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/bdorz/baidu.min.css><title>百度一下,你就知道</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus=autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=百度一下 class="bg s_btn" autofocus></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>新闻</a> <a href=https://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>地图</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>视频</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>贴吧</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&amp;tpl=mn&amp;u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>登录</a> </noscript> <script>document.write('<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u='+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" : "&")+ "bdorz_come=1")+ '" name="tj_login" class="lb">登录</a>');
                </script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">更多产品</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>关于百度</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>&copy;2017&nbsp;Baidu&nbsp;<a href=http://www.baidu.com/duty/>使用百度前必读</a>&nbsp; <a href=http://jianyi.baidu.com/ class=cp-feedback>意见反馈</a>&nbsp;京ICP证030173号&nbsp; <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>
* Connection #1 to host www.baidu.com left intact

@tuanchauict Are you setting up Android Emulator to use proxy.py? I don't expect any issues but I'll try to reproduce a similar environment at my end.

@chijiao What environment did you try this in?

abhinavsingh commented 5 years ago

Repurposing this issue with my findings from today. For posterity, here is the sequence of events and what I found:

  1. Planned to release proxy.py v0.3 on PyPI today
  2. Already had proxy.py running on a MacBook, with the system configured to use proxy.py for all HTTP and HTTPS requests
  3. To comply with the latest PyPI release process, I needed the twine Python package, and installing it ran into the following issue:
    $ pip3 --proxy http://localhost:8899 --retries 0 install -v twine
    Collecting twine
    1 location(s) to search for versions of twine:
    * https://pypi.org/simple/twine/
    Getting page https://pypi.org/simple/twine/
    Looking up "https://pypi.org/simple/twine/" in the cache
    Request header has "max_age" as 0, cache bypassed
    Starting new HTTPS connection (1): pypi.org:443
    Could not fetch URL https://pypi.org/simple/twine/: connection error: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/twine/ (Caused by ProxyError('Cannot connect to proxy.', timeout('timed out'))) - skipping
  4. Debugging with the proxy.py logs revealed that pip sends a CONNECT request without a Host header field. Example: CONNECT pypi.org:443 HTTP/1.0\r\n\r\n
  5. The above behavior doesn't conform to the HTTP specifications, as found in RFC 7231 and the ietf-http-wg discussions
  6. Taking clues from this existing issue, further investigation revealed that similar behavior is also exhibited by Android emulators (thanks @tuanchauict for pointing this out). Example: CONNECT 172.217.160.238:443 HTTP/1.1\r\n\r\n
  7. @chijiao I believe the curl version at your end also exhibited similar behaviour, as indicated by the CONNECT None:None log line you provided

Even though such clients do not seem to conform to the HTTP specifications, I believe supporting them will be useful for the community, since several clients exhibit the same behavior. A similar issue existed in Golang too (https://github.com/golang/go/issues/18215) and was later fixed in https://go-review.googlesource.com/c/go/+/44004/
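
One possible way to support such clients is to read the upstream host and port from the CONNECT request line itself, since the request-target of a CONNECT is already in host:port form, and to ignore whether a Host header is present. The sketch below is a minimal illustration under that assumption, not the change that went into proxy.py. The function name parse_connect_target and the default port are hypothetical.

```python
def parse_connect_target(raw, default_port=443):
    """Return (host, port) for a raw CONNECT request, with or without a Host header.

    Assumptions: `raw` holds at least the full request line, and the
    request-target is in host:port form as in the examples from this issue.
    """
    # Only the request line is needed; headers (if any) are ignored.
    request_line = raw.split(b"\r\n", 1)[0]
    method, target, _version = request_line.split(b" ", 2)
    if method != b"CONNECT":
        raise ValueError("not a CONNECT request")

    # Split on the last colon so a missing port falls back to the default.
    host, sep, port = target.rpartition(b":")
    if not sep:
        host, port = target, str(default_port).encode("ascii")
    if not host:
        raise ValueError("empty host in CONNECT target")
    return host.decode("ascii"), int(port)


# The two request shapes quoted in this issue, neither with a Host header:
pip_target = parse_connect_target(b"CONNECT pypi.org:443 HTTP/1.0\r\n\r\n")
emulator_target = parse_connect_target(b"CONNECT 172.217.160.238:443 HTTP/1.1\r\n\r\n")
```

Because the Host header is never consulted here, a client that does send one gets the same result as one that does not.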