net4people / bbs

Forum for discussing Internet censorship circumvention

Is it possible to implement a man-in-the-middle (MITM) tool to bypass censorship? #373

Open nlifew opened 6 days ago

nlifew commented 6 days ago

I spent some time thinking about this. Some talented developers have dedicated a lot of effort to obfuscating network traffic, making it appear as normal Chrome traffic. So why don't we implement an MITM tool to act as a true reverse proxy?

Here's my very simple idea: this tool listens for local SOCKS connections. Whenever a connection is made, it completes a TLS handshake with the client using a self-signed certificate and then forwards the decrypted requests to the reverse proxy server (this hop can use a trusted certificate). From any perspective, it completely avoids the issue of TLS-over-TLS.
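As a rough illustration of the first step, here is a minimal sketch of how such a tool might read the destination out of a SOCKS5 CONNECT request before starting the local TLS handshake. The function name `parse_socks5_connect` is hypothetical, and the sketch assumes the SOCKS5 method negotiation has already finished and the whole request is in one buffer.

```python
import ipaddress
import struct


def parse_socks5_connect(request: bytes) -> tuple[str, int]:
    """Return (host, port) from a SOCKS5 CONNECT request (RFC 1928 layout)."""
    if len(request) < 5:
        raise ValueError("request too short")
    version, command, _reserved, addr_type = request[:4]
    if version != 5 or command != 1:  # command 1 means CONNECT
        raise ValueError("not a SOCKS5 CONNECT request")

    # Work out where the address field ends for each address type.
    if addr_type == 1:  # IPv4, 4 bytes
        start, end = 4, 8
    elif addr_type == 3:  # domain name, prefixed by a 1-byte length
        start, end = 5, 5 + request[4]
    elif addr_type == 4:  # IPv6, 16 bytes
        start, end = 4, 20
    else:
        raise ValueError("unknown address type")
    if len(request) < end + 2:
        raise ValueError("truncated request")

    raw = request[start:end]
    if addr_type == 1:
        host = str(ipaddress.IPv4Address(raw))
    elif addr_type == 4:
        host = str(ipaddress.IPv6Address(raw))
    else:
        host = raw.decode("ascii")
    (port,) = struct.unpack("!H", request[end:end + 2])  # network byte order
    return host, port
```

The returned host is what the tool would put in the certificate it presents to the browser; certificate generation itself is left out of this sketch.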

Obviously, this tool is best suited for browsers. Firstly, the traffic characteristics of this tool entirely depend on the behavior of the SOCKS client, and browsers are naturally the best option. Secondly, browsers are the most friendly to self-signed certificates, whereas other apps do not trust self-signed certificates at all.

So, is this possible? Do dynamically generated elements within web pages (such as CAPTCHAs) support reverse proxies? I'm an outsider in this field and would appreciate hearing your opinions.

mmmray commented 5 days ago

do I understand it right, you want to strip the TLS from user traffic before transmission, and renegotiate it on the server?

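you can do this if you:

  • trust the server to not become compromised and leak user traffic
  • trust yourself to not become lazy and disable SSL verification instead of configuring a new root CA. This would be catastrophic if you disable VPN and the apps keep trying to connect. In browsers this is fine but in mobile apps, this is more tricky. On Android, you can use apk-mitm to try your idea, but it's not secure.
  • don't care about QUIC (it's usually better to block HTTP3 anyway)
  • still forward ALPN of the removed TLS correctly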

This is just off the top of my head. It seems to me this is fine for personal use but should probably not be widely promoted. But it's nice to solve TLS-in-TLS over CDN!

nlifew commented 4 days ago

do I understand it right, you want to strip the TLS from user traffic before transmission, and renegotiate it on the server?

you can do this if you:

  • trust the server to not become compromised and leak user traffic
  • trust yourself to not become lazy and disable SSL verification instead of configuring a new root CA. This would be catastrophic if you disable VPN and the apps keep trying to connect. In browsers this is fine but in mobile apps, this is more tricky. On Android, you can use apk-mitm to try your idea, but it's not secure.
  • don't care about QUIC (it's usually better to block HTTP3 anyway)
  • still forward ALPN of the removed TLS correctly

This is just off the top of my head. It seems to me this is fine for personal use but should probably not be widely promoted. But it's nice to solve TLS-in-TLS over CDN!

Yes, you are correct. This approach is essentially more like a packet capturing tool: it impersonates the server to handshake with the client and impersonates the client to handshake with the remote server. It decrypts all encrypted traffic.
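For the "impersonates the client" half, a minimal sketch using Python's standard `ssl` module might look like the following. It keeps certificate and hostname verification on and offers the same ALPN protocols upstream as on the local side. The function name `upstream_context` and the `offered_alpn` parameter are hypothetical; how the tool learns which protocols the browser offered is assumed to be handled elsewhere.

```python
import ssl


def upstream_context(offered_alpn: list[str]) -> ssl.SSLContext:
    """Build a client-side TLS context for the hop to the remote server."""
    # create_default_context() enables chain validation and hostname checks.
    # Leaving these on is the point: turning them off exposes the user to
    # real MITM attacks on the upstream hop.
    ctx = ssl.create_default_context()
    if offered_alpn:
        # Offer upstream what the browser offered locally, so both hops
        # have the chance to agree on the same application protocol.
        ctx.set_alpn_protocols(offered_alpn)
    return ctx
```

A caller would then wrap the outbound socket with something like `ctx.wrap_socket(sock, server_hostname=host)`, where `host` is the name the browser asked for.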

To censors, this appears from every perspective as genuine HTTP requests. However, there are significant limitations: it is suitable only for browsers, and it requires an uncensored server that you control, with a trusted certificate. If any of these conditions are not met, there is a risk of privacy leakage. Additionally, support for HTTP/3 needs to be considered.

This is not a universal solution but rather positioned as a "last resort" option to navigate through particularly strict censorship periods.

klzgrad commented 3 days ago

For the record: https://mailarchive.ietf.org/arch/msg/httpbisa/x5K6Bgoj4x-zzoKu4b-LCSMO5us/ https://isc.sans.edu/diary/Explicit+Trusted+Proxy+in+HTTP20+ornot+so+much/17708

wkrp commented 2 days ago

It's not a totally impossible idea. If the MITM proxy is trusted and run by you, on your own computer, there's not necessarily any loss of confidentiality or integrity. It's more brittle than straightforward end-to-end TLS, in the sense that it's easier to make a mistake that results in a loss of security, but if you're very careful it can probably be done right. You might run into some problems with certificate pinning.

Some circumvention tools have done local MITM. GoAgent, the first or one of the first domain fronting tools, domain-fronted HTTP and HTTPS requests through Google App Engine. Unlike, say, meek, GoAgent did not use App Engine as a simple conduit to a trusted remote proxy; it exited traffic directly from the App Engine servers. To make that work, GoAgent needed to be able to tell the App Engine server exactly what URL to fetch, and for that to work with HTTPS requests, GoAgent needed to be able to decrypt the TLS and parse the HTTP request inside. GoAgent installed a local trusted root certificate authority and MITMed all requests passing through the proxy.

I think this is basically what you have sketched, @nlifew. GoAgent wasn't doing it for the sake of a TLS fingerprint (which didn't matter as much in those days), but so that the local proxy could do domain fronting, because that is something that browsers cannot do themselves.

I know this because GoAgent had severe security bugs in its MITM implementation that exposed users to actual MITM attacks. Basically, every user had the same trusted private key for the trusted root "GoAgent" certificate authority, and upstream TLS connections were not properly validated.

The GoAgent CA certificate is used to do a local (intentional) man-in-the-middle of HTTPS connections between the browser and proxy.py. GoAgent works by encoding HTTP requests received by proxy.py and sending them to gae.py, where gae.py makes the encoded request. gae.py then encodes the HTTP response and sends it back to proxy.py, where it is decoded and returned to the browser. In order for GoAgent to work with HTTPS sites, it needs to undo the encryption so that gae.py will know what URL to request. When proxy.py receives a CONNECT request (meaning an HTTPS site is requested), it generates and serves a fake certificate signed by the GoAgent CA. From the user's point of view, all HTTPS sites are verified by "GoAgent". In some browsers, certificate pinning prevents the GoAgent technique from working for a small number of sites. (A consequence of GoAgent's model is that HTTPS is not end-to-end. It is HTTPS between the user and App Engine, and HTTPS between App Engine and the web site, but App Engine gets to see the plaintext.)
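To make the "know what URL to request" step concrete, here is a minimal sketch of how a local proxy could combine the CONNECT target with the request line it sees after decrypting the browser's TLS. This is not GoAgent's actual code; the function name `url_to_fetch` and its inputs are assumptions made for illustration.

```python
def url_to_fetch(connect_line: str, inner_request_line: str) -> str:
    """Rebuild the absolute HTTPS URL from CONNECT plus the decrypted request."""
    method, target, _version = connect_line.split()
    if method != "CONNECT":
        raise ValueError("expected a CONNECT request")
    host, sep, port = target.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError("expected host:port in CONNECT target")

    # The path is only visible once the local TLS layer has been removed.
    _inner_method, path, _inner_version = inner_request_line.split()

    # Omit the default HTTPS port from the URL.
    authority = host if port == "443" else f"{host}:{port}"
    return f"https://{authority}{path}"
```

Without the local MITM, the proxy would only see the `host:port` from CONNECT and an opaque TLS stream, so it could not produce the path part of the URL.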

I don't know the details, but Lantern at one point may also have used local MITM for domain fronting without an additional proxy hop or protocol overhead, which they called "direct domain fronting". I'm not sure whether or to what extent they still do it.

https://www.bamsoftware.com/papers/fronting/#sec:deploy-lantern-direct

Direct domain fronting

The Lantern network includes a geolocation server. This server is directly registered on the CDN and the Lantern client domain-fronts to it without using any proxies, reducing latency and saving proxy resources. This sort of direct domain fronting technique could in theory be implemented for any web site simply by registering it under a custom domain such as facebook.direct.getiantem.org. It could even be accomplished for HTTPS, but would require the client software to man-in-the-middle local HTTPS connections between browser and proxy, exposing the plaintext not only to the Lantern client but also to the CDN. In practice, web sites that use the CDN already expose their plaintext to the CDN, so this may be an acceptable solution.
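The core of domain fronting is that the name visible in the TLS handshake differs from the name inside the encrypted HTTP request. A minimal sketch of that split follows; the names `FrontedRequest` and `build_fronted_request` and the example domains are hypothetical, and it is not Lantern's implementation.

```python
from dataclasses import dataclass


@dataclass
class FrontedRequest:
    sni: str        # name sent in the clear in the TLS ClientHello
    payload: bytes  # HTTP request sent inside the encrypted channel


def build_fronted_request(front_domain: str, hidden_host: str,
                          path: str = "/") -> FrontedRequest:
    """Pair an innocuous SNI with a Host header naming the real destination."""
    payload = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hidden_host}\r\n"  # the CDN routes on this header
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return FrontedRequest(sni=front_domain, payload=payload)
```

For example, `build_fronted_request("front.example", "hidden.example")` yields a request whose handshake would name only `front.example`, while the Host header names `hidden.example`.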

Somewhat related, if you build a database of which domain names use which CDNs, and the IP addresses of CDN edge servers, you can send domain-fronted requests to a CDN edge server appropriate for each domain name. In this special case, you don't need any MITM or local proxy; the CDN edge server is effectively the proxy. This is what CacheBrowsing and CDNBrowsing do.
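The database idea can be sketched as two lookup tables and a selection function. Everything here is made up for illustration: the domains, CDN labels, and addresses (taken from documentation-reserved ranges) are placeholders, and this is not how CacheBrowsing or CDNBrowsing are actually implemented.

```python
import random
from typing import Optional

# Hypothetical data: which CDN serves each domain, and known edge servers.
DOMAIN_TO_CDN = {
    "news.example": "cdn-a",
    "video.example": "cdn-b",
}
CDN_EDGES = {
    "cdn-a": ["192.0.2.10", "192.0.2.11"],
    "cdn-b": ["198.51.100.7"],
}


def pick_edge(domain: str) -> Optional[str]:
    """Return an edge server IP for the domain's CDN, or None if unknown."""
    cdn = DOMAIN_TO_CDN.get(domain)
    if cdn is None:
        return None  # not in the database; another method is needed
    return random.choice(CDN_EDGES[cdn])
```

A client would connect to the returned address directly and send the request for `domain` there, so the edge server itself plays the role of the proxy.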