Create CDN compatible Websocket tunnels

grimpenmire commented 2 years ago

I've looked through existing issues, and I know the current view of the maintainers for using naiveproxy behind a CDN. However, I want to make a new argument for this.

For the past couple of months, I've been setting up and maintaining proxy servers for people in Iran (mainly v2ray based ones). The folks in Iran are in the rather unique and unfortunate position that they have their access to the global Internet shut down at critical times (like when there are mass protests, as there have been in the last two months).

Crucially, the data centers inside the country still have Internet access even when residential and mobile customers do not. So what we've been doing is setting up TLS-based proxy servers and putting them behind a CDN inside the country. This has been a saving grace for us, and that's how we've managed to keep people connected.

So I'm trying to see if this can be made to work with naiveproxy. I know naiveproxy uses CONNECT tunnels, which are not supported by CDNs. So we need a workaround, for example using an HTTP upgrade mechanism. I might want to take a stab at it myself if the maintainers are not interested in doing it, but I'd appreciate any pointers and ideas. I'm also interested to know if you'd still be against the idea given our use case.

klzgrad commented 2 years ago

not supported by CDNs

Do said CDNs support domain fronting? Or is it not the same class of usage affected by this https://en.wikipedia.org/wiki/Domain_fronting#Disabling ?

using an HTTP upgrade mechanism

Is this WebSocket or something else?

grimpenmire commented 2 years ago

This is not about domain fronting at all. We use our own domains. We just use the CDN to bypass the situation where only datacenters are connected to the global internet, while normal users only have access to a nation-wide intranet (which is also connected to the said CDN).

I'm thinking of using the same mechanism as Websocket (like v2ray does). That would work behind the CDN. Of course, that's just an idea, but it's the only thing I can think of to make this work behind a CDN.

grimpenmire commented 2 years ago

The main thing is, the CDN does not support CONNECT. We just need something that works with a GET, POST, or something like that.
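To make the difference concrete, here is a rough sketch of the two request heads as raw bytes. The `/` path, the host names, and the `X-Connect-Host` header are assumptions for illustration only (the header name is the one used by the Python test backend posted later in this thread); this is not naiveproxy's actual wire format.

```python
import base64
import os


def connect_request(target: str) -> bytes:
    """Classic proxy tunnel request, which the CDN does not support."""
    return (f"CONNECT {target} HTTP/1.1\r\n"
            f"Host: {target}\r\n"
            "\r\n").encode("ascii")


def upgrade_request(proxy_host: str, target: str) -> bytes:
    """A plain GET carrying WebSocket upgrade headers.

    X-Connect-Host is a hypothetical header that carries the tunnel target.
    """
    # Sec-WebSocket-Key is 16 random bytes, base64-encoded (RFC 6455).
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    return ("GET / HTTP/1.1\r\n"
            f"Host: {proxy_host}\r\n"
            "Upgrade: websocket\r\n"
            "Connection: Upgrade\r\n"
            f"Sec-WebSocket-Key: {key}\r\n"
            "Sec-WebSocket-Version: 13\r\n"
            f"X-Connect-Host: {target}\r\n"
            "\r\n").encode("ascii")
```

The second request uses an ordinary method, so from the CDN's point of view it is just a WebSocket handshake to the origin.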

klzgrad commented 2 years ago

The often requested "CDN feature" here is about obfuscating the SNI, which is domain fronting.

The issue with the idea you describe is that CDNs would not welcome a long-standing connection tunnel, whatever protocol it uses, and such a tunnel would not look like a typical use case or common traffic behavior. I think I saw some papers at net4people that mentioned long connections are being unconditionally interrupted in Iran. Naiveproxy is really designed with the assumption that long connections would work. So this is the main mismatch.

The other issue is this C++ project costs much more to add features than a Go project, and the main feature of perfectly mimicking chrome net stack isn't really proven by evidence to be the most important thing once you are past the level of having a utls stack set up and verified. The more important and fruitful work right now is to have more sophisticated traffic shaping and this would happen much faster in Go than in C++.

grimpenmire commented 2 years ago

While they do abominable things with Internet traffic in Iran, other TLS based solutions have been working as well as one can expect under the circumstances, and our CDN hasn't been causing much of an issue so far. Obviously this is far from ideal, but we're trying to work with what we have.

The reason I've been looking at naiveproxy has been mainly that it's not in common use in Iran and if I get it to work, I might be able to have it as a backup solution, because we already predict even harder days to come.

Still, I understand what you're saying about this being more difficult to handle in the C++ codebase than in Go (even though I personally have much more experience in C++), and that you might not be interested in working on it. I might try and start working on a more sophisticated solution myself anyways, be it a naiveproxy fork, or something based on utls.

Thanks for the help. I'm also having another issue with my current naiveproxy server, but I'm gonna have to open another ticket for that.

klzgrad commented 2 years ago

If you're ready to put in the effort, I can give advice, review, and accept PRs.

First, we need to minimize code changes to keep the long-term maintenance cost down, so try to find the best places to modify existing behaviors to support new use cases. In your case, try to abuse the https:// proxy scheme as a wss:// scheme (chrome can proxy a wss:// request, but cannot proxy stuff over a wss:// tunnel, which is what you're looking for). You can look at the http proxy client socket (for h1 wss) and the spdy proxy client socket (for h2 wss) and abuse them into dealing with upgrade headers instead of CONNECT headers. You can use proxy delegates to smuggle control data as headers with the proxy client socket so no API changes are needed.

grimpenmire commented 2 years ago

Having read the net4people post you mentioned in the other issue, I'm now not even sure if this is going to be worth the trouble if they are going as far as blocking Chrome's TLS fingerprint entirely. I need to see how this further develops, but if it turns out they are not actually going to permanently block use of Chrome, I'll definitely come back to this. Appreciate all the help.

openips commented 2 years ago

Following this.

grimpenmire commented 2 years ago

You are an expert in C++, you can study here.

Hardly. I've used C++ professionally for years of course, but I never call myself an expert. Anyways, I think I know enough C++ for this. What I don't know is the flow and structure of the code and this is a big codebase. For example, I don't even know how those functions you mentioned figure into this. Can you explain a bit more?

klzgrad commented 2 years ago

flow and structure of the code

https://source.chromium.org/chromium/chromium/src is immensely helpful in understanding a large codebase. Try clicking on functions to find back references.

grimpenmire commented 2 years ago

Thanks for the tips. So far I've managed to build naive with some logs put here and there to get a feel for things. I've read some of the stuff you sent and will read the rest later. Since I can't spend more than an hour or two per day on this, and that not everyday, I'm a bit slow, but hopefully I'll get there.

grimpenmire commented 2 years ago

Okay. I'm starting to slowly get the hang of this. I've actually got some nasty hacks that do work, at least with HTTP 1.1. I've got some issues with HTTP2 though and I need to debug. I was hoping I could get naiveproxy to dump a TLS key file to decrypt traffic in Wireshark. But looks like SSLKEYLOGFILE is not honored like it is in Chromium. Any (easy) way to make that work? Or otherwise take a look at what is actually sent on the wire?

grimpenmire commented 1 year ago

I'm almost (but not completely) certain at this point that websocket over HTTP/2 (RFC 8441) is not supported by Cloudflare (and one other CDN I tested). One sign of that is that Chrome itself chooses to use HTTP/1.1 for websockets when talking to a website behind CF, even though HTTP/2 is used for other content on the same website.

My trouble is that with the hacky approach I wanted to use, naiveproxy chooses HTTP/2 to talk to the server (since it supports it), and then the CDN does not like what happens next and sends a 400 error back. One option might have been to disable HTTP/2 on the CDN side, but apparently CF does not allow you to do that on the free plan.

I can think of two ways to get around this:

1) Write a whole new proxy that tunnels stuff through an actual websocket connection. This would be the nicest approach probably, and really not that hard if I wanted to write a separate proxy, but I'm not sure where to start to add that in this codebase.

2) Somehow force HTTP/1.1 for the proxy client-side. Not sure if this has (major) downsides or not. What do you think? Can you point me in the right direction on how this can actually be done?

klzgrad commented 1 year ago

I think you can set alpn to http/1.1 only somewhere.

grimpenmire commented 1 year ago

Wouldn't that change the expected TLS signature of Chrome?

grimpenmire commented 1 year ago

But then again, that's probably what Chrome itself does when it wants to force HTTP/1.1...

klzgrad commented 1 year ago

You can verify what chrome does by capturing the tls clienthello used for wss://.

Chrome added support for ws over h2 recently but there is some config that makes this not so easily turned on.

klzgrad commented 1 year ago

https://bugs.chromium.org/p/chromium/issues/detail?id=801564 reading this I think there is some server feature detection logic missing in naiveproxy. Should not require manual overriding of alpn.

grimpenmire commented 1 year ago

The main problem for me is that the CDN doesn't support it. In the browser, this is detected from h2 settings (I assume) and the browser switches back to 1.1. I assume that's how it works, because Chrome requires an existing H2 connection to the website, in order to use websocket-over-http2.

In naiveproxy however, we just detect h2 support, which is not the same as websocket-over-h2 support. And since I am trying to make a normal h2 stream look like websocket, it fails with the CDN.
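For reference, RFC 8441 has a server advertise websocket-over-h2 support through the `SETTINGS_ENABLE_CONNECT_PROTOCOL` setting (identifier 0x8) in its HTTP/2 SETTINGS frame. The sketch below shows what checking for that setting could look like on a raw SETTINGS payload; it is only an illustration of the check, not code from Chrome or naiveproxy, and the function name is made up.

```python
import struct

# Setting identifier defined by RFC 8441 (WebSocket over HTTP/2).
SETTINGS_ENABLE_CONNECT_PROTOCOL = 0x8


def supports_extended_connect(settings_payload: bytes) -> bool:
    """Scan an HTTP/2 SETTINGS payload for the RFC 8441 setting.

    The payload is a sequence of 6-byte entries: a 16-bit identifier
    followed by a 32-bit value. A later entry overrides an earlier one.
    """
    enabled = False
    for offset in range(0, len(settings_payload) - 5, 6):
        ident, value = struct.unpack_from("!HI", settings_payload, offset)
        if ident == SETTINGS_ENABLE_CONNECT_PROTOCOL:
            enabled = value == 1
    return enabled
```

A server that speaks h2 but never sends this setting supports normal h2 streams only, which is the distinction described above.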

Anyways, I'll take a look at the actual Chrome client hello, and if I see only HTTP/1.1 in ALPN, I might go on with that solution (if I can make it work, obviously!).

grimpenmire commented 1 year ago

Update: confirmed with wireshark. As expected, chrome sends a clienthello, with ALPN containing only http/1.1.
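As a small illustration of what "ALPN containing only http/1.1" means on the client side, here is a sketch using Python's `ssl` module. It only demonstrates the ALPN offer; it says nothing about how naiveproxy's C++ code sets this, and a Python client's ClientHello will not otherwise resemble Chrome's.

```python
import ssl


def http1_only_context() -> ssl.SSLContext:
    """Build a TLS client context that offers only http/1.1 in ALPN."""
    ctx = ssl.create_default_context()
    # Offering a single protocol means the server cannot negotiate h2.
    ctx.set_alpn_protocols(["http/1.1"])
    return ctx
```

After wrapping a socket with this context and completing the handshake, `selected_alpn_protocol()` on the wrapped socket reports what the server picked.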

grimpenmire commented 1 year ago

Would you mind taking a look at this to see if it's an acceptable approach?

https://github.com/grimpenmire/naiveproxy/commit/3a8480c69c520bd685abf7091bb1200cdd414f01

It seems to be working with a quick backend I put together using python (which does not support padding protocol yet of course).

grimpenmire commented 1 year ago

Sure. Here goes.

#!/usr/bin/env python3

import socket
import select
from hashlib import sha1
from base64 import b64encode
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get('upgrade', '').lower() != 'websocket':
            self.return_camouflage()
            return

        if self.headers.get('connection', '').lower() != 'upgrade':
            self.return_camouflage()
            return

        key = self.headers.get('sec-websocket-key')
        if not key:
            self.return_camouflage()
            return

        connect_host = self.headers.get('x-connect-host')
        if not connect_host:
            self.return_camouflage()
            return

        self.send_response(101)

        self.send_header('Upgrade', 'websocket')
        self.send_header('Connection', 'Upgrade')

        accept = key + '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'
        accept = sha1(accept.encode('ascii')).digest()
        accept = b64encode(accept).decode('ascii')
        self.send_header('Sec-Websocket-Accept', accept)

        self.end_headers()

        sock = self.connect(connect_host)
        if sock is None:
            return

        sock.setblocking(False)
        self.request.setblocking(False)

        while True:
            ready, _, _ = select.select([self.request, sock], [], [])

            if self.request in ready:
                chunk = self.rfile.read()
                if not chunk:
                    break
                sock.sendall(chunk)

            if sock in ready:
                chunk = sock.recv(1400)
                if not chunk:
                    break
                self.wfile.write(chunk)

    def return_camouflage(self):
        page = b'<html><body>foobar</body></html>'
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.send_header('Content-Length', str(len(page)))
        self.end_headers()
        self.wfile.write(page)

    def connect(self, hostname):
        # Default to port 80 when the header carries no explicit port.
        host, port = hostname, 80
        if ':' in hostname:
            host, port = hostname.rsplit(':', 1)
            port = int(port)

        sock = None
        for res in socket.getaddrinfo(host, port, socket.AF_UNSPEC,
                                      socket.SOCK_STREAM):
            af, socktype, proto, canonname, sa = res
            try:
                sock = socket.socket(af, socktype, proto)
            except OSError:
                continue

            try:
                sock.connect(sa)
            except OSError:
                sock.close()
                continue

            break

        return sock

def main():
    server = ThreadingHTTPServer(('localhost', 2000), MyHandler)
    server.serve_forever()

if __name__ == '__main__':
    main()

openips commented 1 year ago

You can push your PR.

grimpenmire commented 1 year ago

I need to see if I can make a better backend (hopefully an updated forwardproxy or something like that, though I've got no experience with golang), make sure paddings work, and then send in a PR.

klzgrad commented 1 year ago

Can this faux websocket transit through CDNs with handshakes but without websocket framing? In the current design the faux websocket requires client and server side to create the opposite faux websocket with framing only but without handshakes. This is kind of inconvenient for integration with other proxy systems.

grimpenmire commented 1 year ago

I've tested it with Cloudflare, and it does work. We could add websocket framing, but I'm not sure if that's gonna be worth the trouble, since it would make the client more complex, and we'd still need a custom backend, to make it work like a CONNECT tunnel.

So this seems to be the minimum implementation that allows travel through a CDN, unless of course other CDNs actually parse whole websocket streams, but I think that's unlikely.
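For comparison, this is roughly what full websocket framing would add on the client side for every chunk of tunnel data, following the base framing rules of RFC 6455 section 5.2. It is a sketch of a single unfragmented, masked frame only; the function name is made up, and control frames and fragmentation are not handled.

```python
import struct


def encode_client_frame(payload: bytes, mask_key: bytes, opcode: int = 0x2) -> bytes:
    """Encode one final, masked client-to-server frame (default: binary)."""
    if len(mask_key) != 4:
        raise ValueError("mask key must be 4 bytes")
    head = bytes([0x80 | opcode])  # FIN bit set, no extension bits
    n = len(payload)
    if n < 126:
        head += bytes([0x80 | n])  # MASK bit plus 7-bit length
    elif n < 1 << 16:
        head += bytes([0x80 | 126]) + struct.pack("!H", n)  # 16-bit length
    else:
        head += bytes([0x80 | 127]) + struct.pack("!Q", n)  # 64-bit length
    # Client frames must be XOR-masked with the 4-byte key.
    masked = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return head + mask_key + masked
```

The handshake-only approach skips all of this and writes the tunnel bytes directly after the 101 response, which is why anything that tries to parse the stream as websocket will not make sense of it.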

grimpenmire commented 1 year ago

How would this make integration with other proxies inconvenient? I'm not sure I follow that part?

klzgrad commented 1 year ago

It's not an RFC conforming implementation of wss so it would be confusing to use the name. Client and server network libraries would expect a web socket with framing. You can check v2ray and see if they use this websocket handshake only tls socket or full websocket. Without a conforming implementation it could be problematic for interoperability.

If this is really needed, it will have to use a separate name, wss-handshake:// or something.

openips commented 1 year ago

This PR cannot work with forwardproxy; a new backend application would have to be added. It's a lot of coding work.

klzgrad commented 1 year ago

I don't see what is difficult with modifying forward proxy for this as it's only header logic.

openips commented 1 year ago

I don't see what is difficult with modifying forward proxy for this as it's only header logic.

Nice. Waiting for the new forwardproxy. Thanks for the great tools.

grimpenmire commented 1 year ago

It's not an RFC conforming implementation of wss so it would be confusing to use the name. Client and server network libraries would expect a web socket with framing. You can check v2ray and see if they use this websocket handshake only tls socket or full websocket. Without a conforming implementation it could be problematic for interoperability.

If this is really needed, it will have to use a separate name, wss-handshake:// or something.

This I agree with. Calling it something other than "wss" makes more sense. A complete websocket implementation is imo too much of a hassle for this purpose. And the backend might be beyond me with my current golang knowledge (or lack thereof!). The header logic I might be able to handle in golang; I'm yet to take an actual look at it though. Life has thrown a few wrenches in my way, and time has become even more scarce atm!

grimpenmire commented 1 year ago

What are those logs from? Looks like whatever server you're using is actually trying to parse the tunnel contents as websocket, which is obviously not going to work.

grimpenmire commented 1 year ago

Ah, okay. Makes sense then. Anyways, this is because we are not implementing websocket framing. Can you think of a case where that actually matters or causes a problem? Because otherwise, as I said before, I think that's just needless complication.

grimpenmire commented 1 year ago

I haven't gotten to that part yet (I'm being real slow, I know!), but I imagine I need to do that using a custom header since Proxy-Authorization is a hop-by-hop header and the CDN will probably strip it. The custom header will be translated back by a caddy middleware. I'm just spending whatever little time I have learning and tinkering with golang, because it's just too much fun! :smile:
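The translation step itself is small. Below is a sketch of the idea in Python (the real thing would be a caddy middleware in Go, which I have not written yet); the `X-Proxy-Auth` header name is purely hypothetical.

```python
def restore_proxy_auth(headers: dict, custom_name: str = "X-Proxy-Auth") -> dict:
    """Return a copy of headers with the custom header renamed back.

    custom_name is a made-up header that survives the CDN; header names
    are matched case-insensitively.
    """
    restored = {}
    for name, value in headers.items():
        if name.lower() == custom_name.lower():
            # Put the credentials back where the proxy expects them.
            restored["Proxy-Authorization"] = value
        else:
            restored[name] = value
    return restored
```

The client would do the reverse before sending, so only the custom header ever crosses the CDN.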

grimpenmire commented 1 year ago

I don't see how encrypting the password is going to help with anything. In this case, the encrypted password would be the password to use then, and there'd be no practical difference. Anyone having that encrypted password (which is visible to the CDN) can use it to connect to the proxy.

As to merging the python logic with the C++ code, I'm not sure I understand this. How can the server logic be moved to the client?

grimpenmire commented 1 year ago

Okay. I think I understand now what you mean by using http_proxy_socket.cc. I would have rather modified forwardproxy, if only for my own learning, but I encountered two issues. First I tried creating a middleware, but the middleware can only (easily) change the request, I need to also change the response, and that is either not possible when calling forwardproxy's ServeHTTP, or at least only possible by passing a custom ResponseWriter.

I then tried modifying forwardproxy itself, but the response seems to be weirdly missing three bytes at the beginning (I can only force curl to look at it by passing --http0.9). There's also two sets of headers sent! No idea what's happening so far.

Anyways, I might try your method when I get a little bit of time.

As to CDN seeing the credentials, I don't think there's any easy way of preventing that. Simply encoding the credentials with another key is not enough, since the CDN can use the encoded value as easily. We'd need some sort of challenge-response protocol in-place to prevent that, which is definitely outside the scope of what I'm doing.
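Just to show what I mean by challenge-response, here is a minimal HMAC-based sketch. The function names and the 16-byte nonce size are my own choices for illustration, and this is not something I'm proposing to build here.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> bytes:
    """Server side: a fresh random nonce for each connection attempt."""
    return secrets.token_bytes(16)


def respond(shared_secret: bytes, challenge: bytes) -> str:
    """Client side: prove knowledge of the secret without sending it."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).hexdigest()


def verify(shared_secret: bytes, challenge: bytes, response: str) -> bool:
    """Server side: constant-time comparison against the expected value."""
    expected = respond(shared_secret, challenge)
    return hmac.compare_digest(expected, response)
```

A response recorded by the CDN is useless against a new nonce, unlike a static encoded password. It does cost an extra round trip, though, and it does not stop an intermediary from taking over a connection after it has been authenticated.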

klzgrad commented 1 year ago

So what we've been doing is setting up TLS-based proxy servers and putting them behind a CDN inside the country.

If it is a CDN inside the country, what stops the censor from controlling the CDN?

If the censor does not control the CDN, the credentials leaking to the CDN is not part of the threat model.

grimpenmire commented 1 year ago

There's really nothing stopping them from controlling the CDN. We're actually not sure what the deal is. It could be incompetence (which is likely), or that they don't care to close all the holes, or that the folks working at the CDN provider want to help others by letting this loophole work. FWIW, they've recently started sending "fair use" warnings for all my servers, so maybe the good days are numbered.

Anyways, I don't see a reasonable way of hiding things from the CDN in a simple proxy like naive. They could get us if they really want to, and then we'd just have to find another way. The good ol' cat and mouse game.

grimpenmire commented 1 year ago

If you mean by a malicious CDN, no idea. If by other third-parties, the path should be enough, right? And we could or could not have the username/password, or we could just send that as the path.

But frankly, I'm a little stuck with the backend. No idea why my modified forwardproxy doesn't work (probably something stupid I did, but still!). You pointed out before something about being able to use naiveproxy as forwardproxy, but that doesn't seem to work for me either, even without wss mode (using the haproxy config as a frontend). haproxy forwards stuff to naive and then nothing is sent until connection times out.

grimpenmire commented 1 year ago

Not sure I understand the context. Is this with the python backend? If so, do you see any errors on the backend side? Does it happen with any web page you visit?

grimpenmire commented 1 year ago

This is very strange indeed. There are some extra spaces in those logs, but I'm assuming those are related to something with copying the logs. Does the python server print any errors while this happens? I haven't seen anything like this happen to me.

FWIW, I'm starting to doubt if I am going to actually finish this. My modified golang server (caddy/forwardproxy) does weird things and I can't quite understand why. Also I'm not even sure we came up with an acceptable solution for the credentials issue either.

grimpenmire commented 1 year ago

I'm giving my golang backend one last try. I finally managed to implement it in a caddy middleware, which is nice. But it doesn't work! Inspecting the traffic using wireshark, I see an extra set of headers are sent after the 101 response. This extra response has a 200 status, but contains all the websocket headers we add in the middleware (it's also chunked, and I have no idea who does that). Some logging proves that the code to add the websocket headers is only called once.

Anyways, the extra response obviously breaks everything. If you could take a look at it and see if you can spot any obvious issues, I'd be very grateful. You can see the changes here: https://github.com/klzgrad/forwardproxy/compare/naive...grimpenmire:forwardproxy:wss

I build this using xcaddy build --with github.com/caddyserver/forwardproxy@caddy2=. and then run caddy using this Caddyfile:

{
  auto_https disable_redirects
  order forward_proxy before file_server
  order wss_handshake_tunnel before forward_proxy
}
:443, grimpmie.xyz {
  tls /root/.acme.sh/grimpmie.xyz/grimpmie.xyz.cer /root/.acme.sh/grimpmie.xyz/grimpmie.xyz.key
  log {
    output stderr
  }
  wss_handshake_tunnel
  forward_proxy {
    basic_auth user pass
    hide_ip
    hide_via
    probe_resistance
  }
  file_server {
    root /var/www/html
  }
}

grimpenmire commented 1 year ago

The unknown opcodes are expected, and as far as I understand, they should be completely harmless. There is no entity in between that actually attempts to parse the websocket protocol. Wireshark obviously does that of course, and that's why I just look at the raw data there.

As to occasional freezes, the only thing I can think of is that the Python server I put together in a few minutes is probably far from an ideal server. For example, it's multithreaded, which is something that's generally to be avoided in Python. It's also likely that there are some situations that are not handled as gracefully as they should.

Anyways, I'm sure the current issue with the golang backend will not be fixed even if we add appropriate websocket framing. There is clearly an extra set of headers there which would cause trouble in any case.

grimpenmire commented 1 year ago

Sec-Websocket-Key is not for authentication. It's just a mechanism to prevent certain kinds of abuse, and also possibly prevent badly behaving HTTP caches from caching the content.
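The server's side of it is a fixed transformation that anyone can compute, which is why it cannot authenticate anything. A standalone version of the calculation from RFC 6455, using the same GUID as in the Python backend above:

```python
from base64 import b64encode
from hashlib import sha1

# Fixed GUID from RFC 6455; it is public, not a secret.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key: str) -> str:
    """Compute Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = sha1((key + WS_GUID).encode("ascii")).digest()
    return b64encode(digest).decode("ascii")


# Sample key from RFC 6455 section 1.3.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```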

klzgrad commented 1 year ago

an extra set of headers are sent after the 101 response. This extra response has a 200 status, but contains all the websocket headers we add in the middleware (it's also chunked, and I have no idea who does that

Looks like you need to locate who is creating this 200 reply to proceed.
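One way to narrow it down is to capture the raw bytes the server sends back for a single upgrade request and count the response heads in them. A small helper sketch for that (the function name is made up, and the sample bytes in the comment are synthetic, not taken from a real capture):

```python
import re

# Matches an HTTP/1.x status line at the start of any line.
STATUS_LINE = re.compile(rb"^HTTP/1\.[01] (\d{3})[^\r\n]*\r\n", re.MULTILINE)


def status_codes(raw: bytes) -> list:
    """Return every HTTP/1.x status code that starts a line in raw bytes."""
    return [int(m.group(1)) for m in STATUS_LINE.finditer(raw)]


# Synthetic example of the symptom described above:
# status_codes(b"HTTP/1.1 101 Switching Protocols\r\n\r\n"
#              b"HTTP/1.1 200 OK\r\n\r\n")
```

This is only a heuristic, since tunnel payload could contain a matching line by coincidence, but running it after each handler in the chain is enabled or disabled should show which one writes the second head.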

3aaber commented 1 year ago

Most CDN providers have problems connecting to the upstream over HTTP/2. Is there any way for naiveproxy to be compatible with this?

triggered96 commented 1 year ago

Most CDN providers have problems connecting to the upstream over HTTP/2. Is there any way for naiveproxy to be compatible with this?

I have given up on wss because it is not perfect: compared with CONNECT, it is unstable and the flow is severe. CONNECT already feels quite satisfactory, which also confirms why the author @klzgrad has never added WSS: it is unnecessary at present.

grimpenmire commented 1 year ago

Yeah, turned out to be too much of a trouble (not that I've spent a lot of time on it in the past couple of weeks!). Thanks for all the help folks.

3aaber commented 1 year ago

What about gRPC? Some CDN providers, like Cloudflare, can connect to the upstream (proxy server) over gRPC. Something like this:

User <---HTTP/2 ---> CDN <---- gRPC ----> Proxy Server < ---- > Free Internet

klzgrad / naiveproxy

Create CDN compatible Websocket tunnels #390