DNS tunnel that can do DoH and DoT

wkrp commented 4 years ago

dnstt is a new DNS tunnel that works with DNS over HTTPS and DNS over TLS resolvers, designed according to the Turbo Tunnel idea.

https://www.bamsoftware.com/software/dnstt/

git clone https://www.bamsoftware.com/git/dnstt.git

How is it different from other DNS tunnels?

It works with DNS over HTTPS (DoH) and DNS over TLS (DoT) resolvers, which makes it more difficult for a network observer to tell that a tunnel is being used.
It embeds a proper reliability and session protocol (KCP+smux). The client and server can send and receive simultaneously, and the client doesn't have to wait for one query to receive a response before sending the next query. Having multiple queries in flight helps with performance. (This is the Turbo Tunnel concept.)
It encrypts and authenticates the tunnel end to end, separately from the DoH/DoT encryption, using a Noise protocol.

.------.  |            .--------.               .------.
|tunnel|  |            | public |               |tunnel|
|client|<---DoH/DoT--->|resolver|<---UDP DNS--->|server|
'------'  |c           '--------'               '------'
   |      |e                                       |
.------.  |n                                    .------.
|local |  |s                                    |remote|
| app  |  |o                                    | app  |
'------'  |r                                    '------'

A DNS tunnel like this can be useful for censorship circumvention. Think of a censor that can observe the client⇔resolver link, but not the resolver⇔server link (the vertical line in the diagram). Traditional UDP-based DNS tunnels are generally considered easy to detect, both because of the unusual format of the DNS messages they generate and because every DNS message must be tagged with the domain name of the tunnel server (that is how the recursive resolver in the middle knows where to forward it). But with DoH or DoT, the DNS messages on the client⇔resolver link are encrypted, so the censor cannot trivially see that a tunnel is being used. (Of course, it may still be possible to heuristically detect a tunnel based on the volume and timing of the encrypted traffic; encryption alone doesn't solve that.)

I intend this software release to be a demonstration of the potential of this kind of design for a tunnel. Currently the software doesn't provide a TUN/TAP network interface, or even a SOCKS or HTTP proxy interface. It only connects a local TCP socket to a remote TCP socket. Still, you can fairly easily set it up to work like an ordinary SOCKS or HTTP proxy; see below.

DNS zone setup

A DNS tunnel works by having the tunnel server act as the authoritative name server for a specific DNS zone. The resolver in the middle acts as a proxy by forwarding queries for subdomains of that zone to the tunnel server. To set up a DNS tunnel, you need a domain name and a host where you can run the server.

Let's say your domain name is example.com and your host's IP addresses are 203.0.113.2 and 2001:db8::2. Go to your name registrar's configuration panel and add three new records:

A   tns.example.com points to 203.0.113.2
AAAA    tns.example.com points to 2001:db8::2
NS  t.example.com   is managed by tns.example.com

The tns and t labels can be anything you want, but the tns label should not be a subdomain of the t label (everything under that subdomain is reserved for tunnel payloads). The t label should be short because there is limited space in a DNS message, and the domain name takes up part of it.
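
The naming rule above (the name server's own name must not sit inside the tunnel zone) is easy to get wrong, so here is a minimal sketch of a check. The function name ns_outside_zone is made up for illustration and is not part of dnstt; it only compares the two names as strings and does no DNS lookups.

```sh
# ns_outside_zone NS_NAME ZONE   (hypothetical helper, not part of dnstt)
# Succeeds (exit 0) when NS_NAME is neither the zone itself nor a name under it.
ns_outside_zone() {
    local ns zone
    # Lowercase both names and drop any trailing dot before comparing.
    ns=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    ns=${ns%.}
    zone=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    zone=${zone%.}
    case $ns in
        "$zone"|*."$zone") return 1 ;;  # inside the tunnel zone: reserved for payloads
        *) return 0 ;;
    esac
}
```

With the example records, ns_outside_zone tns.example.com t.example.com succeeds, while ns_outside_zone tns.t.example.com t.example.com fails.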

Tunnel server setup

Run these commands on the server host; i.e. the one at tns.example.com / 203.0.113.2 / 2001:db8::2 in the example above.

cd dnstt-server
go build

First you need to generate crypto keys for the end-to-end tunnel encryption.

./dnstt-server -gen-key -privkey-file server.key -pubkey-file server.pub
privkey written to server.key
pubkey  written to server.pub

Now run the server. 127.0.0.1:8000 is the TCP address ("remote app" in the diagram above) to which incoming tunnelled streams will be forwarded.

./dnstt-server -udp :5300 -privkey-file server.key t.example.com 127.0.0.1:8000

The tunnel server needs to be reachable on port 53. You could have it bind to port 53 directly (-udp :53), but that would require you to run the server as root. It's better to run the server on a non-privileged port as shown above, and use port forwarding to forward port 53 to it. On Linux, these commands will forward port 53 to port 5300:

sudo iptables -I INPUT -p udp --dport 5300 -j ACCEPT
sudo iptables -t nat -I PREROUTING -i eth0 -p udp --dport 53 -j REDIRECT --to-ports 5300
sudo ip6tables -I INPUT -p udp --dport 5300 -j ACCEPT
sudo ip6tables -t nat -I PREROUTING -i eth0 -p udp --dport 53 -j REDIRECT --to-ports 5300

You also need something for the tunnel server to connect to. It could be a proxy server or anything else. For testing, you can use an Ncat listener:

sudo apt install ncat
ncat -lkv 127.0.0.1 8000

Tunnel client setup

cd dnstt-client
go build

Copy server.pub (the public key file) from the server to the client. You don't need server.key (the private key file) on the client.

Choose a DoH or DoT resolver. There is a list of DoH resolvers here:

https://github.com/curl/curl/wiki/DNS-over-HTTPS#publicly-available-servers

And a list of DoT resolvers here:

To use a DoH resolver, use the -doh option:

./dnstt-client -doh https://doh.example/dns-query -pubkey-file server.pub t.example.com 127.0.0.1:7000

For DoT, use -dot:

./dnstt-client -dot dot.example:853 -pubkey-file server.pub t.example.com 127.0.0.1:7000

127.0.0.1:7000 specifies the client end of the tunnel. Anything that connects to that port ("local app" in the diagram above) will be tunnelled through the resolver and connected to 127.0.0.1:8000 on the tunnel server. You can test it using an Ncat client; run this command, and anything you type into the client terminal will appear on the server, and vice versa.

ncat -v 127.0.0.1 7000

How to make a standard proxy

You can make the tunnel work like an ordinary proxy server by having the tunnel server forward to a standard proxy server. I find it convenient to use Ncat's HTTP proxy server mode.

ncat -lkv --proxy-type http 127.0.0.1 3128
./dnstt-server -udp :5300 -privkey-file server.key t.example.com 127.0.0.1:3128

On the client, configure your applications to use the local end of the tunnel (127.0.0.1:7000) as an HTTP/HTTPS proxy:

./dnstt-client -doh https://doh.example/dns-query -pubkey-file server.pub t.example.com 127.0.0.1:7000
curl -x http://127.0.0.1:7000/ https://example.com/

I tried with Firefox connecting to an Ncat HTTP proxy through the DNS tunnel, and it works.

Local testing

If you just want to see how it works, without going to the trouble of setting up a DNS zone or a network server, you can run both ends of the tunnel on localhost. This way uses plaintext UDP DNS, so needless to say it's not covert to use a configuration like this across the Internet. Because there's no intermediate resolver in this case, you can use any domain name you want; it just has to be the same on client and server.

./dnstt-server -gen-key -privkey-file server.key -pubkey-file server.pub
./dnstt-server -udp 127.0.0.1:5300 -privkey-file server.key t.example.com 127.0.0.1:8000
ncat -lkv 127.0.0.1 8000

./dnstt-client -udp 127.0.0.1:5300 -pubkey-file server.pub t.example.com 127.0.0.1:7000
ncat -v 127.0.0.1 7000

When it's working, you will see log messages like this on the server:

2020/04/20 01:48:58 pubkey 0000111122223333444455556666777788889999aaaabbbbccccddddeeeeffff
2020/04/20 01:49:00 begin session 468d274a
2020/04/20 01:49:03 begin stream 468d274a:3

And this on the client:

2020/04/20 01:49:00 MTU 134
2020/04/20 01:49:00 begin session 468d274a
2020/04/20 01:49:03 begin stream 468d274a:3

Caveats

A DoH or DoT tunnel is covert to an outside observer, but not to the resolver in the middle. If the resolver wants to stop you from using a tunnel, they can do it easily, by not recursively resolving requests for the DNS zone of the tunnel server. The tunnel is still secure against eavesdropping or tampering by a malicious resolver, though; the resolver can deny service but cannot alter or read the contents of the tunnel.

For technical reasons, the tunnel requires the resolver to support a UDP payload size of at least 1232 bytes, which is bigger than the minimum of 512 guaranteed by DNS. I suspect that most public DoH or DoT servers meet this requirement, but I haven't done a survey or anything.

I haven't done any systematic performance tests, but I've done some cursory testing with the Google, Cloudflare, and Quad9 resolvers. With Google and Cloudflare I can get more than 100 KB/s download when piping files through Ncat. The Cloudflare DoH resolver occasionally sends a "400 Bad Request" response (the tunnel client automatically throttles itself when it sees an unexpected status code like that). The Quad9 resolvers seem to have notably worse performance than the others, but I don't know why.

wkrp commented 4 years ago

For technical reasons, the tunnel requires the resolver to support a UDP payload size of at least 1232 bytes, which is bigger than the minimum of 512 guaranteed by DNS. I suspect that most public DoH or DoT servers meet this requirement, but I haven't done a survey or anything.

As of tag v0.20200426.0 in the source code, dnstt-server lets you control the maximum UDP payload size with the -mtu option. The default is -mtu 1232. You can use -mtu 512 for maximum compatibility with resolvers, at the expense of bandwidth. If you know the resolver you are using supports larger UDP payloads, you can increase the value. The Cloudflare resolver supports -mtu 1452, for example; but I don't recommend going higher than that because you start to risk IP fragmentation. If you use an -mtu value that is larger than what the resolver supports, the tunnel won't work at all and you will get error messages in the server log that tell you what value to use.
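
As a rough aid for choosing a value, the thresholds mentioned above (512, 1232, 1452) can be written down as a small helper. This is only a sketch: classify_mtu is a hypothetical name, the wording of the verdicts is mine, and the function does not probe any resolver.

```sh
# classify_mtu VALUE   (hypothetical helper, not part of dnstt)
# Prints a verdict for a candidate -mtu value, based only on the numbers
# quoted in this thread. Example server invocation it is meant to inform:
#   ./dnstt-server -udp :5300 -mtu 512 -privkey-file server.key t.example.com 127.0.0.1:8000
classify_mtu() {
    local mtu
    mtu=$1
    case $mtu in
        ''|*[!0-9]*) echo "not a number"; return 1 ;;
    esac
    if [ "$mtu" -le 512 ]; then
        echo "works with any resolver"           # 512 is guaranteed by DNS
    elif [ "$mtu" -le 1452 ]; then
        echo "resolver must support this size"   # includes the default, 1232
    else
        echo "risk of IP fragmentation"          # above the suggested ceiling
    fi
}
```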

wkrp commented 4 years ago

Download speed tests

I did some experiments on the download performance of the DNS tunnel. tl;dr a DNS tunnel can go faster than you may think, but the choice of resolver matters a lot.

I tried downloading a 10 MB file through the tunnel, using a selection of resolvers and DNS transports. I cut off the download after 10 minutes. "none" is the special case of no intermediate recursive resolver (the tunnel client sends queries directly to the tunnel server). The server was located in Fremont, US and the client in Tokyo, JP. There was about 100 ms of latency between the two hosts. Download rates are the median of 5 trials. The dnstt tag was v0.20200430.0. See below for source code, data, pcaps, etc.

Cloudflare's DoH and DoT resolvers are both fast. Google's DoH resolver is much faster than its DoT resolver (I noticed the DoT server terminating TCP connections every 200 KB or so). Comcast's DoH and DoT resolvers have about the same middling performance. Quad9's DoT resolver is notably slow; there's clearly something wrong there, whether it's the resolver or how the tunnel uses it. For comparison, the download rate of an untunneled, direct TCP transfer was 4666.3 KB/s.

resolver	transport	download rate
none	UDP	187.1 KB/s
Cloudflare	DoT	156.9 KB/s
Cloudflare	UDP	156.4 KB/s
Google	DoH	135.1 KB/s
Cloudflare	DoH	133.5 KB/s
Comcast	DoT	68.5 KB/s
Comcast	DoH	66.3 KB/s
Quad9	UDP	58.9 KB/s
Google	UDP	43.1 KB/s
PowerDNS	DoH	38.0 KB/s
Google	DoT	35.4 KB/s
Quad9	DoH	30.9 KB/s
Quad9	DoT	1.2 KB/s
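
To put the rates in the table into perspective against the 10 MB file and the 10-minute cutoff, here is a small arithmetic sketch. It assumes a perfectly steady rate and that 10 MB means 10240 KB; both are my assumptions, and the function names are made up for illustration.

```sh
# transfer_seconds SIZE_KB RATE_KBPS
# Seconds needed at a steady rate, rounded to whole seconds.
transfer_seconds() {
    awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.0f\n", size / rate }'
}

# finishes_in_time SIZE_KB RATE_KBPS CUTOFF_S
# Succeeds if the estimated transfer time is within the cutoff.
finishes_in_time() {
    [ "$(transfer_seconds "$1" "$2")" -le "$3" ]
}
```

For example, transfer_seconds 10240 156.9 gives about 65 seconds for the Cloudflare/DoT rate, whereas finishes_in_time 10240 1.2 600 fails for the Quad9/DoT rate. By this estimate, a steady rate below roughly 17 KB/s cannot complete the file before the cutoff.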

I repeated the experiment with iodine, an existing DNS tunnel. iodine works over plaintext UDP only. dnstt is faster than iodine in every case, except for the Quad9 DoT resolver. It is possible to run iodine over a DoH proxy; I didn't try that myself but Sebastian Neef reports 4–12 KB/s when tunneling iodine through dnscrypt-proxy.

resolver	transport	download rate
none	iodine	14.6 KB/s
Google	iodine	1.8 KB/s
Cloudflare	iodine	1.4 KB/s
Quad9	iodine	0.3 KB/s

This graph shows the 5 trials under each experimental condition and gives an idea of the variance. Steeper lines are better.

[Graph: dnstt-tests-20200430]

The source code for these experiments is available in the following repo. I used git-annex to store the data files (there are over 3 GB of pcap files). You will have to git annex get the data files you want after git clone. The .csv files are sufficient to reproduce the graph. See procedure.txt for the commands to run to reproduce the experiment. .keylog files are TLS secrets in NSS Key Log Format; you can use these in Wireshark to decrypt the DoH and DoT streams.

git clone https://www.bamsoftware.com/git/dnstt-tests.git
cd dnstt-tests
git annex get 2020-04-30/*.csv
Rscript graphs.R

Also posted at https://www.bamsoftware.com/software/dnstt/performance.html.

Update 2020-05-05: I updated the tables and figure to exclude a preliminary test run that I did not intend to include in the first place. The change did not affect any of the qualitative observations. The Cloudflare/DoH case increased by about 7 KB/s from 126.7 KB/s to 133.5 KB/s; none of the other cases changed by more than 3 KB/s.

wkrp commented 4 years ago

Web page

I've set up a web page for dnstt: https://www.bamsoftware.com/software/dnstt/

And I wrote notes on the protocol: https://www.bamsoftware.com/software/dnstt/protocol.html

ValdikSS commented 4 years ago

This utility works during the internet shutdown in Turkmenistan. It successfully establishes a direct UDP connection to the destination server (without using any public resolver) and reaches download speeds of up to 2 Mbit/s. Iodine is very unstable in these conditions, while dnstt is stable and fast.

wkrp commented 4 years ago

Shadowsocks plugin (proof of concept)

It would not be hard to adapt the dnstt code for a Shadowsocks plugin. Here I show Bash scripts that wrap dnstt-client/dnstt-server in a Shadowsocks plugin interface: https://gist.github.com/wkrp/0712b87ab095dd0f77c56b02646060b7

A more portable/permanent solution would be to fork the dnstt code and swap the command-line interface for a Shadowsocks plugin environment variable interface.

ghost commented 4 years ago

A more portable/permanent solution would be to fork the dnstt code and swap the command-line interface for a Shadowsocks plugin environment variable interface.

IMO a fork is not even necessary; we can add SIP003 support without breaking the command-line interface. Just check at startup whether the SIP003 environment variables exist. If they do, dnstt is running as a plugin and the command line can be ignored; otherwise it is running standalone and reads the command line normally. The same goes for other tunnel programs.
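
The startup check described above could look roughly like this in a wrapper script. This is a sketch, not the code from the gist and not anything dnstt does today: the function names are made up, and it only covers the client side. The variable names SS_REMOTE_HOST, SS_REMOTE_PORT, SS_LOCAL_HOST, and SS_LOCAL_PORT are the ones SIP003 defines.

```sh
# sip003_mode   (hypothetical helper)
# Prints "plugin" when all four SIP003 address variables are set and
# non-empty, otherwise "standalone".
sip003_mode() {
    if [ -n "${SS_REMOTE_HOST:-}" ] && [ -n "${SS_REMOTE_PORT:-}" ] &&
       [ -n "${SS_LOCAL_HOST:-}" ] && [ -n "${SS_LOCAL_PORT:-}" ]; then
        echo plugin
    else
        echo standalone
    fi
}

# client_listen_addr FALLBACK   (hypothetical helper)
# In plugin mode a client-side plugin listens on SS_LOCAL_HOST:SS_LOCAL_PORT;
# in standalone mode, use the address given on the command line.
# (An IPv6 SS_LOCAL_HOST would additionally need square brackets.)
client_listen_addr() {
    if [ "$(sip003_mode)" = plugin ]; then
        echo "$SS_LOCAL_HOST:$SS_LOCAL_PORT"
    else
        echo "$1"
    fi
}
```

A wrapper could then pass the result of client_listen_addr 127.0.0.1:7000 as the last argument to dnstt-client, so the same script works both with and without Shadowsocks.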

wkrp commented 3 years ago

Performance tuning, v1.20210803.0

I just released v1.20210803.0 of dnstt.

https://www.bamsoftware.com/software/dnstt/#download
git clone https://www.bamsoftware.com/git/dnstt.git

The main feature of this release is some parameter tuning for a small improvement in performance in some configurations. See the full post.

I'm working on Champa, a circumvention tunnel based on AMP cache. Like dnstt, Champa uses a Turbo Tunnel model, with KCP and smux as an inner session layer. While working on Champa, I discovered that adjusting some buffer and window sizes could greatly improve download performance. I suggested that the same idea might improve performance in Snowflake, and I spent some time experimenting to see if it could help dnstt as well.

In summary, I was able to improve download speeds, but only in some configurations, and only a little bit. I was encouraged in initial tests with plaintext UDP and without a recursive resolver, which I was able to make go quite fast, even over 1 MB/s. But this is a configuration we don't care about, because it's not covert. In a recommended configuration with a recursive resolver and an encrypted transport, I was really only able to speed up Cloudflare/DoT, by about 25%.

I started by re-running the experiment with v0.20200430.0, the version used in the previous round of tests, in order to have a fresh basis of comparison. Since then, the Comcast/DoT server ceased operation, and Cloudflare/UDP went from one of the fastest configurations to the slowest. I repeated the experiment with v1.20210803.0, which has the performance tweaks.

resolver	transport	v0.20200430.0	v1.20210803.0	change
none	UDP	186.0 KB/s	332.5 KB/s	+78.7%
Google	DoH	132.7 KB/s	134.6 KB/s	+1.4%
Cloudflare	DoT	88.9 KB/s	112.8 KB/s	+26.9%
Cloudflare	DoH	98.2 KB/s	97.4 KB/s	−0.7%
Comcast	DoH	75.2 KB/s	72.7 KB/s	−3.3%
Google	UDP	57.7 KB/s	70.4 KB/s	+22.0%
PowerDNS	DoH	35.6 KB/s	34.9 KB/s	−2.2%
Quad9	DoH	20.7 KB/s	31.0 KB/s	+49.4%
Quad9	UDP	47.5 KB/s	22.2 KB/s	−53.3%
Google	DoT	44.2 KB/s	14.4 KB/s	−67.5%
Quad9	DoT	0.9 KB/s	1.6 KB/s	+86.2%
Cloudflare	UDP	0.9 KB/s	0.8 KB/s	−4.6%
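
For anyone adding rows from their own runs, the change column is a plain relative difference. Below is a minimal sketch with a made-up function name; note that recomputing from the rounded rates printed in the table can differ from the published column by a tenth of a percent, presumably because that column was computed from unrounded data.

```sh
# pct_change OLD NEW
# Relative change from OLD to NEW as a signed percentage with one decimal.
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}
```

For example, pct_change 88.9 112.8 reproduces the Cloudflare/DoT row.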

The Google/DoT, Quad9/DoH, and Quad9/UDP rows need some comment. Looking at the second-by-second download rates, we see that in 2 out of 3 trials, Google/DoT was initially going somewhat faster in the new version than in the old version, but then stalled and made no further progress. This was caused by a TCP disconnection (which itself is not unusual when using the Google DoT resolver) followed by a failure to reestablish the connection due to a name lookup error. This could be made more robust, but it does not really bear on bandwidth measurements. In the old Quad9/DoH and the new Quad9/UDP graphs, in 2 of the 3 trials there is a pattern of the download making progress, then stalling, then making progress, then stalling, and so on. I don't know what may be causing this phenomenon, except to guess that it may be rate limiting on a subset of backend servers. In both cases, the 1 trial without the stop-and-start pattern has performance similar to that in the corresponding graph.

[Graph: dnstt-tests-20210802]

As before, I've made the test code and raw data available, so you should be able to reproduce the table and graph, or run your own experiments. You will need git-annex to download a subset of the data files.

git clone https://www.bamsoftware.com/git/dnstt-tests.git
cd dnstt-tests/2021-08-02
git annex get data/*/*.csv
Rscript graphs.R

alexandervlpl commented 2 years ago

@wkrp Thanks for your excellent work on this. While testing I ran into a very annoying problem on the client side: if connectivity is lost when using DoH, the tunnel dies but does not exit:

sendLoop: Post "https://cloudflare-dns.com/dns-query": context deadline exceeded
sendLoop: Post "https://cloudflare-dns.com/dns-query": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
sendLoop: Post "https://cloudflare-dns.com/dns-query": net/http: request canceled (Client.Timeout exceeded while awaiting headers)

All subsequent connections through the tunnel hang. Tried Google and Cloudflare DoH.

The tunnel does recover when using DoT or UDP, so I'm assuming this is some limitation of DoH. Unfortunately DoT is very slow/unreliable for me, and UDP is "not recommended" for obvious reasons.

Is it possible to add some network resilience to dnstt-client? Just a graceful exit would be nice. Any plans to accept pull requests?

wkrp commented 2 years ago

@alexandervlpl Thanks for testing and for the useful report.

My guess is that it's not the sendLoop errors that are the source of the failure to reconnect. Those are expected when there's a lack of connectivity.

Offhand, I don't have an idea as to why DoT recovers and DoH does not. In both cases, the local session should time out all streams after idleTimeout (2 minutes) without receiving any data from the server.

My first guess about what might be going wrong was that the server is timing out the client's session during the loss of connectivity, while the client still thinks it has a session and continues uselessly sending packets that the server thinks are not associated with any existing session. But the server's idleTimeout is 2 minutes as well, so the client should have stopped trying by that time.

How long did you wait after restoring the network connection? Try waiting 5 minutes. That should be enough to exceed any possible timeout. (The idleTimeout is nominally 2 minutes, but it can actually be up to 4 − ε minutes because of how smux checks the timeout only periodically.) If 5 minutes works, then we'll know what's going wrong and can try adjusting some parameters; if not, we'll know to look elsewhere. I encountered some pathology with DoT reconnection during the most recent round of big performance tests, and I investigated it a bit and made a note:

In -dot mode, if the TLS connection becomes disconnected and the redial then fails to connect, the result is "operation on closed connection" errors and a useless connection up until idleTimeout (2 to 4 minutes later), when the stream ends.
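
The "2 to 4 minutes" range can be illustrated with a toy model. It assumes (my reading of the remark about periodic checks, not a statement about smux internals) that a check fires once every idleTimeout seconds and closes the session only if no data arrived since the previous check. The function name is made up.

```sh
# idle_before_close TIMEOUT_S LAST_DATA_S   (toy model, hypothetical)
# Checks fire at TIMEOUT_S, 2*TIMEOUT_S, ... after the session starts.
# LAST_DATA_S is when the final data arrived. Prints how many seconds of
# silence pass before a check closes the session.
idle_before_close() {
    local timeout last tick
    timeout=$1
    last=$2
    # The first check at or after the last data still sees "data arrived".
    tick=$(( (last + timeout - 1) / timeout * timeout ))
    # The check after that one sees nothing new and closes the session.
    echo $(( tick + timeout - last ))
}
```

With a 120-second timeout, data that arrives just after a check (idle_before_close 120 121) leads to 239 seconds of silence before the close, while data that arrives exactly at a check (idle_before_close 120 120) leads to 120 seconds.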

There is a separate subforum for dnstt. That's currently the best place to send patches. You're also welcome to keep posting on this thread if that's more convenient.

https://ntc.party/c/community-software/dnstt/33

alexandervlpl commented 2 years ago

I tried waiting 5, 10 minutes and longer. It looks like dnstt-client keeps trying to reuse the old connection(s) forever:

handle: session e261cadc opening stream: io: read/write on closed pipe

On Linux, switching off my network connection for a minute (or even less) is enough to reproduce this.

My workaround is a little bash watchdog function that restarts dnstt-client when it logs "Client.Timeout" or "closed pipe". That seems to work reliably; I'm currently using it every day on all my devices for small amounts of traffic.
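
For readers who want something similar, here is a minimal sketch of such a watchdog. It is not the function described above, which was not posted in this thread. The restart limit, the WATCHDOG_DELAY variable, and the FIFO approach are all my own choices; only the two log patterns come from the thread.

```sh
# watchdog MAX_RESTARTS COMMAND [ARGS...]   (hypothetical sketch)
# Runs COMMAND, echoes its log output, and restarts it whenever a line
# contains "Client.Timeout" or "closed pipe". Stops when the command exits
# without such a line or after MAX_RESTARTS restarts.
watchdog() {
    local max restarts matched pid line dir
    max=$1
    shift
    restarts=0
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/log" || return 1
    while :; do
        # Run the command in the background, logging into the FIFO.
        "$@" >"$dir/log" 2>&1 &
        pid=$!
        matched=0
        while IFS= read -r line; do
            printf '%s\n' "$line"
            case $line in
                *Client.Timeout*|*"closed pipe"*) matched=1; break ;;
            esac
        done <"$dir/log"
        # Stop the old process (it may already have exited) before restarting.
        kill "$pid" 2>/dev/null
        wait "$pid" 2>/dev/null
        if [ "$matched" -eq 0 ] || [ "$restarts" -ge "$max" ]; then
            break
        fi
        restarts=$((restarts + 1))
        sleep "${WATCHDOG_DELAY:-1}"
    done
    rm -rf "$dir"
}
```

It would be used by prefixing the normal client command, for example: watchdog 1000 ./dnstt-client -doh https://doh.example/dns-query -pubkey-file server.pub t.example.com 127.0.0.1:7000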

Happy to post instructions and results on the forum when they're ready.

ghost commented 1 year ago

@wkrp @ValdikSS

I wonder if there is a simple guide for non-specialists on how to set up DNSTT? And I also wonder if DNSTT needs any specific client on Windows and Android OS?

alexandervlpl commented 1 year ago

@alidxdydz On Android you can install the Termux app from Play or F-Droid and follow the same Linux instructions; this worked very well for me. Since this is Go, it should compile and "just work" on Windows as well. Can anyone confirm? There are no nice client apps, I'm afraid.

wkrp commented 1 year ago

And I also wonder if DNSTT needs any specific client on Windows and Android OS?

Unfortunately there is no ready-to-use official mobile client app. If you search the Play Store for "dnstt" you will see some VPN apps come up. I cannot comment on their quality or trustworthiness, but I reverse engineered one once and found that they were using the upstream dnstt code, without any changes, such that an unmodified client could connect to the VPN (if you added in the extra layer of SSH authentication they were using).

net4people / bbs