yarrick / iodine

Official git repo for iodine dns tunnel
https://code.kryo.se/iodine
ISC License

iodine performance improvements and features #16

Open frekky opened 9 years ago

frekky commented 9 years ago

Overview

This fork of iodine is intended primarily to improve performance by using a TCP-like sliding window protocol that keeps multiple fragments "in flight" both upstream and downstream. This greatly increases performance on high-latency connections. To make this possible, the whole data/ping structure has been changed (details are in doc/proto_00000800.txt).

Some limited testing has been conducted, the results of which can be found in the updated man page.

This has been almost fully tested on Linux amd64 and compiles without warnings; however, no other platforms have been tested yet. Due to some hacks to get millisecond timer precision on Windows (see windows.h for gettimeofday() and struct timeval macros), various functionality may not work as expected there.

Unit tests have been updated to suit changes to the main code base, and a basic sliding window test was created which tests some of the essential functions.

Issues

This fork is still in development, and I plan to keep it up to date with the main iodine repository as much as possible. There are probably lots of currently undiscovered bugs and certainly lots of problems with intolerant DNS servers which cause performance and connectivity issues.

To help diagnose these problems, I strongly recommend that you try -V 5 to print connection statistics such as the number of queries per second, fragments lost, failures, timeouts, round-trip time etc.

Most of the important feature additions are listed here.

I may have forgotten to mention some features here, but this should cover most of them.

Protocol Overview

Due to the nature of the sliding window protocol, the entire data transfer protocol needed to be rewritten. The new protocol (800) is detailed in the docs, and although the basic DNS encapsulation is the same, the headers have been more-or-less completely rearranged. Upstream and downstream are functionally equivalent at the sliding window layer, where new data packets (ie from tun device on either client or server) are treated as follows:

  1. Data is optionally compressed (depending on user-specific upstream/downstream compression flags)
  2. Raw or compressed data is then split into a number of fragments depending on the user's maximum fragsize (calculated beforehand during the handshake process)
  3. Each fragment is added to the outgoing window buffer (same for both downstream and upstream) and assigned a unique sequence ID from 0 to 255. The window buffer maintains a pointer to the current fragment which is the "start" of the sending window, and while sending fragments, only the windowsize number of fragments are sent in order from the fragment at the start of the window.
  4. The fragments are sent in order from the start of the window as described above.
  5. When the fragments are received at the other end, they are placed in the receiving window buffer at an offset determined by their sequence ID. This way, out-of-order fragments (very common with load-balanced DNS servers) can be easily handled without dropping them.
  6. The receiving end checks whether it has received the starting fragment, the final fragment and all the in-between fragments. If it has, the full data packet is retrieved and the pointer to the start of the next received chunk is moved forwards by the number of fragments.
  7. The received full packet is optionally uncompressed and sent to the tun device.
  8. The receiving end immediately ACKs the fragment using its sequence ID using either a ping or a data packet (both have space for an ACK).
  9. When the ACK is received at the sending side for a fragment, it is marked off in the sending buffer as having been successfully received by the other end and based on this the window can be moved forward and the next few fragments sent.
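The steps above can be sketched roughly as follows. This is a minimal Python model for illustration, not the fork's C code: the names `split_fragments`, `SendWindow` and `ReceiveWindow` and the `is_last` flag used to mark packet boundaries are assumptions. Only the 0 to 255 sequence IDs, the windowsize limit, the placement of received fragments by sequence ID and the per-fragment ACKs come from the description.

```python
import zlib

SEQ_MOD = 256  # sequence IDs run from 0 to 255, then wrap around


def split_fragments(data, fragsize, compress=False):
    """Steps 1-2: optionally compress, then cut into fragsize-byte pieces."""
    if compress:
        data = zlib.compress(data)
    chunks = [data[i:i + fragsize] for i in range(0, len(data), fragsize)] or [b""]
    # Each entry is (payload, is_last); is_last marks the end of a packet.
    return [(chunk, i == len(chunks) - 1) for i, chunk in enumerate(chunks)]


class SendWindow:
    def __init__(self, windowsize):
        self.windowsize = windowsize
        self.start = 0      # sequence ID at the start of the sending window
        self.next_seq = 0   # sequence ID for the next queued fragment
        self.frags = {}     # seq -> (payload, is_last), not yet slid past
        self.acked = set()

    def queue(self, fragments):
        """Step 3: give every fragment the next sequence ID."""
        for frag in fragments:
            if self.next_seq in self.frags:
                raise BufferError("window buffer full")
            self.frags[self.next_seq] = frag
            self.next_seq = (self.next_seq + 1) % SEQ_MOD

    def sendable(self):
        """Step 4: unACKed fragments within windowsize of the window start."""
        out = []
        for offset in range(self.windowsize):
            seq = (self.start + offset) % SEQ_MOD
            if seq in self.frags and seq not in self.acked:
                out.append((seq,) + self.frags[seq])
        return out

    def ack(self, seq):
        """Step 9: mark as received, then slide past leading ACKed fragments."""
        if seq in self.frags:  # ignore ACKs for fragments we no longer hold
            self.acked.add(seq)
        while self.start in self.acked:
            self.acked.discard(self.start)
            del self.frags[self.start]
            self.start = (self.start + 1) % SEQ_MOD


class ReceiveWindow:
    def __init__(self):
        self.start = 0   # sequence ID where the next packet begins
        self.frags = {}  # seq -> (payload, is_last)

    def receive(self, seq, payload, is_last):
        """Step 5: store by sequence ID, so arrival order does not matter."""
        self.frags[seq] = (payload, is_last)
        return seq  # step 8: the caller ACKs this sequence ID

    def reassemble(self):
        """Step 6: return a full packet once every fragment is present."""
        seq, parts = self.start, []
        for _ in range(SEQ_MOD):
            if seq not in self.frags:
                return None  # a fragment is still missing
            payload, is_last = self.frags[seq]
            parts.append(payload)
            seq = (seq + 1) % SEQ_MOD
            if is_last:
                break
        else:
            return None  # no packet boundary found in the whole buffer
        for i in range(len(parts)):
            del self.frags[(self.start + i) % SEQ_MOD]
        self.start = seq
        return b"".join(parts)
```

In this model the caller feeds each tuple from `sendable()` to `receive()` on the other side, passes the returned sequence ID back to `ack()`, and calls `reassemble()` after each fragment. Step 7 (decompression, for example with `zlib.decompress`) is left to the caller.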

Other Information

Any other information is available in the code (I've put in a reasonable amount of hopefully helpful comments so it shouldn't be too hard to understand).

Feel free to ask any questions or make comments on any of the changes. I've done quite a lot of refactoring to clarify various parts of the code or make things simpler.

Thanks for all the great work in making something like iodine, and thanks again for making it open source. It's truly been a pleasure working with it and I hope to be able to contribute something to this project.

Anime4000 commented 9 years ago

I tried to compile your version under MinGW64 with no luck, so I tested on my Linux VM (VirtualBox, Ubuntu 14.04.3 64-bit).

From what I see, your version of iodine is not as smooth, but it can achieve higher downstream throughput...

Is it possible to host multiple topdomains? Let's say the server has multiple WAN IPs.

electricarrows0 commented 9 years ago

You should have added ticket bytes and the like to increase downstream. That way there are more queries: say the data is split and sent to each ticket, then you can query faster, right? Send it all like data.1.domain.com, data.2.domain.com and so on, and add the answers as IP addresses, like 3 numbers each, all valid as IP addresses but not valid in a query, as a way to get around iodine queries getting blocked. I had my queries blocked so fast it was unbelievable; I could not connect to my server. This would be just an experiment, because it probably would not get done in the official git.

frekky commented 9 years ago

@electricarrows0 There are lots of new options which can change the behaviour of the program. Perhaps the defaults are a little ridiculous (with a windowsize of 8, sometimes up to 8 queries per second while idle with low DNS server timeouts). To reduce the number of timeouts and other errors that cause connectivity issues, try something like iodine -w 1 -W 1 -I 1, and if you're interested in more details try using -V 5 to get a useful statistics report every 5 seconds.

In case you were using an old iodine command, the iodined topdomain goes first now, followed by any number of DNS servers (used in round-robin). If you specify multiple DNS servers you may reduce the load on each server, which may help if your DNS servers drop queries under high load. I also recommend using iodined -c in case you have BADIP errors, since using multiple DNS servers will most likely lead to different source addresses being seen by iodined.

@Anime4000 As soon as I have time I'll get a MinGW build environment set up and make sure Windows compatibility is working properly.

If you're having issues with DNS servers behaving strangely or producing lots of errors, try using the above mentioned options (ie. reducing the upstream/downstream window size and the target timeout) and use -V to show more stats which will be helpful if you wish to find the best values for certain options (probably target timeout -I or up/downstream window sizes). If possible, try setting all the "Fine tuning options" and the "options to try if connection doesn't work" manually to low values and increase them while testing to see if you get better results.

Finally, in terms of connection stability, use the -V option to see your connection round-trip time and try setting the downstream fragment timeout -j to something smaller, which can help if you normally have low ping times but frequent large spikes (this is probably caused by packet loss; again, check the stats using -V to see all that useful information).

At this point in time, connecting to multiple iodined domains would be quite tricky, unless they were all using the same iodined server (in which case it would be relatively simple). I'll consider adding that as a feature in this fork later. Using multiple iodined servers would be harder since the client would have to login and somehow load balance between all of them at the IP level (not likely feasible).

electricarrows0 commented 9 years ago

@frekky Well, I'm saying it adjusts as the downstream or upstream increases by creating ticket bytes, or to have more users on the same DNS IP, and so on.

frekky commented 9 years ago

@electricarrows0 I'm not completely sure about what you're suggesting but if it was to increase the number of pending queries when more downstream data is available and use only a single query when idle, that would be quite a useful feature to reduce DNS load.
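One hypothetical way to express that idea, as a sketch only: the function name and the `downstream_backlog` input (a count of downstream fragments waiting to be sent) are assumptions, and nothing like this is confirmed to exist in the fork.

```python
def target_pending_queries(downstream_backlog, windowsize):
    """Return how many queries to keep pending at the server.

    downstream_backlog is a hypothetical count of downstream fragments
    waiting to be sent; windowsize is the configured upper limit.
    """
    # Never drop below one pending query (idle case) and never exceed
    # the window size, however much data is waiting.
    return max(1, min(windowsize, downstream_backlog))
```

With a window size of 8 this keeps a single query pending while idle and scales up to 8 as data queues up.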

In terms of having multiple users connected to iodined using the same internal IP (such as on the tun device), the issues with load balancing would be quite tricky to handle especially without separating TCP connections etc. The purpose of using a sliding window protocol in this case was to prevent the need of using multiple iodine connections at once (and load balancing between them) and use only a single client with higher throughput.

If you're having issues with certain types of requests being blocked, try changing the DNS type to something else like SRV, CNAME, MX or A (using iodine -T option).

Could you elaborate on what you mean by "ticket bytes"?

Thanks

electricarrows0 commented 9 years ago

@frekky You can find the ticket bytes idea here: http://heyoka.sourceforge.net/ http://heyoka.sourceforge.net/heyoka-shakacon2009.pdf It is basically meant for a slave server, but even better than doing that would be to use it just to speed up the connection by sending extra question queries.

electricarrows0 commented 9 years ago

@frekky You could add spoofing to protect the DNS tunnel from being detected.

Anime4000 commented 9 years ago

iOS, iodine 0.6 on a cellular network: at my end it is most stable around 24KB/s (3G, H+ signal).

Back to Ubuntu: using this fork I can get more speed (-w 128 -W 128 and no compression), but when I wget a 4MB test file, iodine stops responding halfway through... :trollface:

I found out that using CloudFlare DNS management is a bad idea, so I set up the DNS server and iodine on the same server.

Under CloudFlare: 2KB/s ~ 5KB/s
Under self-hosted DNS: 24KB/s ++

yarrick commented 9 years ago

This looks interesting, I will review it when I get the time!

Anime4000 commented 9 years ago

@frekky Is err.h the OpenSSL include\openssl\err.h file? My mingw64 doesn't have this file; can you send err.h?

frekky commented 9 years ago

Turns out there already was a fix for those, I'd just forgotten to not include err.h when compiling for win32.

Anime4000 commented 9 years ago

@frekky Can you check src/iodined.c at line 31? It should be:

#ifndef WINDOWS32
#include <err.h>
#endif

Anime4000 commented 9 years ago

@frekky To be even more awesome, add routes natively :+1: Currently I use a script to do that:

SERVER_IP → GATEWAY_IP (Direct VPN)
DNS_IP → GATEWAY_IP (DNS Tunnel)

This way, iodine won't exit when VPN sessions run under it.

iOS script (captures the current gateway and DNS):

route -n add -net $SERVER_IP $GW_IP
route -n add -net $DNS_IP $GW_IP

Windows script (captures the gateway IP):

route add %SERVER_IP% mask %MASK% %GW_IP%
route add %DNS_IP% mask %MASK% %GW_IP%

cpatulea commented 9 years ago

Builds on Mac OS X Yosemite 10.10.15 x64.

One compile warning:

CC client.c
client.c:574:25: warning: comparison of unsigned expression >= 0 is always true [-Wtautological-compare]
        if (send_query_sendcnt >= 0 && send_query_sendcnt < 100 &&

cpatulea commented 9 years ago

With 'make debug', looks like Clang doesn't support -Og:

$ make debug
OS is DARWIN, arch is x86_64
CC tun.c
error: invalid integral value 'g' in '-Og'
error: invalid integral value 'g' in '-Og'

cpatulea commented 9 years ago

This has also uncovered a host of format string warnings:

window.c:83:3: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
                WDEBUG("Resizing window buffer with things still in it! This will cause problems!");
                ^~~~~~~~~~~~~~~~~~~
./window.h:65:3: note: expanded from macro 'WDEBUG'
                TIMEPRINT("[WINDOW-DEBUG](%s:%d) ", __FILE__, __LINE__);\
                ^~~~~~~~~~~~
./common.h:90:55: note: expanded from macro 'TIMEPRINT'
                fprintf(stderr, "%03ld.%03ld ", currenttime.tv_sec, currenttime.tv_usec / 1000);\

window.c:147:4: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
                        WDEBUG("Dropping frag with seqID %u: not in window (%u-%u)", f->seqID, startid, endid);
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./window.h:65:3: note: expanded from macro 'WDEBUG'
                TIMEPRINT("[WINDOW-DEBUG](%s:%d) ", __FILE__, __LINE__);\
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./common.h:90:55: note: expanded from macro 'TIMEPRINT'
                fprintf(stderr, "%03ld.%03ld ", currenttime.tv_sec, currenttime.tv_usec / 1000);\
                                       ~~~~~                        ^~~~~~~~~~~~~~~~~~~~~~~~~~

cpatulea commented 9 years ago

ssh <server_ip> cat /dev/urandom: 1.94 MiB/s
iodine <server_ip> (raw mode): 1.87 MiB/s, iodined CPU ~27%
iodine -r <server_ip> (
  DNS mode,
  Switching upstream to codec Base128,
  Switching server options: lazy mode, downstream codec Raw, compression enabled...,
  Setting downstream fragment size to max 1186...
  Determined round-trip time of 183 ms, server timeout of 4817 ms
): 400 KiB/s, iodined CPU ~8%
iodine -r <isp_dns_ip>:
  Opened utun0
Opened IPv4 UDP socket
Sending DNS queries for <domain> to <isp_dns_ip>
Using DNS type TXT queries
Version ok, both using protocol v 0x00000800. You are user #1
Setting IP of utun0 to 10.0.53.3
Adding route 10.0.53.0/24 to 10.0.53.3
add net 10.0.53.0: gateway 10.0.53.3
Setting MTU of utun0 to 1130
Server tunnel IP is 10.0.53.1
Skipping raw mode
Using EDNS0 extension
Switching upstream to codec Base128
Server switched upstream to codec Base128
Autodetecting downstream codec (use -O to override)
Switching server options: lazy mode, downstream codec Raw, compression enabled...
Switched server options successfully. (rlc)
Autoprobing max downstream fragment size... (skip with -m fragsize)iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.768 not ok.. iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.384 not ok.. iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.192 not ok.. iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.96 not ok.. iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.48 not ok.. iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.iodine: Got FORMERR as reply: server does not understand our request
.24 not ok.. iodine: Got FORMERR as reply: server does not understand our request
.iodine: Too many error replies, not logging any more.
..12 not ok.. ...6 not ok.. ...3 not ok.. ...2 not ok.. iodine:
found no accepted fragment size.
iodine: try setting -M to 200 or lower, or try other -T or -O options.

So my ISP's DNS somehow rejects the new protocol. I tested master and it still works.
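The sizes in the log above (768, 384, 192, ... 3, 2) are what a halving search produces when every probe fails. Here is a minimal sketch of such a search, assuming a hypothetical `accepts(size)` callback that stands in for sending a probe and checking the reply. It is written to match the sequence in the log and is not taken from the iodine source.

```python
def probe_fragsize(accepts, start=768):
    """Search for the largest accepted fragment size by halving the step.

    accepts(size) is a hypothetical stand-in for "send a probe of this
    size and see whether a valid reply comes back".
    Returns (best_size, sizes_tried); best_size is 0 if nothing worked.
    """
    proposed = start
    step = start
    best = 0
    tried = []
    while step > 0:
        tried.append(proposed)
        ok = accepts(proposed)
        if ok:
            best = proposed
        step //= 2
        # Move up after a success, down after a failure.
        proposed = proposed + step if ok else proposed - step
    return best, tried
```

When every probe fails, as with the FORMERR replies above, `probe_fragsize(lambda size: False)` tries exactly the sizes printed in the log and returns 0, which corresponds to "found no accepted fragment size". With a path that accepts anything up to 1186 bytes, the size reported in the direct test, the same search settles on 1186.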

Anime4000 commented 9 years ago

In mingw64 (GCC 5.1) it prints a lot of errors and the iodine server cannot be compiled. Compile Log Here!

frekky commented 9 years ago

@Anime4000 You will need Git CLI installed and available in the system path. Make sure you can run "git" in a normal command prompt before trying to build. Alternatively, in src/Makefile, change the line

HEAD_COMMIT = `git rev-parse --short HEAD`

to

HEAD_COMMIT = "iodine git"

Anime4000 commented 9 years ago

@frekky Nice! I just tested DigitalOcean (Ubuntu Server) <> home (Windows client) and I can get ~224KB/s, which is almost 2Mbps, using the default primary and secondary DNS provided by my ISP.

Don't use CloudFlare DNS. I tried it and CloudFlare blocks multiple requests; you need a self-hosted DNS server like BIND9.

Anime4000 commented 8 years ago

I have a suggestion: can you add a --preset option for high speed and low latency modes?

frekky commented 8 years ago

@Anime4000 Good idea! Only problem is that I don't really know what works as "high speed/low latency" considering that depends entirely on your internet connection. Could you perhaps post what works for you under various conditions?

Anime4000 commented 8 years ago

In my tests I used -M 100 -w 128 -W 128 and got nearly 2Mbps, at the cost of a lot of dropped packets. I tried -w 256 -W 256 and got no data...

Maybe --preset could make iodine negotiate the ideal values for -w/-W, or any other args, for high bandwidth/throughput.

A low latency mode would be ideal for messenger or voice apps, which I often use on a smartphone.

Anime4000 commented 8 years ago

@frekky Is it possible to add a localhost listener for non-root access, without a tun device?

frekky commented 8 years ago

@Anime4000 I was planning to add something like that when I get time. Probably would be more like SSH ProxyCommand compatibility where data is from stdin/stdout, although doing something similar to SSH -L or -R options would also be useful. At any rate it makes sense to let something like SSH handle compression, encryption and data transfer (ie using SOCKS proxy with -D or local/remote forwards, even as a tun device) rather than trying to implement the same manually, so I'll most likely just implement the stdin/stdout pipe functionality with the server end connecting to a specified local port.

This also would reduce overhead and increase throughput significantly since SSH would then be able to run without the IP or TCP overhead in the tun device and would leave all flow control to iodine itself.

frekky commented 8 years ago

@Anime4000 There is now a --preset or -Y option, so that way it's easier to use appropriate values for various situations without having to find out what all the options mean. At this point the presets aren't configurable except by modifying iodine.c.

Would you be able to test with window sizes less than 32?

As it turns out, the server actually doesn't process more than 31 pending requests (unless you've changed QMEM_LEN in server.h to a higher number). I'd be interested to see if you get any performance boosts from that.

Anime4000 commented 8 years ago

@frekky I tried the latest commit and it is not working: no download stream is received. But the old commit works flawlessly.

What changed between the old and new commits?

frekky commented 8 years ago

@Anime4000 Quite a lot changed, now working towards adding iodine ProxyCommand mode and modifying some parts of the protocol. Stick to using an old commit if it worked better since it might be a while before it works (at all) again.

Anime4000 commented 8 years ago

Can you make a preset for the old commit? Also, is it possible to open more IPv4 UDP sockets instead of 3 to increase speed? Another question: is the -m value the one before or after subtracting 6? 1176-6=1170

UPDATE: After trying the new commit, I need to find the right -M value; currently I am using 250.

frekky commented 8 years ago

@Anime4000 One of the reasons so much has changed is to introduce presets with a global client/server "instance".

Opening more UDP sockets wouldn't be useful at all I'm afraid. It would be better to run multiple instances of iodine simultaneously and load-balance between them somehow, although that was entirely the reason I modified iodine in the first place (to avoid doing that).

-m sets the max downstream fragsize, which includes the 6 or so bytes used for the header, so the actual size of the encoded data is the -m value minus the size of the packet header (probably more than 6 bytes anyway).
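As a rough illustration of that arithmetic, the sketch below estimates how many downstream fragments one packet needs. The function name is made up for this example, and `header_len=6` is only the "6 or so bytes" guess from above; the real header may well be larger.

```python
import math


def fragments_needed(packet_len, max_fragsize, header_len=6):
    """Estimate downstream fragments for one packet.

    header_len=6 is an assumption based on the "6 or so bytes" above.
    """
    payload = max_fragsize - header_len  # bytes of data per fragment
    if payload <= 0:
        raise ValueError("max_fragsize must be larger than the header")
    return max(1, math.ceil(packet_len / payload))
```

Under these assumptions, -m 1176 leaves 1170 bytes per fragment, so a 1130-byte packet fits in a single fragment, while -m 200 would need 6 fragments for the same packet.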

Anime4000 commented 8 years ago

For -m: after seeing the best value 1176-6=1170, next time I should pass -m 1176, right? One more thing: I tried changing QMEM_LEN to values from 48 to 128 and saw no benefit, only a speed drop, so I think 32 is the best value for most DNS servers.

yipperr commented 8 years ago

@Anime4000 The speeds you're reporting are really impressive. Are you sure you're using iodine with the ISP DNS server as a relay, as opposed to a direct UDP connection to your DigitalOcean server running iodined? If this is indeed through the ISP DNS server being used as a relay, that is some serious speed.

Anime4000 commented 8 years ago

The run script I use is:

@echo off
@title frekky modification :)
cd "%~dp0"

:: This var useful when do route add later on
SET DNS1=isp dns
SET DNS2=isp dns

SET HOST=server ip
SET MASK=255.255.255.255

for /f "tokens=2,3 delims={,}" %%a in ('"WMIC NICConfig where IPEnabled="True" get DefaultIPGateway /value | find "I" "') do set GATE=%%~a

:: Run baby!
iodine -frP notpassword -M 250 -w 32 -W 64 -c 0 -C 0 x1.hitoha.moe %DNS2% %DNS1%

timeout /t 3

DNS ISP: found by running getprop net.dns1 and getprop net.dns2 in an Android terminal
Domain provider: dynadot.com
DNS record: namecheap freedns
Iodine: DigitalOcean (Singapore)

I tested with my phone on USB tethering; HSPA+ or 4G/LTE works best.

frekky commented 8 years ago

At this point it is worth noting that the last stable and working commit was ac60bf6. Some more new features are currently under development and it's about half finished, so many things don't quite work yet. EDIT: last working commit was a bit earlier than that one, sorry. ac60bf6 does indeed compile.

yipperr commented 8 years ago

@frekky @Anime4000 thanks for the info

yipperr commented 8 years ago

@frekky I'm curious: are you able to reproduce the same speeds as Anime4000 got in his test (over 2mbps - 4mbps), going by the screenshots he posted? I can't help but wonder whether iodine is resorting to a direct UDP connection to the iodined server he is running to get these speeds. By default iodine checks whether the ISP's firewall allows a direct connection before falling back to DNS relay (I believe in this case his ISP just happens to allow this).

Anime4000 commented 8 years ago

@yipperr I checked in Wireshark and it really uses DNS queries :+1: You need -r to avoid a direct connection. I tested many ISPs at my end: the ISP on my phone has very good DNS, while the landline ISP I use right now has very bad DNS. You can try 1.9.1.9.

However, you need to adjust -M and run a download test to find the right value; it may differ for each ISP DNS and with the length of your domain. Try to get a very short domain like good.io.

yipperr commented 8 years ago

@Anime4000 If it were using a direct UDP connection to your iodined server, the packets would still show up as DNS packets in Wireshark. But since you mentioned you're using the -r flag and get different speeds when changing the DNS server, you do seem to be running over DNS relay with the local DNS server. Just out of curiosity, does your mobile ISP allow you to open a UDP socket on port 53 to any server besides the ISP's DNS server when you have zero balance and no mobile plan activated?

Anime4000 commented 8 years ago

@yipperr I don't know how the mobile ISPs are set up. In my country, all mobile providers want to steal customer credit, so everyone has HSPA+ or 4G/LTE, and these signals are always on no matter what.

My setup is: Android phone → USB tethering → computer

I call iodine with the mobile ISP's primary and secondary DNS (to find the DNS IPs, run getprop net.dns1 and getprop net.dns2 in an Android terminal).

The data is layered like this: phy(iodine(dante(data))), where dante is a SOCKS5 proxy.

yipperr commented 8 years ago

@Anime4000 What speed are you getting from your landline ISP's DNS server? I only managed to get 23KB maximum on my home broadband line. Did the mobile DNS servers give you the most speed (4mbps)?

Anime4000 commented 8 years ago

@yipperr Speed on my landline is 40KB/s to 88KB/s and can sometimes reach 100KB/s with -M 100; if I use 110 or above, I get no data.

Meanwhile cellular (HSPA+) can max out at ~7.2Mbps (~800KB/s) with -M 200; if I use 201 or above, Internet Download Manager won't start :cry:

@frekky Can you add a function to find the right -M value for a specific DNS IP? I had to test from 255 down to 100 just to find the right value... Maybe do a 1MB null download test; it might look like this:

iodine --test 1.9.1.9 --test-size 2M --test-timeout 3
Testing 1.9.1.9 with 255 upstream hostnames 0b/s
Testing 1.9.1.9 with 254 upstream hostnames 0b/s
...
Testing 1.9.1.9 with 200 upstream hostnames 123KB/s
...
Best upstream hostnames for 1.9.1.9 is 200, use -M 200 on this network

Also, what do upstream hostnames do? Does it send 255 hostnames at a time, or a hostname 255 characters long?

frekky commented 8 years ago

@Anime4000 The upstream hostname is the full domain name that iodine sends as a DNS query, such as paabbb123hhajqwolk.mydnstunnel.com which includes the topdomain specified to both iodine and iodined, the encoded iodine protocol headers and the encoded data.

Changing the -M option sets the maximum length of the full domain name used in upstream DNS requests and it seems that some DNS servers are fussy about longer domain names.
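To see why the topdomain length matters, here is a rough estimate of how many characters are left for encoded headers and data in one query name. It assumes -M counts the full textual hostname including the topdomain, and it uses the standard DNS limit of 63 bytes per label; the function names are made up for this sketch and the way iodine itself accounts for dots and headers may differ.

```python
import math


def upstream_data_chars(max_hostname_len, topdomain, label_len=63):
    """Characters available in front of ".topdomain" in one query name."""
    room = max_hostname_len - len(topdomain) - 1  # 1 for the joining dot
    if room <= 0:
        return 0
    # n characters split into labels of at most label_len need
    # ceil(n / label_len) - 1 separating dots; find the largest n that fits.
    n = room
    while n > 0 and n + math.ceil(n / label_len) - 1 > room:
        n -= 1
    return n


def upstream_capacity_bytes(max_hostname_len, topdomain, bits_per_char):
    """Raw bytes those characters can carry, e.g. 5 bits for Base32,
    7 bits for Base128. Protocol headers are not subtracted here."""
    return upstream_data_chars(max_hostname_len, topdomain) * bits_per_char // 8
```

With -M 100, the 13-character topdomain x1.hitoha.moe leaves 85 characters under these assumptions, while the 15-character mydnstunnel.com leaves 83, so every character saved in the domain goes to the payload.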

At some point I'll work on adding some way of autodetecting various options which affect latency and throughput, which would also have to include determining which options work with certain DNS servers. DNS query type, upstream/downstream encoding and downstream fragsize are already probed during normal startup so it would be possible to test more options there.

Anime4000 commented 8 years ago

@frekky So this means getting a shorter domain name is the best option, right? It allows a larger payload...

justwilliambrown commented 8 years ago

@Anime4000 which version are you using to get those speeds?

Anime4000 commented 8 years ago

@justwilliambrown The latest commit. You need to adjust -M; mostly -M 100 works just fine, and higher means faster. It also depends on how the ISP has set up its DNS and how busy the DNS server is.

frekky commented 8 years ago

@Anime4000 Yes, shorter domain names mean that there is more space for data, which is especially useful for DNS servers that only allow hostnames up to 100 bytes or below.

yipperr commented 8 years ago

@Anime4000 You mentioned you were using BIND9 on your server to manage the DNS records locally, and then you seem to have moved to freedns from namecheap. Which was the faster option in your experience: BIND9 or freedns from namecheap?

Anime4000 commented 8 years ago

@yipperr I use freedns from namecheap. Putting BIND9 and iodine on the same server is likely to cause errors, so it is better to put them on different servers.

yipperr commented 8 years ago

@Anime4000 Thank you

yipperr commented 8 years ago

@frekky @Anime4000 In both of your tests, does your ISP allow you to set up a tunnel with the iodined server directly? With the new sliding window protocol rewrite by frekky and an ISP that allows a direct connection to an outside IP speaking the new protocol, there would be a significant latency and speed improvement.