shadowsocks / shadowsocks-rust

A Rust port of shadowsocks
https://shadowsocks.org/
MIT License

peculiar behavior of DNS resolver (MIGHT be a route issue but might be a shadowsocks DNS resolver issue) #1268

Open LindaFerum opened 1 year ago

LindaFerum commented 1 year ago

Long story short: I'm running Wireguard through Shadowsocks using the TUN feature (I know it's experimental).

It works splendidly: it is super fast and bypasses local internet filtering.

I have dnscrypt-proxy running on 127.0.0.1 as UID 13, excluded from Wireguard with this rule:

sudo ip rule add uidrange 13-13 table 6969 priority 2

For some mysterious reason, using fwmark did not help exclude the DNS traffic from Wireguard, but this rule did (confirmed by dnscrypt's logs).

Now, the problem: for some reason, shadowsocks cannot reach 127.0.0.1 when Wireguard is up, which makes restarting the connection fussy and problematic.

That is, when wg0 is already up and configured and all routes are set, restarting shadowsocks like this works:

sudo /home/user/sslocaltun -vvvv -U --protocol tun --tun-interface-name tus0 -s MY.SERVER.S.IP:443 -m aes-128-gcm -k "my-fat-password" --udp-timeout 700 --tcp-keep-alive 100 --worker-threads 4 --dns udp://127.0.0.1:53 --outbound-bind-interface eth0

works great, everything flies!

BUT restarting shadowsocks like this DOES NOT WORK at all:

sudo /home/user/sslocaltun -vvvv -U --protocol tun --tun-interface-name tus0 -s serverdomain.com:443 -m aes-128-gcm -k "my-fat-password" --udp-timeout 700 --tcp-keep-alive 100 --worker-threads 4 --dns udp://127.0.0.1:53 --outbound-bind-interface eth0

It just keeps hopelessly trying to resolve the address and gets empty responses.

eth0's IP is 10.137.5.17

I modified dnscrypt's config to listen both on 127.0.0.1:53 AND 10.137.5.17:9999

And then,

sudo /home/user/sslocaltun -vvvv -U --protocol tun --tun-interface-name tus0 -s serverdomain.com:443 -m aes-128-gcm -k "my-fat-password" --udp-timeout 700 --tcp-keep-alive 100 --worker-threads 4 --dns udp://10.137.5.17:9999 --outbound-bind-interface eth0

works like a charm

sudo ip rule show

0:  from all lookup local 
1:  from all lookup main suppress_prefixlength 0 #courtesy of wireguard's recommendation page
2:  from all uidrange 13-13 lookup 6969  #this one excludes DNS from wireguard
4:  from all fwmark 0x1e59 lookup 6977 # this one sends wireguard's connections to tus0 for processing by shadowsocks, works!
15: not from all fwmark 0x1e59 lookup 7777 # this one sends the rest of connections (local and forwarded) to wireguard, works! 
32766:  from all lookup main #didn't touch this one at all
32767:  from all lookup default 

The main table, for what it's worth, looks like

sudo ip route show table main

default via 10.137.5.1 dev eth0 
10.137.2.19 dev vif3.0 scope link metric 32749 
10.137.5.1 dev eth0 scope link 
10.225.0.0/24 dev tus0 proto kernel scope link src 10.225.0.1 

and local table looks like

sudo ip route show table local

local 10.68.92.182 dev wg0 proto kernel scope host src 10.68.92.182 
local 10.137.2.1 dev vif3.0 proto kernel scope host src 10.137.2.1 
local 10.137.5.17 dev eth0 proto kernel scope host src 10.137.5.17 
broadcast 10.225.0.0 dev tus0 proto kernel scope link src 10.225.0.1 
local 10.225.0.1 dev tus0 proto kernel scope host src 10.225.0.1 
broadcast 10.225.0.255 dev tus0 proto kernel scope link src 10.225.0.1 
broadcast 10.255.255.255 dev eth0 proto kernel scope link src 10.137.5.17 
broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1 
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1 
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1 

For completeness's sake the rest of the custom tables are

sudo ip route show table 6969

default dev eth0 scope link

sudo ip route show table 6977

default dev tus0 scope link

sudo ip route show table 7777

default dev wg0 scope link

I may be misunderstanding something, but it would seem that when shadowsocks queries DNS on 127.0.0.1:53, the query should be routed by table local (priority 0) and connect all right.

But it apparently does not
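To make the expected lookup order concrete, here is a rough Python sketch that replays the rules and tables listed above. It is a simplified model, not the kernel's actual algorithm: broadcast routes are left out, and the names TABLES, RULES and lookup are made up for this illustration. The optional oif argument is an assumption of the sketch: it mimics a socket bound to one device by only accepting routes on that device. Nothing in this post confirms that this is what happens inside sslocal.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

# (prefix, device) pairs copied from the `ip route show table ...` output above.
TABLES = {
    "local": [("10.68.92.182/32", "wg0"), ("10.137.2.1/32", "vif3.0"),
              ("10.137.5.17/32", "eth0"), ("10.225.0.1/32", "tus0"),
              ("127.0.0.0/8", "lo")],
    "main": [("0.0.0.0/0", "eth0"), ("10.137.2.19/32", "vif3.0"),
             ("10.137.5.1/32", "eth0"), ("10.225.0.0/24", "tus0")],
    "6969": [("0.0.0.0/0", "eth0")],
    "6977": [("0.0.0.0/0", "tus0")],
    "7777": [("0.0.0.0/0", "wg0")],
    "default": [],
}


@dataclass
class Rule:
    priority: int
    table: str
    uid: Optional[int] = None             # "uidrange N-N"
    fwmark: Optional[int] = None          # "fwmark X"
    invert: bool = False                  # "not"
    suppress_prefixlength: Optional[int] = None


# Mirrors the `ip rule show` output above.
RULES = [
    Rule(0, "local"),
    Rule(1, "main", suppress_prefixlength=0),
    Rule(2, "6969", uid=13),
    Rule(4, "6977", fwmark=0x1E59),
    Rule(15, "7777", fwmark=0x1E59, invert=True),
    Rule(32766, "main"),
    Rule(32767, "default"),
]


def selector_matches(rule, uid, fwmark):
    hit = ((rule.uid is None or rule.uid == uid)
           and (rule.fwmark is None or rule.fwmark == fwmark))
    return hit != rule.invert


def best_route(table, dest, oif=None):
    """Longest-prefix match, optionally restricted to routes on device `oif`."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, dev in TABLES[table]:
        net = ipaddress.ip_network(prefix)
        if addr in net and (oif is None or dev == oif):
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, dev)
    return best


def lookup(dest, uid=0, fwmark=0, oif=None):
    """Return (rule priority, table, prefix, device) of the first usable route."""
    for rule in sorted(RULES, key=lambda r: r.priority):
        if not selector_matches(rule, uid, fwmark):
            continue
        route = best_route(rule.table, dest, oif)
        if route is None:
            continue
        net, dev = route
        # suppress_prefixlength N discards matches whose prefix length is <= N.
        if (rule.suppress_prefixlength is not None
                and net.prefixlen <= rule.suppress_prefixlength):
            continue
        return rule.priority, rule.table, str(net), dev
    return None
```

In this model, lookup("127.0.0.1") is answered by table local at priority 0, which is what I expected. With the assumed device restriction, lookup("127.0.0.1", oif="eth0") skips the lo routes and only finds the default route in main at priority 32766, while lookup("10.137.5.17", oif="eth0") still resolves through table local. Whether the real kernel lookup behaves like this in my setup is untested.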

It may be something broken in iptables (the system is old and gnarly and I'm still investigating), but adding an iptables rule to allow traffic to 127.0.0.1 from all interfaces

sudo iptables -I INPUT -d 127.0.0.1 -j ACCEPT

did not resolve the issue, which suggests that an old iptables rule is not at fault here.

I would very much appreciate it if someone could try reproducing this and/or explain why it works the way it does (well, at least it works, whew :-) ).

Edited to add, P.S.: this does not appear to be related to the TUN functionality of shadowsocks. I tested it the following way:

I ran a shadowsocks instance from Homebrew without the TUN feature and ran OpenVPN through it using the socks-proxy feature (which works cleanly). When OpenVPN's tun0 is already up, I get exactly the same "queries from shadowsocks to 127.0.0.1 go nowhere" behavior.

So basically: if I run regular sslocal as a SOCKS proxy before OpenVPN connects, DNS works, BUT if I try to restart shadowsocks after tun0 and the tun0-related routes are set up by OpenVPN, connections from shadowsocks to 127.0.0.1 go nowhere.

This indicates that, whatever it is, it is not caused by the TUN device feature.

zonyitoo commented 1 year ago

Did you try v1.16.0 (or compiling from master)? The trust-dns resolver in v1.15 doesn't support applying the outbound-* options to the DNS resolver's sockets, so --outbound-bind-interface won't take effect for the DNS resolver.

LindaFerum commented 1 year ago

Hm, it reproduces on the version I recently compiled to get the TUN functions, so it was compiled from master.

--version shows shadowsocks 1.16.0

Also, as far as I understand, the resolver ignoring --outbound-bind-interface would cause the opposite problem: a DNS server listening on the interface specified by --outbound-bind-interface would not be reachable, while localhost or another interface would be. In my case the opposite is observed.

chuxi commented 11 months ago

--outbound-bind-interface eth0

On Linux, it sets the SO_BINDTODEVICE option on the socket. However, it only works for IPv4 and is not supported for IPv6, so you may get a "network unreachable" error. Use --outbound-fwmark xxx to redirect all packets instead.
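One way to check what a device-bound socket can and cannot reach, independently of sslocal, is a small standalone probe. The Python sketch below is only an illustration: build_query and probe are hypothetical helper names, the query is a bare-bones A-record lookup, and setting SO_BINDTODEVICE may require root. It does not reproduce how shadowsocks-rust creates its resolver sockets.

```python
import socket
import struct
from typing import Optional

# Linux-only option; 25 is its value on most Linux architectures (fallback
# for Python builds that do not export the constant).
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def build_query(name: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of `name`."""
    # Flags 0x0100 = recursion desired; exactly one question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def probe(server: str, port: int, name: str,
          ifname: Optional[str] = None, timeout: float = 2.0) -> str:
    """Send one DNS query over UDP, optionally from a device-bound socket.

    Returns "reply", "timeout", or a "bind failed"/"send failed" message.
    Any datagram received counts as a reply; the answer is not parsed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        if ifname is not None:
            try:
                sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE,
                                ifname.encode() + b"\x00")
            except OSError as exc:
                return f"bind failed: {exc}"
        try:
            sock.sendto(build_query(name), (server, port))
            sock.recvfrom(512)
            return "reply"
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"send failed: {exc}"
```

For the setup in this issue, comparing probe("127.0.0.1", 53, "serverdomain.com") with probe("127.0.0.1", 53, "serverdomain.com", ifname="eth0"), and the same pair against 10.137.5.17 port 9999, would show whether binding to eth0 alone is enough to make the loopback resolver unreachable.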

LindaFerum commented 11 months ago

My DNS server listens on IPv4 (local) and the destination server is also IPv4, so the problem does not arise, but I will keep that in mind.