microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl

Split DNS problems #11321

Open leorg99 opened 4 months ago

leorg99 commented 4 months ago

Windows Version

Microsoft Windows [Version 10.0.22631.3155]

WSL Version

2.1.4.0

Are you using WSL 1 or WSL 2?

Kernel Version

5.15.146.1-2

Distro Version

Debian GNU/Linux 12 (bookworm)

Other Software

No response

Repro Steps

.wslconfig

[wsl2]
firewall=false
networkingMode=mirrored
memory=16GB 
processors=4
swap=4GB
nestedVirtualization=false
guiApplications=false

[experimental]
sparseVhd=true

/etc/wsl.conf:

[boot]
systemd=true

Expected Behavior

When connected to the VPN (Cisco AnyConnect), I expect the VPN nameservers to be listed first in /etc/resolv.conf, followed by the LAN nameserver.

Actual Behavior

The actual generated /etc/resolv.conf is:

nameserver 192.168.1.1
nameserver vpn nameserver1
nameserver vpn nameserver2
search host1 host2 host3 etc

When the VPN has an internal domain that also exists publicly, name resolution fails because it tries the LAN/public nameserver first. When I rearrange the nameserver order so that the VPN nameservers are first, everything gets resolved properly.

The Windows host already appears to do this when the VPN adapter (Ethernet 3) is connected:

InterfaceAlias               InterfaceIndex ConnectionSpecificSuffix ConnectionSpecificSuffixSearchList RegisterThisConnectionsAddress UseSuffixWhenRegistering
--------------               -------------- ------------------------ ---------------------------------- ------------------------------ ------------------------
Ethernet 3                               22 core.dc                  {}                                 True                           False
Local Area Connection* 1                  7                          {}                                 True                           False
Local Area Connection* 2                 17                          {}                                 True                           False
Wi-Fi                                     5                          {}                                 True                           False
Loopback Pseudo-Interface 1               1                          {}                                 True                           False
vEthernet (Default Switch)               27                          {}                                 False                          False
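The manual reordering described above (VPN nameservers ahead of the LAN one) could be sketched roughly as follows. This is only an illustration of the idea, not something WSL does: `prioritize_nameservers` is a made-up helper name, and the 10.125.40.x addresses stand in for the VPN nameservers. Note that WSL regenerates /etc/resolv.conf unless generateResolvConf is disabled in /etc/wsl.conf, so a rewritten file would not persist by itself.

```python
def prioritize_nameservers(resolv_text, preferred):
    """Return resolv.conf text with the preferred nameservers listed first.

    Only the `nameserver` lines are reordered; every other line (search,
    options, comments) keeps its original position.
    """
    lines = resolv_text.splitlines()

    def address(line):
        # Return the address of a `nameserver` line, or None for other lines.
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            return parts[1]
        return None

    # Positions in the file currently occupied by nameserver lines.
    slots = [i for i, line in enumerate(lines) if address(line) is not None]

    def rank(i):
        # Preferred servers sort by their position in `preferred`;
        # all others keep their relative order after them (sorted is stable).
        addr = address(lines[i])
        return preferred.index(addr) if addr in preferred else len(preferred)

    ordered = [lines[i] for i in sorted(slots, key=rank)]
    for slot, line in zip(slots, ordered):
        lines[slot] = line
    return "\n".join(lines) + "\n"


# Sample input modelled on the resolv.conf above (addresses are placeholders).
sample = (
    "nameserver 192.168.1.1\n"
    "nameserver 10.125.40.51\n"
    "nameserver 10.125.40.52\n"
    "search core.dc\n"
)
reordered = prioritize_nameservers(sample, ["10.125.40.51", "10.125.40.52"])
```

Because the resolver tries nameservers in the order listed, this only changes which server is asked first; it does not route different domains to different servers.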

Diagnostic Logs

No response

leorg99 commented 3 months ago

I think rearranging the nameservers breaks other things. I am not sure what the correct solution is. Most of the internal VPN domains resolve properly, except for ones ending in .internal. They resolve on the Windows host with the VPN adapter, but not in WSL 2 with Debian.

leorg99 commented 3 months ago

In the end, I made it work using dnsmasq.

With networkingMode=mirrored and VPN connected (via Windows Cisco AnyConnect), my resolv.conf looked like:

nameserver 192.168.1.1
nameserver vpn nameserver1
nameserver vpn nameserver2
search host1 host2 host3 etc

This worked for the most part, except for .internal names, which seemed to go through 192.168.1.1 and not the corp VPN DNS servers. I also had an issue with docker run --rm -it alpine:edge apk update failing to resolve dl-cdn.alpinelinux.org, which went away when I disconnected from the VPN.

With networkingMode=mirrored and dnsTunneling=true, and trying different combinations of useWindowsDnsCache and bestEffortDnsParsing, I had all kinds of weird DNS issues. This was even less usable than with just networkingMode=mirrored.

In the end, I installed dnsmasq. My /etc/dnsmasq.conf looks like:

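# never forward plain names (no dots or domain part) to upstream servers
domain-needed
# do not read upstream servers from /etc/resolv.conf; use only the server= lines below
no-resolv
# never forward reverse lookups for private IP ranges to upstream servers
bogus-priv

# listen only on this loopback address
listen-address=127.0.0.53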

# specifically do not provide any other service than DNS
no-dhcp-interface=lo

# only bind to the interface(s) listed above
# default is to bind to all interfaces even though you specified only one above
bind-interfaces

# enable DNS Cache and adjust cache-size
cache-size=1000

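# log every query (useful while debugging)
log-queries

server=8.8.8.8            # default upstream for everything else
server=/foo/10.125.40.51  # resolve foo with corp dns1
server=/foo/10.125.40.52  # resolve foo with corp dns2
server=/bar/10.125.40.51  # resolve bar with corp dns1
server=/bar/10.125.40.52  # resolve bar with corp dns2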
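To make the routing in the server= lines concrete, here is a rough Python model of how a query name gets matched to an upstream: the most specific matching domain suffix wins, and anything unmatched falls through to the plain server= entry. This is a simplified illustration, not dnsmasq's actual implementation. `parse_servers` and `pick_upstreams` are made-up helper names, and the parser only handles the single-domain server=/domain/ip form used in this config.

```python
def parse_servers(conf_text):
    """Collect server= lines into {domain: [upstreams]}; "" is the default."""
    routes = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line.startswith("server="):
            continue
        value = line[len("server="):]
        if value.startswith("/"):
            # server=/domain/ip  (single-domain form only, by assumption)
            _, domain, upstream = value.split("/", 2)
        else:
            domain, upstream = "", value
        routes.setdefault(domain.lower(), []).append(upstream)
    return routes


def pick_upstreams(name, routes):
    """Return the upstreams for the longest domain suffix matching `name`."""
    labels = name.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])  # whole labels only: "foobar" != "foo"
        if suffix in routes:
            return routes[suffix]
    return routes.get("", [])


conf = """
server=8.8.8.8
server=/foo/10.125.40.51
server=/foo/10.125.40.52
server=/bar/10.125.40.51
server=/bar/10.125.40.52
"""
routes = parse_servers(conf)
corp = pick_upstreams("host.foo", routes)                  # corp DNS servers
public = pick_upstreams("dl-cdn.alpinelinux.org", routes)  # default upstream
```

This is the behaviour that a plain list of nameservers in /etc/resolv.conf cannot express: the choice of server depends on the name being looked up, not on list order.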

I updated /etc/wsl.conf:

[boot]
systemd=true
[network]
generateResolvConf = false

and changed /etc/resolv.conf to:

nameserver 127.0.0.53

I don't remember whether I had to enable the dnsmasq service, but if so the command would be:

sudo systemctl enable dnsmasq

If it was already running, you have to restart it to load the changed config file:

sudo systemctl restart dnsmasq

I also tried binding dnsmasq with interface=lo. Even though I could clearly see that dnsmasq was the only process listening on 127.0.0.1:53 and ::1:53, something was blocking or filtering connections to these addresses. Binding to 127.0.0.53 was the only way to make it work. This could be related to #11312.

Everything now seems to resolve as expected.
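One way to spot-check a setup like this from inside the distro is a small script that resolves a list of names through the system resolver and reports which ones fail. This is a minimal sketch: `check_names` is a made-up helper name, and it relies on `socket.gethostbyname`, which goes through the normal system lookup path (and therefore through whatever nameserver /etc/resolv.conf points at).

```python
import socket


def check_names(names, resolve=socket.gethostbyname):
    """Resolve each name and return {name: address or None on failure}."""
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            results[name] = None
    return results
```

For example, `check_names(["dl-cdn.alpinelinux.org", "host.foo"])` would show whether a public name and an internal one both resolve, where host.foo is a placeholder for a real internal hostname. With log-queries enabled in dnsmasq, the service log should also show which upstream each query was forwarded to.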

chanpreetdhanjal commented 3 months ago

Please follow the instructions for the networking diagnostic script: https://github.com/microsoft/WSL/blob/master/CONTRIBUTING.md#collect-wsl-logs-for-networking-issues

It should create an archive named like WslNetworkingLogs-date_ver.zip.

CatalinFetoiu commented 3 months ago

Hi @LeorGreenberger, thanks for reporting the issue.

When collecting the logs Chanpreet asked for, please have dnsTunneling set to true and dnsmasq disabled or uninstalled, so that we can troubleshoot why dnsTunneling did not work for your scenario. Thanks.

CatalinFetoiu commented 3 months ago

We had a different report of split DNS issues with DNS tunneling, https://github.com/microsoft/WSL/issues/10680, which has been root-caused and which we are working to fix.

The root cause could be similar here.

CatalinFetoiu commented 2 months ago

We have made a fix for split DNS problems with DNS tunneling (fixing https://github.com/microsoft/WSL/issues/10680). Can you please install KB5036980, published at https://support.microsoft.com/en-us/topic/windows-11-version-22h2-update-history-ec4229c3-9c5f-4e75-9d6d-9025ab70fcce, and see whether the issue still reproduces?

Please make sure to have DNS tunneling enabled when testing.

If you still encounter issues after installing the KB, please attach networking logs. Thanks.