AmmarRahman / wsl-vpn

Apache License 2.0

fedora: resolv.conf rewrite doesn't work #8

Closed by xasx 3 years ago

xasx commented 3 years ago

Description

When I start the script under fedoraremix, I get an error that the pipe cannot be found, and as a result resolv.conf is not rewritten properly. When I use Debian as the bridge instead (where everything works), the resolv.conf inside fedoraremix is not rewritten either.

Steps to reproduce

with Debian as the bridge

edit: see below: conversion to WSL 2 was missing

  1. Install Debian WSL
  2. Follow the instructions and install wsl-vpn into Debian, start it
  3. Switch the VPN on and off to confirm that it is working
  4. Start fedoraremix wsl
  5. Switch the VPN on and off to see that it is not working there; resolv.conf remains the same

without Debian

edit: this happened after conversion in Debian as well

  1. Meet the prerequisites in fedora since the setup script does not work (maybe I missed something?)
  2. Run the start script, the output shows an error
$ sudo ~/winhome/Git/wsl-vpn/wsl-vpnkit-start.sh
successfully created TAP device eth1
starting in connect mode with path=/var/run/wsl-vpnkit.sock and tap=eth1
RTNETLINK answers: File exists
wsl-vpnkit.exe: [INFO] Setting handler to ignore all SIGPIPE signals
wsl-vpnkit.exe: [INFO] Version is 218f01482560cba2fa863f9ad872ad51d1e717fc
wsl-vpnkit.exe: [INFO] System SOMAXCONN is 2147483647
wsl-vpnkit.exe: [INFO] Will use a listen backlog of 32
wsl-vpnkit.exe: [INFO] No periodic Gc.compact enabled
wsl-vpnkit.exe: [WARNING] There is no database: using hardcoded network configuration values
wsl-vpnkit.exe: [INFO] DNS server configured with no builtin DNS names; everything will be forwarded
wsl-vpnkit.exe: [INFO] 2 upstream DNS servers are configured
wsl-vpnkit.exe: [ERROR] While watching /etc/resolv.conf: ENOENT
wsl-vpnkit.exe: [INFO] Disabling transparent HTTP redirection
wsl-vpnkit.exe: [INFO] Updating resolvers to use host resolver
wsl-vpnkit.exe: [INFO] Secure random number generator is available
wsl-vpnkit.exe: [INFO] Add(3): DNS configuration changed to: use host resolver
wsl-vpnkit.exe: [INFO] DNS server configured with builtin DNS names [ gateway.internal -> 192.168.67.1, host.internal -> 192.168.67.2, vm.internal -> 192.168.67.3 ]
wsl-vpnkit.exe: [INFO] Will use the host's DNS resolver
wsl-vpnkit.exe: [INFO] New Gateway forward configuration: []
wsl-vpnkit.exe: [INFO] Configuration server_macaddr = f6:16:36:bc:f9:c6; max_connection = None; dns_path = None; dns = ; resolver = Host; domain = None; allowed_bind_addresses = 0.0.0.0; gateway_ip = 192.168.67.1; host_ip = 192.168.67.2; lowest_ip = 192.168.67.3; highest_ip = 192.168.67.14; dhcp_json_path = None; dhcp_configuration = None; mtu = 1500; http_intercept = None; http_intercept_path = None; port_max_idle_time = 300; host_names = host.internal; gateway_names = gateway.internal; vm_names = vm.internal; udpv4_forwards = []; tcpv4_forwards = []; gateway_forwards_path = None; pcap_snaplen = 128
wsl-vpnkit.exe: [INFO] C:\Windows\System32\drivers\etc\hosts file has bindings for
2021/09/07 11:34:39 open //./pipe/wsl-vpnkit: The system cannot find the file specified.
EOF reading from socket: closing

Failed to read hello from client
Failed to negotiate vmnet connection
wsl-vpnkit.exe: [INFO] Vmnet.Server.negotiate: received { magic = VMN3T; version = 22; commit =  }
Server reports version 22, commit 0123456789012345678901234567890123456789
wsl-vpnkit.exe: [INFO] Generated UUID on behalf of client: fc2e1290-83ed-45c4-8701-48af18399cdb
wsl-vpnkit.exe: [INFO] Vmnet.Server.negotiate: received Ethernet fc2e1290-83ed-45c4-8701-48af18399cdb
wsl-vpnkit.exe: [INFO] Vmnet.Server.negotiate: sending { mtu = 1500; max_packet_size = 1550; client_macaddr = 02:50:00:00:00:01 }
wsl-vpnkit.exe: [INFO] Vmnet.Server.listen: rebinding the primary listen callback
wsl-vpnkit.exe: [INFO] Vmnet.Server.listen: starting event loop
wsl-vpnkit.exe: [INFO] Connected Ethernet interface f6:16:36:bc:f9:c6
VMNET VIF has MAC 02:50:00:00:00:01wsl-vpnkit.exe: [INFO] Client mac: 02:50:00:00:00:01 server mac: f6:16:36:bc:f9:c6

wsl-vpnkit.exe: [INFO] TCP/IP ready
wsl-vpnkit.exe: [INFO] TCP/IP stack connected

Expected behavior

in fedoraremix

Actual behavior

in fedoraremix

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 192.168.208.1

Debian (w/o VPN)

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 192.168.178.1
nameserver fd00::ca0e:14ff:fe4d:f321
nameserver fec0:0:0:ffff::1
search fritz.box

xasx commented 3 years ago

Here is a list of WSLs for simple installation: https://wsldl-pg.github.io/docs/Using-wsldl/#distros

Using Debian as a bridge, been unsuccessful so far with:

Not sure, maybe an init system issue, however it should only be relevant in Debian which starts the service.

xasx commented 3 years ago

I didn't convert Debian to WSL 2. After doing so, I get the same error from the script:

...
wsl-vpnkit.exe: [ERROR] While watching /etc/resolv.conf: ENOENT
...
wsl-vpnkit.exe: [INFO] C:\Windows\System32\drivers\etc\hosts file has bindings for
2021/09/09 15:51:48 open //./pipe/wsl-vpnkit: The system cannot find the file specified.
EOF reading from socket: closing

Failed to read hello from client
Failed to negotiate vmnet connection
...

xasx commented 3 years ago

I am closing this issue.

It is still not working with the VPN switched on, but that probably has less to do with resolv.conf rewriting.

andyneff commented 3 years ago

We do not rewrite the resolv.conf file anymore. It is no longer needed, and keeping that file "rewritten" brings more complications (https://github.com/andyneff/wsl2-dns-search is a different project where I do constantly rewrite it).

I don't think fedora uses "resolv" (no e, as in resolv.conf) but uses "resolve" (with e) now by default. As I'm new to "resolve", I barely know what this means.

There may be a way to change this using nsswitch.conf

andyneff commented 3 years ago

Sorry for the late reply.

I actually use fedora remix with Ubuntu (pretty much the same as Debian) as a bridge, and it works. So that is odd.

[root@kaku andy]# cat /etc/os-release
NAME="Fedora Remix for WSL"
VERSION="34"
ID=fedoraremixforwsl
ID_LIKE=fedora
VERSION_ID=34
PLATFORM_ID="platform:f34"
PRETTY_NAME="Fedora Remix for WSL"
ANSI_COLOR="0;34"
CPE_NAME="cpe:/o:fedoraproject:fedora:34"
HOME_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL"
SUPPORT_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL"
BUG_REPORT_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL/issues"
PRIVACY_POLICY_URL="https://github.com/WhitewaterFoundry/Fedora-Remix-for-WSL/blob/master/PRIVACY.md"
FEDORA_REMIX_VERSION=34.5.6
[root@kaku andy]# curl ifconfig.me
71.***.***.***[root@kaku andy]#
[root@kaku andy]# curl ifconfig.me

108.***.***.***[root@kaku andy]#
[root@kaku andy]# curl ifconfig.me
71.***.***.***[root@kaku andy]#

I jumped on and off of VPN and it continued to work.


And yes, both my ubuntu and fedoraremix are on WSL 2


Here is my Ubuntu resolv.conf

# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 172.22.160.1

Which is pretty much identical to my fedoraremix resolv.conf


My hosts config on fedora remix

[root@kaku andy]# grep ^host /etc/nsswitch.conf
hosts:      files myhostname resolve [!UNAVAIL=return] dns

ipconfig

...
Ethernet adapter vEthernet (WSL):

   Connection-specific DNS Suffix  . :
   Link-local IPv6 Address . . . . . : fe80::4919:5ddf:2eb5:5e8f%45
   IPv4 Address. . . . . . . . . . . : 172.22.160.1
   Subnet Mask . . . . . . . . . . . : 255.255.240.0
   Default Gateway . . . . . . . . . :
...

So the IP address I have in there matches my resolv.conf defaults

xasx commented 3 years ago

@andyneff thank you for sharing.

I just set up a fresh Ubuntu and installed wsl-vpn into it. But as soon as I connect to VPN, I lose DNS in every WSL2 distro.

General network access works though:

64 bytes from fra16s52-in-f3.1e100.net (142.250.185.195): icmp_seq=30 ttl=38 time=18.4 ms
64 bytes from fra16s52-in-f3.1e100.net (142.250.185.195): icmp_seq=31 ttl=38 time=108 ms
64 bytes from fra16s52-in-f3.1e100.net (142.250.185.195): icmp_seq=32 ttl=38 time=110 ms
64 bytes from fra16s52-in-f3.1e100.net (142.250.185.195): icmp_seq=33 ttl=38 time=56.6 ms
64 bytes from fra16s52-in-f3.1e100.net (142.250.185.195): icmp_seq=34 ttl=38 time=19.4 ms
64 bytes from 142.250.185.195: icmp_seq=40 ttl=38 time=616 ms
64 bytes from 142.250.185.195: icmp_seq=41 ttl=38 time=25.4 ms
64 bytes from 142.250.185.195: icmp_seq=42 ttl=38 time=21.8 ms
64 bytes from 142.250.185.195: icmp_seq=43 ttl=38 time=30.2 ms

(guess when VPN has been connected)

andyneff commented 3 years ago

Some antivirus firewalls are notorious for blocking DNS, especially with a VPN in the mix.

Ping working is actually a pretty good sign! Since network traffic is working, can you verify DNS (UDP port 53) is even being allowed out?

The command nslookup www.google.com 8.8.8.8 bypasses any resolv.conf setting and queries 8.8.8.8 (or any other DNS server IP you try) directly for a result. On Fedora, nslookup can be installed via the bind-utils package.

[root@kaku andy]# nslookup www.google.com 8.8.8.8
Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   www.google.com
Address: 142.250.65.196
Name:   www.google.com
Address: 2607:f8b0:4006:806::2004

xasx commented 3 years ago

@andyneff works:

Server:         8.8.8.8
Address:        8.8.8.8#53

Non-authoritative answer:
Name:   www.google.com
Address: 142.250.185.228
Name:   www.google.com
Address: 2a00:1450:4001:813::2004

with VPN connected.

andyneff commented 3 years ago

Am I correct in understanding that your Debian running wsl-vpn is now on WSL2? Does Debian (WSL2) have the same resolv.conf as fedoraremix (WSL2) now?

xasx commented 3 years ago

yes, all WSL2s have the same resolv.conf.

andyneff commented 3 years ago

Does this fail in fedoraremix?

nslookup www.google.com 192.168.208.1

Where 192.168.208.1 is whatever the IP is in resolv.conf?
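
The probes asked for so far (a public resolver, then whatever nameserver resolv.conf lists) can be run in one pass. The following is a minimal sketch and not part of wsl-vpn: list_nameservers and probe_dns are hypothetical helper names, and it assumes nslookup exits non-zero when a server cannot be reached.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Print the nameserver addresses listed in a resolv.conf-style file.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Query one name against each given server and report which ones answer.
# Assumption: nslookup returns a non-zero status on timeout or failure.
probe_dns() {
    name="$1"; shift
    for server in "$@"; do
        if nslookup "$name" "$server" >/dev/null 2>&1; then
            echo "ok      $server"
        else
            echo "FAILED  $server"
        fi
    done
}

# Example (left commented out so sourcing this file has no side effects):
# probe_dns www.google.com 8.8.8.8 $(list_nameservers)
```

Running the example once off VPN and once on VPN shows at a glance which servers stop answering when the VPN connects.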

andyneff commented 3 years ago

Also, what do you get when you run:

grep ^hosts: /etc/nsswitch.conf

xasx commented 3 years ago

root in ~
❯ nslookup www.google.com 172.29.224.1
Server:         172.29.224.1
Address:        172.29.224.1#53

Non-authoritative answer:
Name:   www.google.com
Address: 142.250.184.196
Name:   www.google.com
Address: 2a00:1450:4001:831::2004

### after vpn connected

root in ~
❯ nslookup www.google.com 172.29.224.1
;; connection timed out; no servers could be reached

### refs

root in ~ took 15s
❯ grep ^hosts: /etc/nsswitch.conf

hosts:      files myhostname resolve [!UNAVAIL=return] dns

root in ~
❯ cat /etc/resolv.conf
# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 172.29.224.1

andyneff commented 3 years ago

@AmmarRahman Can you think of any reason why 8.8.8.8 would work while on VPN, but the original DNS server (172.29.224.1 in this case) would not work while on VPN?

andyneff commented 3 years ago

@xasx I'm running out of good ideas. May I ask which VPN client you are using?

One last thing to double-check: while wsl-vpn is running and you are on the VPN, ip r should say:

# ip r
default via 192.168.67.1 dev eth1
192.168.67.0/24 dev eth1 proto kernel scope link src 192.168.67.3
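
A small check for the route shown above, as a sketch: default_route_via and check_wsl_vpn_route are hypothetical helper names, and 192.168.67.1 is simply the gateway from the expected output. Both read `ip route` output on stdin, so they can also be tried against saved output.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Print "<gateway> <device>" for the first default route found on stdin.
default_route_via() {
    awk '$1 == "default" && $2 == "via" { print $3, $5; exit }'
}

# Succeed only if the default gateway equals the expected address
# (defaults to 192.168.67.1, the gateway shown in the thread).
check_wsl_vpn_route() {
    expected_gw="${1:-192.168.67.1}"
    route="$(default_route_via)"
    [ "${route%% *}" = "$expected_gw" ]
}

# Example:
# ip route | check_wsl_vpn_route && echo "default route goes through wsl-vpn"
```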

I really don't think it's a firewall issue, as the Debian wouldn't be working either.

If editing your /etc/resolv.conf files and adding one of your actual DNS servers does work, you can try to go that route, but this is less than ideal.

@sakai135 Have you seen anything like this on your wsl-vpnkit?

xasx commented 3 years ago

> I really don't think it's a firewall issue, as the Debian wouldn't be working either.

Just to be clear on that: it is not working under Debian either, with exactly the same behavior.

VPN is Cisco, its installation is managed by the IT service through MS Endpoint Manager. I had no chance setting up the Store version to connect. McAfee Endpoint Security is active, as well as Windows firewall.

Cisco in this setup works by enabling and disabling a network adapter; I'm not sure how that plays in.

Is it most likely the firewall(s) then?

I'll try editing resolv.conf tomorrow.

root in ~
❯ ip r
default via 192.168.67.1 dev eth1
192.168.67.0/24 dev eth1 proto kernel scope link src 192.168.67.3

andyneff commented 3 years ago

Ah, good to know that Debian (now that it is in WSL2 too) is failing too. I think it is highly likely it is the firewall. We had McAfee Endpoint previously and, in my experience, it will 100% block the DNS port in WSL2. McAfee's official stance is "We don't support WSL2" and ends the statement there, as if that were an acceptable resolution.

I'm actually surprised 8.8.8.8 worked, I thought I remembered that not working.

I never figured out how to make a rule for McAfee, so I can't help you there. We had to "disable" McAfee Firewall (which doesn't actually disable your firewall, instead it returns the firewall to the default built-in Windows Defender) and that is how we got DNS working on WSL2 on VPN

andyneff commented 3 years ago

If your IT allows it, the easiest way to prove it is the firewall is to disable it. McAfee is a little tricky on that subject: when you disable the firewall, it re-enables itself at random within the next 1-15 (?) minutes, and you don't know when it happens. So it's best to:

  1. Disable McAfee. Close the McAfee window (this is important to get a proper update)
  2. Do the test
  3. Open up the McAfee window again to check whether McAfee is still disabled
  4. If McAfee was already re-enabled in step 3, repeat until you know you truly did a test with McAfee disabled ☹

Additional Info

Sonicwall has two completely separate VPN clients, Mobile Connect and NetExtender.

According to my notes:

                 WSL 1           WSL 2
Mobile Connect   Does not work   Works (only without McAfee)
NetExtender      Works           Works (only with wsl-vpn)

These notes seem to suggest that wsl-vpn got around the McAfee problem, but I don't have that clearly documented anywhere (so it's possible I still have to disable McAfee to get it working), and I no longer have McAfee to test against. My point is this could also vary from VPN client to VPN client, so if the Cisco client in the Windows Store is another variant of the Cisco software, it is worth a try.

AmmarRahman commented 3 years ago

Sorry I just came back from holiday. Seems like a DNS resolution issue. The debugging steps I would take would be:

  1. Check the following files, as per the Fedora Remix README: "The following configuration files have custom settings applied for the WSL environment: /etc/wsl.conf, /etc/local.conf, /etc/profile."
  2. Check if the connection is working from Windows cmd
  3. Check if it is connecting from a Docker container

In theory, McAfee should be treating both Docker and wsl-vpn with the same disrespect.

sakai135 commented 3 years ago

@andyneff With the VPN config I have, all network traffic goes through the VPN with few exceptions. Requests to the WSL2 virtual adapter are routed through the VPN as well and fail because the adapter is inaccessible from the VPN. 8.8.8.8 works because it is accessible through the VPN. @xasx might be dealing with a similar setup.

xasx commented 3 years ago

@sakai135 do you patch your resolv.conf then?

@andyneff I can't disable McAfee, unfortunately. Thank you for helping anyway.

Maybe I can dig up some more key details about this later, for more insight.

sakai135 commented 3 years ago

@xasx yes, I set generateResolvConf = false and use my script that edits resolv.conf.

andyneff commented 3 years ago

@xasx does nslookup www.google.com 192.168.67.1 work when on VPN?

xasx commented 3 years ago

@andyneff yes, it does - on and off VPN.

xasx commented 3 years ago

I'll try going with this IP address as the nameserver - should I?

andyneff commented 3 years ago

Great! Then it sounds like a course of action for you is to:

  1. Edit /etc/wsl.conf in every WSL2 image to set generateResolvConf = false under the [network] section
  2. Restart WSL at this point for the change to take immediate effect. The easiest way is to restart all of it: wsl --shutdown, which shuts down all running WSL distros
  3. Reopen any WSL windows that were killed, which starts those WSL distros up again
  4. Edit /etc/resolv.conf in every WSL2 image and add nameserver 192.168.67.1

Additional

If your company has "domain search paths", you can also add that to your resolv.conf. For example: if ping foobar usually pings foobar.example.com, then you could add search example.com to your resolv.conf.

We used a constant IP 192.168.67.1, so it should not need to be updated going forward.

Update: On step four, if /etc/resolv.conf is still a symlink, you probably want to unlink /etc/resolv.conf and then create a new file
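
The steps above can be sketched as two shell functions. This is an illustration under assumptions, not the project's script: the function names are hypothetical, both take the file path as an argument so they can be tried on copies first, and write_static_resolv_conf replaces the whole file rather than appending to it.

```shell
#!/bin/sh
# Sketch only: function names are made up; run as root for the real files.

# Ensure "generateResolvConf = false" sits under [network] in a wsl.conf-style file.
disable_generated_resolv_conf() {
    conf="$1"
    [ -e "$conf" ] || : > "$conf"      # create an empty file if it is missing
    tmp="$(mktemp)"
    awk '
        /^[ \t]*generateResolvConf[ \t]*=/ { next }   # drop any previous value
        { print }
        /^\[network\]/ { print "generateResolvConf = false"; done = 1 }
        END { if (!done) print "[network]\ngenerateResolvConf = false" }
    ' "$conf" > "$tmp"
    cat "$tmp" > "$conf"               # overwrite in place, keeping owner and mode
    rm -f "$tmp"
}

# Write a static resolv.conf: one nameserver and an optional search domain.
write_static_resolv_conf() {
    resolv="$1"
    nameserver="$2"
    search="${3:-}"
    # If the file is still a symlink, remove the link instead of writing through it.
    if [ -L "$resolv" ]; then
        rm -f "$resolv"
    fi
    {
        printf 'nameserver %s\n' "$nameserver"
        if [ -n "$search" ]; then
            printf 'search %s\n' "$search"
        fi
    } > "$resolv"
}

# Example, in the order of the steps above (example.com is a placeholder):
# disable_generated_resolv_conf /etc/wsl.conf    # then run wsl --shutdown from Windows
# write_static_resolv_conf /etc/resolv.conf 192.168.67.1 example.com
```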

xasx commented 3 years ago

@andyneff thanks for your detailed explanation and support. Highly appreciated 🥇 This is going to help even more people to work around the nasty issue.

andyneff commented 3 years ago

@xasx Great!

We used to edit resolv.conf for you. Our small sample set suggested it wasn't needed, but apparently it is still needed for other scenarios, as you have shown ;)

@sakai135 Thank you! I forgot 192.168.67.1 was a DNS server too.

@AmmarRahman How would you like to patch this moving forward? Whether it's a firewall or VPN causing this is pretty immaterial. It seems like using 192.168.67.1 is sometimes quite useful.

Some ideas:

  1. Just document it so people can do it manually
  2. Make a script that they call separately, and it would patch any WSL2 distros currently installed
  3. Make a flag for the install that will patch all the WSL2 distros every time the service is started.
    • This would be sakai135's solution with a flag to make it optional
    • To handle "I just installed a new WSL2", document that the user should run wsl --shutdown, and that will help resync everything
  4. Same as 3, but use the profile file
    • As it could take one or a few seconds to check all the WSLs, this would probably be backgrounded so that you don't have a long delay every time you start a bash session
    • This would make it too easy to "start two bashes" in quick succession, so we would have to implement some locking just to make sure two bashes aren't editing the same WSL files at the same time
  5. Same as 3/4, but just always do it, not based on a flag
  6. Other ideas ;)

I'm partial to 3. 4 seems to add extra layers of complexity without a significant benefit.
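
For comparison, the locking that option 4 would need might look like the sketch below. It uses a plain mkdir-based lock (mkdir either creates the directory or fails, so only one caller wins). run_once, the lock path, and patch_all_distros are hypothetical, and a stale lock left behind by a killed shell is not handled.

```shell
#!/bin/sh
# Sketch only: a directory is used as the lock because mkdir is atomic.

# Run a command only if no other shell currently holds the lock.
run_once() {
    lockdir="$1"; shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        return 0          # another shell holds the lock; skip quietly
    fi
    status=0
    "$@" || status=$?     # run the command, remembering its exit status
    rmdir "$lockdir"      # release the lock
    return "$status"
}

# Example for a profile file (patch_all_distros is a hypothetical function):
# run_once /tmp/wsl-vpn-dns.lock patch_all_distros &
```

Even this small version needs stale-lock cleanup before it could be relied on, which is part of the extra complexity that option 4 brings.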

AmmarRahman commented 3 years ago

@andyneff The first iteration I wrote of this script did have the DNS change as part of it. Some users did complain that it broke their set up when the VPN was off. I don't exactly remember what the issue was, but I found it more reliable without the DNS change at the time.

I'm thinking maybe option 1 can be added as a debugging solution