utmapp / UTM

Virtual machines for iOS and macOS
https://getutm.app
Apache License 2.0

Automatic DNS doesn't work in the guest #2353

Open conath opened 3 years ago

conath commented 3 years ago

In all of my UTM VMs, the DNS server that is automatically used by the guest doesn't work. There is no DNS response, meaning many things break. Most notably this includes the Windows 10 OOBE (first time setup). I have to manually specify the DNS settings to point to either my local DNS server or an external one like 1.1.1.1 to get my VMs "online".

For example, using a macOS Leopard VM, the automatic DNS server determined by DHCP is 10.0.2.3. However, a request of dig github.com @10.0.2.3 (to make sure it doesn't use another server) times out. Requesting the DNS lookup from 1.1.1.1 instead works just fine.

Screenshot of OS X Leopard Terminal. The program DIG is run twice, one time with the local DNS server 10.0.2.3 and one time with the global DNS server 1.1.1.1. The first request times out and the second one works fine.
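For guests without dig, the same check can be approximated with a short script. This is a minimal sketch, assuming Python 3 is available in the guest; `build_query` and `probe` are hypothetical helper names, not part of UTM or QEMU. It sends one A-record query over UDP port 53 to a specific server and reports whether any response arrives before the timeout.

```python
import secrets
import socket
import struct


def build_query(name: str, query_id: int) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def probe(server: str, name: str = "github.com", timeout: float = 3.0) -> bool:
    """Return True if `server` answers the query over UDP port 53 in time."""
    query_id = secrets.randbelow(0x10000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, query_id), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-network errors
            return False
    if len(reply) < 4:
        return False
    # A genuine reply echoes the query ID and has the QR (response) bit set
    reply_id, flags = struct.unpack("!HH", reply[:4])
    return reply_id == query_id and bool(flags & 0x8000)
```

Calling `probe("10.0.2.3")` and `probe("1.1.1.1")` from the guest should mirror the two dig runs above: the first returns False if the request times out, the second returns True if the external server answers.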

Configuration

No example VM is provided with this issue because literally any of the VMs I have used have had this problem. Tested configurations include:

The common variable? My home network setup.

My IPv4 settings

subnet 192.168.0.0/24
router 192.168.0.1
dns 192.168.0.54
broadcast 192.168.0.255

Notably, the router and DNS server are not the same. I have set up my router to automatically inform all network clients of my DNS server's IP and it works just fine with every device I own. However it doesn't work for UTM, apparently.

Is there a manual option I need to use to make automatic DNS settings work with my network setup?

osy commented 3 years ago

So the slirp library emulates a NIC which creates a VLAN on 10.0.2.0/24. When it sees a TCP connection to 10.0.2.3, it uses libresolv to get the list of DNS servers directly from the system (on your host network). So in your case it would find 192.168.0.54.

So when the guest opens a socket to 10.0.2.3 port 53, QEMU (using slirp) opens a socket to 192.168.0.54 port 53 and proxies the traffic between them. This works fine for me but I don't know why it's not working for you (and others as I've heard from many people). If you have Wireshark or tcpdump, I would be curious if you can see if the DNS packets make it out of the guest at all. Alternatively, your DNS host on the router is rejecting the packets because it somehow thinks (knows) that the IP was spoofed (although I don't think DNS protocol would know considering layer 3 and below are all "correct").
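The proxying described above can be illustrated with a small relay. This is only a sketch of the idea for a single UDP datagram, not libslirp's actual code; `relay_once` is a hypothetical name, and the upstream address would be whatever the host reports as its DNS server (192.168.0.54 in the setup from this issue).

```python
import socket


def relay_once(listen_sock: socket.socket, upstream: tuple,
               timeout: float = 3.0) -> bool:
    """Forward one datagram from a client to `upstream` and relay the reply.

    Returns False if the upstream server did not answer within `timeout`,
    in which case the client receives nothing and sees a timeout itself.
    """
    # Wait for a query on the "virtual DNS" socket (10.0.2.3:53 in slirp terms)
    query, client = listen_sock.recvfrom(512)
    # Open a fresh socket towards the real DNS server on the host network
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as out:
        out.settimeout(timeout)
        try:
            out.sendto(query, upstream)
            reply, _ = out.recvfrom(512)
        except OSError:
            return False
    # Hand the answer back to whoever asked
    listen_sock.sendto(reply, client)
    return True
```

If the query never leaves the relay, or the upstream server never answers, the client simply times out, which is the symptom reported here.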

EDIT: Another possibility is that libresolv isn't working for you for some reason. Try stealing the following code https://github.com/utmapp/libslirp/blob/5ac17660a76e321c37e6dca2e3c04d9d7d2b7ff4/src/slirp.c#L134-L186 into a test project and link with -lresolv.
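As a quick baseline to compare against the libresolv test, the nameservers listed in /etc/resolv.conf on the host can be printed with a few lines of Python. This is a rough sketch and not the libresolv code path: on macOS that file is generated by the system and may differ from what libresolv reports, so a mismatch between the two would itself be worth noting. `parse_nameservers` and `host_nameservers` are hypothetical helper names.

```python
from pathlib import Path


def parse_nameservers(resolv_conf_text: str) -> list:
    """Return the addresses from all `nameserver` lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Comment lines (# or ;) and other directives never start with "nameserver"
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def host_nameservers(path: str = "/etc/resolv.conf") -> list:
    """Read `path` and return its nameservers, or [] if it cannot be read."""
    try:
        return parse_nameservers(Path(path).read_text())
    except OSError:
        return []
```

With the network described in this issue, `host_nameservers()` on the host would be expected to include 192.168.0.54.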

conath commented 3 years ago

Sorry, I have built the libslirp you linked to and I don't know how to proceed. Do I need to build QEMU from source with this variant of libslirp? (And how would I do that?)

Thanks.

osy commented 3 years ago

You don’t, just copy paste the code I linked.

HenkPoley commented 3 years ago

The Qemu VLAN (always) has a DNS server at 10.0.2.3?

Or is the host system at 10.0.2.3, once you configure port mapping? (e.g. there is no direct internet access)

osy commented 3 years ago

QEMU slirp dns redirection is defaulted to 10.0.2.3 for the vm’s vlan.
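For reference, QEMU's documented user-mode networking defaults place the host at the 2nd address of the guest network, the virtual DNS at the 3rd, and the start of the DHCP range at the 15th. The sketch below just derives those addresses from the network; `slirp_defaults` is a hypothetical helper, not a QEMU or UTM API.

```python
import ipaddress


def slirp_defaults(net: str = "10.0.2.0/24") -> dict:
    """Derive the default slirp addresses for a guest network."""
    network = ipaddress.ip_network(net)
    base = network.network_address
    return {
        "network": str(network),
        "host": str(base + 2),        # gateway / host as seen by the guest
        "dns": str(base + 3),         # virtual DNS forwarder
        "dhcpstart": str(base + 15),  # first address handed out by DHCP
    }
```

For the default network, this gives 10.0.2.2 for the host, 10.0.2.3 for DNS, and 10.0.2.15 for the first DHCP lease.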

HenkPoley commented 3 years ago

Do I need to change anything in UTM from the default to 'do' slirp ?

nslookup -type=NS example.com 10.0.2.3
;; connection timed out; no servers could be reached

Just tried the direct download of Debian 10.4 from: https://mac.getutm.app/gallery/debian-10-4-xfce

Is it supposed to come not configured for internet access? I mean, should I expect sudo apt update to not be able to access deb.debian.org ?

I notice under VM settings > Network > Show Advanced Settings > DNS Server (and DHCP start) to show as '10.0.2.0.15' (5 dotted decimal?) by default (as example in grey).

osy commented 3 years ago

You should use a dns like 1.1.1.1 or 8.8.4.4. If you need additional help please post in https://github.com/utmapp/UTM/discussions as this is not the right place.

conath commented 3 years ago

It is worth noting that with the upcoming 2.2.0 update, one can switch to Shared or Bridged networking to bypass this issue entirely and get working DNS. Emulated network might still show this issue.

osy commented 3 years ago

@conath can you test the latest commit and see if it resolved the issue?

conath commented 3 years ago

Does not seem to have improved things (using "emulated VLAN" network mode).

Screen Shot 2021-09-29 at 23 31 17

Screen Shot 2021-09-29 at 23 38 33

osy commented 3 years ago

Can you confirm you’re using the sysroot from https://github.com/utmapp/UTM/actions/runs/1276161519 ?

conath commented 3 years ago

I am using the Xcode archive from that run, so I believe the answer is yes.

osy commented 3 years ago

I don’t remember if I’ve asked but has anyone tested with https://gist.github.com/akihikodaki/87df4149e7ca87f18dc56807ec5a1bc5 or raw QEMU command line and if that has the same issue?

lingdocs commented 2 years ago

I'm also having an issue using UTM 2.4.1 with Ubuntu 20.04. I followed the guide in the gallery and everything worked perfectly at first. The issue is that I want to use port forwarding to allow SSHing into the guest, and to enable the port forwarding option I need to change the network mode to "Emulated VLAN". When I switch to Emulated VLAN, DNS stops working. I thought I could just set my DNS to 1.1.1.1, but I've been unable to set the DNS manually, either through the GUI (because no 'wired network' shows up) or with netplan as per all the guides online.

conath commented 2 years ago

Changing it with netplan worked for me, but I had to add the ethernet to the config file myself. Guide I used. You can alternatively change the network to bridged mode and then SSH into the VM using its own IP.

lingdocs commented 2 years ago

Thanks @conath, that helped. I used that guide, but I had to create my own 01-network-manager.yaml file as it wasn't there. Then I had to go into the GUI and fiddle around: I disabled the automatic DNS switch in the IPv4 tab, entered the DNS servers there, and ran sudo netplan apply. After that it started working for me.

PatTheMav commented 2 years ago

@conath I see this issue is still open - I can report that the issue as reported in the OP happens on my iMac with vanilla QEMU as well as Dosbox-Staging (which has added slirp-based networking recently).

As explained, changing the DNS IP directly to my actual router's IP (or 8.8.8.8/1.1.1.1 respectively) enables DNS functionality without issue. It is only the "virtual" DNS provided by slirp (10.0.2.3) which doesn't work.

Alas I haven't been able to isolate the cause of the issue yet.

osy commented 2 years ago

I think it would really help to get some pcap from the guest as well as the host, filtered on port 53, to see what's happening to the packets. I am unable to reproduce it on my end so I have no idea where they're getting dropped. Right now my best guess is that some routers are configured to drop DNS packets they think are faked because they come from outside the router's VLAN.
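One way to collect such a capture is tcpdump with a port 53 filter, run on both the host and the guest. The helper below only assembles the command line; it is a sketch, and the interface name and output file are assumptions that need adjusting per machine (for example en0 on many Macs). `tcpdump_dns_command` is a hypothetical name.

```python
import shlex


def tcpdump_dns_command(interface: str, output_file: str) -> list:
    """Build a tcpdump invocation that saves only port 53 traffic."""
    return [
        "tcpdump",
        "-i", interface,    # capture interface (assumption: chosen by the user)
        "-n",               # do not resolve names, which would add DNS noise
        "-w", output_file,  # write raw packets to a pcap file
        "port", "53",       # capture filter: DNS over UDP or TCP
    ]


def as_shell_line(argv: list) -> str:
    """Render the argument list as a copy-pasteable shell command."""
    return shlex.join(argv)
```

The resulting command typically has to be run with root privileges, for example via sudo, and the pcap file can then be opened in Wireshark.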

PatTheMav commented 2 years ago

@osy good idea. FWIW I just ran a packet capture on the macOS host, and with the default slirp DNS set there were no packets on UDP port 53. Once I changed the IP to my local DNS, packets were captured as expected.

Gabriella439 commented 2 years ago

@osy: I was able to reliably reproduce this by following the instructions from here:

https://github.com/YorikSar/nixos-vm-on-macos

That spins up a NixOS guest on a macOS host which exhibits the same DNS-related failures

osy commented 1 year ago

Wondering if this is still reproducing with the latest version due to many iterations of QEMU updates.

HenkPoley commented 1 year ago

It is not a bug that gets accidentally fixed.

I remember it seemed to be looking up IPv6 addresses (128bit = 16 bytes), with a 4 byte offset (e.g. IPv4, 32 bits). Something with the IPv6 address of the default gateway, or DNS server.

I just didn't find where the bug happened.

People recommended just choosing a network-switch emulator (Shared or Bridged networking) that does not use the slirp library, i.e. one that does not have the bug. This must be discussed somewhere in these issues.

Edit: Ah, I see this ticket is mostly about IPv4. It could still be doing an off-by-one lookup (4 bytes off, similar to the 4 bytes shift I saw with IPv6 #2429).

I can only recommend to dive into the data structure when it's actually operating. I remember I did print debugging of the data it fetched (sadly not the data structure).

basilevs commented 11 months ago

Reproducible on UTM 4.4.4 (93) with a macOS Monterey host in Shared network mode. The problem is intermittent: if you restart the VM enough times, eventually you get automatic DNS in the guest OS.

michaelvanstraten commented 10 months ago

This might be a bit random, but I am stumbling upon the same issue trying to build an alpine image using packer with QEMU as the build backend.