Closed fkorotkov closed 2 years ago
I was not able to ssh between two running VMs. @edigaryev, you have a much better understanding of the network stack, could you please investigate it next week once you have an M1 Pro on hand?
Also check for DHCP wraparound (incrementing IPs).
It seems that the traffic between VMs is indeed blocked (checked with tcpdump on host and both of the VMs), most likely due to the usage of private bridge interfaces:
% ifconfig bridge100
[...]
member: vmenet1 flags=10803<LEARNING,DISCOVER,PRIVATE,CSUM>
ifmaxaddr 0 port 27 priority 0 path cost 0
member: vmenet0 flags=10803<LEARNING,DISCOVER,PRIVATE,CSUM>
ifmaxaddr 0 port 25 priority 0 path cost 0
[...]
(note the PRIVATE flag)
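To check this across several bridges without reading the output by eye, the member lines can be parsed for the PRIVATE flag. This is only a sketch: the `private_members` helper and the regular expression are illustrative assumptions based on the output format shown above, not part of any existing tool.

```python
import re

# Matches lines such as:
#   member: vmenet1 flags=10803<LEARNING,DISCOVER,PRIVATE,CSUM>
MEMBER_RE = re.compile(r"member:\s+(\S+)\s+flags=[0-9a-fA-F]+<([^>]*)>")


def private_members(ifconfig_output: str) -> dict[str, bool]:
    """Map each bridge member to whether it carries the PRIVATE flag."""
    result = {}
    for name, flags in MEMBER_RE.findall(ifconfig_output):
        result[name] = "PRIVATE" in flags.split(",")
    return result


SAMPLE = """\
member: vmenet1 flags=10803<LEARNING,DISCOVER,PRIVATE,CSUM>
ifmaxaddr 0 port 27 priority 0 path cost 0
member: vmenet0 flags=10803<LEARNING,DISCOVER,PRIVATE,CSUM>
ifmaxaddr 0 port 25 priority 0 path cost 0
"""

if __name__ == "__main__":
    for member, is_private in private_members(SAMPLE).items():
        print(member, "PRIVATE" if is_private else "not private")
```

On a real host the input would come from running `ifconfig bridge100`; here the sample is pasted from the output above.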
However, it's still possible to confuse the host with ARP spoofing and capture the traffic destined for another VM. The simplest scenario to reproduce is:
1. Start two VMs, macos1 and macos2
2. In macos2, go to Settings → Network and change the IP to that of macos1
2.1. This will result in macos2 sending an ARP announcement, which will update the host's ARP table
3. Traffic destined for the macos1 VM IP will go to the macos2 VM
3.1. This might be unencrypted traffic, encrypted traffic with source IPs and other metadata, and new SSH connections to the agent
Maybe we can deal with the SSH host identification from the host's side to avoid connecting to the wrong VM, but I'm not sure if we can easily rule out the other risks associated with this.
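As a rough illustration of the SSH host identification idea, the host could pin each VM's public host key and refuse to proceed when the key presented at the VM's IP has a different fingerprint. The sketch below only covers the fingerprint comparison; the function names and the example key blobs are hypothetical, and how the host would obtain the pinned key in the first place is left open.

```python
import base64
import hashlib
import hmac


def ssh_fingerprint(public_key_line: str) -> str:
    """OpenSSH-style SHA256 fingerprint of a 'type base64-blob [comment]' line."""
    key_blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints the digest as unpadded base64.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def is_expected_vm(pinned_key_line: str, presented_key_line: str) -> bool:
    """True only if the presented host key matches the one pinned for this VM."""
    return hmac.compare_digest(
        ssh_fingerprint(pinned_key_line), ssh_fingerprint(presented_key_line)
    )


def fake_key_line(seed: bytes) -> str:
    """Build a placeholder key line; the blob is NOT a real SSH key."""
    return "ssh-ed25519 " + base64.b64encode(seed).decode("ascii") + " placeholder"


MACOS1_KEY = fake_key_line(b"hypothetical host key blob of macos1")
MACOS2_KEY = fake_key_line(b"hypothetical host key blob of macos2")

if __name__ == "__main__":
    # macos2 has taken over macos1's IP, so it presents its own key there.
    print(is_expected_vm(MACOS1_KEY, MACOS1_KEY))
    print(is_expected_vm(MACOS1_KEY, MACOS2_KEY))
```

This would only protect connections initiated by the host; it does nothing for the other traffic listed in 3.1.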
I think a more solid solution would be to provide additional tooling that would manage the bridge interfaces for each VM by itself and use the sticky flag or insert an antispoof PF rule.
Also, regarding the currently lacking "two VMs" possibility, there's an issue with ECID re-use, which is documented in the Virtualization.Framework documentation:
Running two VMs concurrently with the same identifier results in undefined behavior in the guest operating system.
Two VMs running concurrently shouldn’t use the same identifier.
(https://developer.apple.com/documentation/virtualization/vzmacmachineidentifier)
This can be worked around cleanly by releasing a shadow VM for each VM that needs to be run as clones together, but that doesn't guarantee reproducibility, unfortunately.
Also check for DHCP wraparound (incrementing IPs).
I've tried starting a bunch of VMs sequentially from a Golang program and it seems that the internal Virtualization.Framework DHCP-server simply wraps around to 192.168.64.2 after giving out the 192.168.64.254 address.
However, after the wraparound happens the internal DHCP-server breaks somehow and VMs start to assign themselves addresses from the 169.254.0.0/16 range and not from the 192.168.64.0/24 range.
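When running this kind of experiment in bulk, a small check can flag VMs that fell back to a self-assigned address. This is a minimal sketch using only the two ranges mentioned above; the `classify` helper and its labels are made up for illustration.

```python
import ipaddress

# The subnet observed for VMs in this thread.
VM_SUBNET = ipaddress.ip_network("192.168.64.0/24")


def classify(addr: str) -> str:
    """Label an address as DHCP-assigned, self-assigned (link-local) or other."""
    ip = ipaddress.ip_address(addr)
    if ip in VM_SUBNET:
        return "dhcp"
    if ip.is_link_local:  # 169.254.0.0/16 for IPv4
        return "self-assigned"
    return "other"


if __name__ == "__main__":
    for candidate in ("192.168.64.30", "169.254.12.7"):
        print(candidate, classify(candidate))
```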
internal DHCP-server
Turns out that this "internal DHCP-server" is actually a /usr/libexec/dhcp6d (most likely) that writes to /private/var/db/dhcpd_leases, here's an example lease from it:
{
name=adminsVlMachine
ip_address=192.168.64.30
hw_address=1,da:a4:b2:44:c3:3d
identifier=1,da:a4:b2:44:c3:3d
lease=0x6272f83d
}
Removing the file fixes the problem.
Another workaround is to re-use the MAC-addresses to avoid the allocation of new IP-addresses from the pool.
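For inspecting the leases file instead of removing it, the format shown above is simple enough to parse by hand. The sketch below assumes the file consists only of `{ key=value }` blocks like the example, and it assumes the `lease` field is a hexadecimal Unix timestamp; both are inferences from the sample, not documented guarantees.

```python
from datetime import datetime, timezone


def parse_leases(text: str) -> list[dict[str, str]]:
    """Parse '{ key=value ... }' blocks into a list of dictionaries."""
    leases = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "{":
            current = {}
        elif line == "}":
            if current is not None:
                leases.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return leases


def lease_time(lease: dict[str, str]) -> datetime:
    """Interpret the 'lease' field as a hex Unix timestamp (an assumption)."""
    return datetime.fromtimestamp(int(lease["lease"], 16), tz=timezone.utc)


SAMPLE = """\
{
name=adminsVlMachine
ip_address=192.168.64.30
hw_address=1,da:a4:b2:44:c3:3d
identifier=1,da:a4:b2:44:c3:3d
lease=0x6272f83d
}
"""

if __name__ == "__main__":
    for lease in parse_leases(SAMPLE):
        print(lease["ip_address"], lease["hw_address"], lease_time(lease).isoformat())
```

On a real host the text would be read from /private/var/db/dhcpd_leases, which may require elevated permissions.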
However, after the wraparound happens the internal DHCP-server breaks somehow and VMs start to assign themselves addresses from the 169.254.0.0/16 range and not from the 192.168.64.0/24 range.
I've researched a bit more on this topic and it turns out that this is only partially true. The DHCP-server used in macOS is actually bootpd(8) and its sources are publicly available.
The server can actually reclaim the expired IP-addresses from the pool by calling the DHCPLeases_reclaim() function, but it will only do so when there are no more free IPs available (https://github.com/apple-opensource/bootp/blob/bd11fdd07f7b9b581e32d2e19e653101cb42c540/bootpd.tproj/dhcpd.c#L798-L800) and the reclamation will succeed only if there are expired entries that can be evicted from the pool.
So, the issue I've seen in https://github.com/cirruslabs/tart/issues/20#issuecomment-1116732505 was due to the fact that by default the lease interval is a day (86400 seconds) long, but the whole experiment was probably done in under an hour.
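The behaviour described above can be reproduced with a toy model of the pool. This is a simplified simulation, not a port of bootpd: the `LeasePool` class is hypothetical, it assumes every VM shows up with a fresh MAC address, and it only encodes the two facts stated here (reclamation is attempted only when no free IPs remain, and only expired leases can be evicted).

```python
import ipaddress

LEASE_SECONDS = 86400  # default lease interval mentioned above


class LeasePool:
    """Toy DHCP pool that reclaims expired leases only once it runs dry."""

    def __init__(self, first="192.168.64.2", last="192.168.64.254",
                 lease_seconds=LEASE_SECONDS):
        start = int(ipaddress.ip_address(first))
        end = int(ipaddress.ip_address(last))
        self.free = [str(ipaddress.ip_address(i)) for i in range(start, end + 1)]
        self.leases = {}  # ip -> expiry time in seconds
        self.lease_seconds = lease_seconds

    def reclaim(self, now):
        """Return expired leases to the free list; report how many were evicted."""
        expired = [ip for ip, expiry in self.leases.items() if expiry <= now]
        for ip in expired:
            del self.leases[ip]
            self.free.append(ip)
        return len(expired)

    def allocate(self, now):
        """Hand out an address for a new client, or None if nothing is available."""
        if not self.free:
            self.reclaim(now)  # only attempted when the pool is exhausted
        if not self.free:
            return None  # the client would fall back to 169.254.0.0/16
        ip = self.free.pop(0)
        self.leases[ip] = now + self.lease_seconds
        return ip


if __name__ == "__main__":
    pool = LeasePool()
    # Start one VM every 10 seconds until the pool is exhausted.
    handed_out = [pool.allocate(now=i * 10) for i in range(253)]
    print(handed_out[0], handed_out[-1])
    print(pool.allocate(now=253 * 10))         # well under an hour: nothing expired
    print(pool.allocate(now=LEASE_SECONDS + 1))  # a day later: first lease reclaimed
```

In this model, re-using MAC addresses would correspond to not consuming a new pool entry at all, which is why that workaround avoids the exhaustion.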
I think it makes sense to run the checks again and verify nothing has changed in networking in Virtualization.Framework from Xcode 14 Beta (Tart version 0.8.0+).
It seems that by default Virtualization.Framework makes the network shareable. Let's see if we can isolate two VMs running simultaneously so one can't ssh into another.