lima-vm / lima

Linux virtual machines, with a focus on running containers
https://lima-vm.io/
Apache License 2.0

shared network mode not working on Mac M1 #1259

Open annemirasol opened 1 year ago

annemirasol commented 1 year ago

Note (by @AkihiroSuda )

The following commands are reported to fix the issue on some machines:

sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /usr/libexec/bootpd
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --unblock /usr/libexec/bootpd
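Several later comments report having to repeat these two commands after every reboot, so it may be convenient to wrap them in a small shell function. The following is only a sketch: the function name unblock_bootpd and the DRY_RUN variable are made up for illustration and are not part of Lima or macOS.

```shell
#!/bin/sh
# Sketch: re-apply the firewall exception for bootpd.
# unblock_bootpd and DRY_RUN are illustrative names, not part of Lima.

FW=/usr/libexec/ApplicationFirewall/socketfilterfw
BOOTPD=/usr/libexec/bootpd

unblock_bootpd() {
    # Run --add first, then --unblock, matching the order given in the note.
    for action in --add --unblock; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print what would be run.
            echo "sudo $FW $action $BOOTPD"
        else
            sudo "$FW" "$action" "$BOOTPD"
        fi
    done
}
```

Setting DRY_RUN=1 before calling unblock_bootpd prints the two commands without running them; calling it without DRY_RUN runs them through sudo.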

Description

Environment

macOS Monterey version 12.6.2
Apple M1 Pro chip
limactl version 0.14.1

What I expected

Guest IP address of VM created with limactl start --name=default template://vmnet should be accessible (i.e. can be pinged) from host.

What actually happened

No accessible IP address.

❯ limactl list
NAME       STATUS     SSH                VMTYPE    ARCH       CPUS    MEMORY    DISK      DIR
default    Running    127.0.0.1:60022    qemu      aarch64    4       4GiB      100GiB    ~/.lima/default
❯ lima
am@lima-default:/Users/am$ sudo apt install net-tools
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
net-tools is already the newest version (1.60+git20181103.0eebece-1ubuntu5).
0 upgraded, 0 newly installed, 0 to remove and 18 not upgraded.
am@lima-default:/Users/am$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.5.15  netmask 255.255.255.0  broadcast 192.168.5.255
        inet6 fec0::5055:55ff:fe72:7117  prefixlen 64  scopeid 0x40<site>
        inet6 fe80::5055:55ff:fe72:7117  prefixlen 64  scopeid 0x20<link>
        ether 52:55:55:72:71:17  txqueuelen 1000  (Ethernet)
        RX packets 16673  bytes 23072545 (23.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1307  bytes 146839 (146.8 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lima0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fdc4:fde4:aceb:14bd:5055:55ff:fee7:9ac4  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::5055:55ff:fee7:9ac4  prefixlen 64  scopeid 0x20<link>
        ether 52:55:55:e7:9a:c4  txqueuelen 1000  (Ethernet)
        RX packets 38  bytes 5883 (5.8 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 65  bytes 10104 (10.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 174  bytes 15527 (15.5 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 174  bytes 15527 (15.5 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Some notes

AkihiroSuda commented 1 year ago

DHCP daemon might not be working? (sudo /usr/libexec/bootpd on the host)

annemirasol commented 1 year ago

Great lead, thank you so much!

Running sudo /usr/libexec/bootpd, then limactl stop default && limactl delete default && limactl start --name=default template://vmnet, still gives me no IP address. (I also tried running bootpd with the Enable DHCP service flag -D.)

With the bootpd lead, I found this issue for a different piece of software (https://github.com/canonical/multipass/issues/2387), where the suggested workaround was to disable the firewall. It's not ideal, but it also works for this case. I'm still trying to figure out how to make the allowlist work, but for now, it looks like the firewall needs to be off when starting the instance.

Please feel free to close this issue, as I'm not sure if anything can be done about this on lima's side.

annemirasol commented 1 year ago

Some details, in case helpful:

With firewall off:

~
❯ ps ax | grep bootpd
 8852 s000  S+     0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox bootpd
~
❯ limactl stop -f default && limactl delete -f default && limactl start --name=default template://vmnet
...
~
❯ ps ax | grep bootpd
 8899   ??  Ss     0:00.03 /usr/libexec/bootpd
 8943 s000  S+     0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox bootpd

With firewall on:

~ 
❯ ps ax | grep bootpd
 9055 s000  R+     0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox bootpd
~
❯ limactl stop -f default && limactl delete -f default && limactl start --name=default template://vmnet
...
~
❯ ps ax | grep bootpd
 9202 s000  S+     0:00.00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox bootpd

With firewall on, bootpd somehow gets killed.

I ran this same process on an Intel Mac, on macOS Ventura 13.1. bootpd stays alive whether or not firewall is on.
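In the listings above, the grep process itself always matches, which makes the output easy to misread. A minimal sketch of a check that ignores it, assuming the five-column `ps ax` layout shown above (PID, TT, STAT, TIME, COMMAND); the function name bootpd_in_ps is hypothetical:

```shell
#!/bin/sh
# Sketch: succeed only if a process whose command is exactly
# /usr/libexec/bootpd appears in `ps ax` output read from stdin.
# bootpd_in_ps is an illustrative name.

bootpd_in_ps() {
    # Field 5 is the start of the COMMAND column in `ps ax` output,
    # so the "grep ... bootpd" line does not match.
    awk '$5 == "/usr/libexec/bootpd" { found = 1 } END { exit !found }'
}
```

It could be used as `ps ax | bootpd_in_ps && echo "bootpd running" || echo "bootpd not running"`. `pgrep -x bootpd` should give a similar answer without the parsing.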

AkihiroSuda commented 1 year ago

Apple's firewall or a third-party firewall product?

The shared network works for me with Apple's firewall on Intel macOS 13

annemirasol commented 1 year ago

Apple's firewall or a third-party firewall product?

Apple's built-in firewall.

The shared network works for me with Apple's firewall on Intel macOS 13

Same here, no problems on my other machine that is an Intel macOS 13.

pecigonzalo commented 1 year ago

FWIW, I do believe the problem lies in bootp as I see the packets in tcpdump but no reply.

17:43:40.833473 IP 0.0.0.0.bootpc > broadcasthost.bootps: BOOTP/DHCP, Request from 52:55:55:a5:c0:bb (oui Unknown), length 300
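To make "requests but no reply" easier to spot in a longer capture, the tcpdump text can be tallied. This is a rough sketch: the function name dhcp_tally is hypothetical, the Request pattern is taken from the line above, and the Reply pattern is an assumption about how tcpdump prints server answers (this thread only shows a Request line).

```shell
#!/bin/sh
# Sketch: count DHCP requests and replies in tcpdump text read from stdin.
# dhcp_tally is an illustrative name; the "Reply" wording is assumed.

dhcp_tally() {
    awk '
        /BOOTP\/DHCP, Request/ { req++ }
        /BOOTP\/DHCP, Reply/   { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

For example, `sudo tcpdump -l -n -i bridge100 -c 20 'udp port 67 or udp port 68' | dhcp_tally` would summarize the first 20 DHCP packets seen on bridge100 (the capture waits until that many packets arrive). A result with requests but zero replies matches the symptom described here.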

I then tried isc-dhcp with sudo /opt/homebrew/opt/isc-dhcp/sbin/dhcpd -f bridge100 and a config like

❯ cat /opt/homebrew/etc/dhcpd.conf
# dhcpd.conf
#

default-lease-time 600;
max-lease-time 7200;

# Use this to enable / disable dynamic dns updates globally.
#ddns-update-style none;

# If this DHCP server is the official DHCP server for the local
# network, the authoritative directive should be uncommented.
authoritative;

# Use this to send dhcp log messages to a different log file (you also
# have to hack syslog.conf to complete the redirection).
#log-facility local7;

# My First Subnet
subnet 192.168.106.0 netmask 255.255.255.0 {
  range 192.168.106.10 192.168.106.200;
  option domain-name-servers 1.1.1.1, 1.0.0.1;
  option routers 192.168.106.1;
  option broadcast-address 192.168.106.255;

  # Let's let folks keep their IP's for a while
  default-lease-time 6000;
  max-lease-time 72000;
}

Which successfully gave me an IP on the interface. So while we do have an error like this one (which might be related), I believe the core issue is bootp not working, either due to the firewall or something else. Unfortunately, I could not find any logs for it.

pecigonzalo commented 1 year ago

I ran multiple tests on my Mac (disable firewall, disable this, disable that, etc.) and I believe bootp is at fault here. Even when run manually it shows nothing: no log, no output at all. I suspect it's just not working as a DHCP server. If anyone knows how to validate that it does work, that would be great.

aelsnz commented 1 year ago

I just tested the same in my setup and agree, it looks like bootp is at fault. If I create a separate DHCP server with isc-dhcp, similar to what you did above, I get the same result: a DHCP address is obtained for my underlying QEMU setup.

I was getting this in Colima, which seems to be related to the issue discussed here:

FATA[0071] error starting kubernetes: error running [lima kubectl cluster-info], output: "The connection to the server localhost:8080 was refused - did you specify the right host or port?", err: "exit status 1"

The issue is that one network interface does not get a DHCP Address.

I updated to the latest macOS 13.3 and started getting this issue. Disabling IPv6 worked temporarily, but after updating to 13.3.1 it stopped working again, which made me dig into it a bit more. I spent time looking at socket_vmnet and all seems fine, but a DHCP address does not get assigned.

Steps followed:

Step 1: Start colima:

colima -p abc start -k --network-address -c4 -m6 -d10

It fails again with the same error:

FATA[0071] error starting kubernetes: error running [lima kubectl cluster-info], output: "The connection to the server localhost:8080 was refused - did you specify the right host or port?", err: "exit status 1"

Step 2: Stop the following:

/bin/launchctl unload -w /System/Library/LaunchDaemons/bootps.plist
/bin/launchctl unload -w /System/Library/LaunchDaemons/com.apple.dhcp6d.plist

Step 3: Start the isc-dhcp service for bridge100. I used the "-d" flag to get more output. Note: install with brew install isc-dhcp and create a DHCP config similar to the one mentioned earlier in this thread.

sudo /opt/homebrew/opt/isc-dhcp/sbin/dhcpd -d -f bridge100

Step 4: SSH into the colima host and obtain an address for col0

$ colima ssh
colima:/Users/aelsnz$ sudo su -
colima:~# ifconfig col0
col0      Link encap:Ethernet  HWaddr 52:55:55:A2:EA:9D
          inet6 addr: fde2:fcb5:a448:d160:5055:55ff:fea2:ea9d/64 Scope:Global
          inet6 addr: fe80::5055:55ff:fea2:ea9d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17 errors:0 dropped:0 overruns:0 frame:0
          TX packets:39 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2646 (2.5 KiB)  TX bytes:9754 (9.5 KiB)

colima:~# udhcpc -i col0
udhcpc: started, v1.35.0
udhcpc: broadcasting discover
udhcpc: broadcasting select for 192.168.106.3, server 192.168.106.1
udhcpc: lease of 192.168.106.3 obtained from 192.168.106.1, lease time 5540

IP Address now gets assigned.

I can also now create new colima environments and they work out of the box while the isc-dhcp service is running.

Does anyone know how we can troubleshoot or figure out why bootp is not providing DHCP to bridge100? Is IPv6 maybe an issue here? As said, I have seen that setting IPv6 to Link-Local Only does help resolve it after a reboot.

aelsnz commented 1 year ago

Additional update:

When I adjust the Mac network IPv6 settings (I am using a LAN connection, not Wi-Fi) and set "Configure IPv6" under TCP/IP to "Link-Local Only", then reboot (in my case I had to reboot twice, which I guess does not make sense), it seems as if the bootpd process starts. Previously I just got launchd (one process).

Does anyone know how we can debug this or make sure bootpd starts properly for IPv4 DHCP?

Below we can see what is using DHCP port 67:

sudo lsof -nP -i4UDP:67

COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
launchd     1 root   46u  IPv4 0x9b76825c118326e7      0t0  UDP *:67
launchd     1 root   48u  IPv4 0x9b76825c118326e7      0t0  UDP *:67
launchd     1 root   49u  IPv4 0x9b76825c118326e7      0t0  UDP *:67
launchd     1 root   50u  IPv4 0x9b76825c118326e7      0t0  UDP *:67
bootpd  11431 root    0u  IPv4 0x9b76825c118326e7      0t0  UDP *:67
bootpd  11431 root    1u  IPv4 0x9b76825c118326e7      0t0  UDP *:67
bootpd  11431 root    2u  IPv4 0x9b76825c118326e7      0t0  UDP *:67
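The repeated file-descriptor rows can be collapsed to one line per process, which makes it quicker to see whether bootpd is among the holders of UDP port 67 or only launchd is. A small sketch, assuming the lsof column layout shown above; the function name port67_holders is hypothetical:

```shell
#!/bin/sh
# Sketch: reduce lsof output (read from stdin) to distinct "COMMAND PID" pairs.
# port67_holders is an illustrative name.

port67_holders() {
    # Skip the header line, then print each COMMAND/PID pair only once.
    awk 'NR > 1 && !seen[$1 " " $2]++ { print $1, $2 }'
}
```

Used as `sudo lsof -nP -i4UDP:67 | port67_holders`, the output above would reduce to one line for launchd and one for bootpd. If only launchd is listed, that would match the earlier state described here, where bootpd had not started.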

Before the IPv6 changes, I tried a few times to start bootpd manually. It looked as if it started, but then nothing happened. I used the command below (after unloading /System/Library/LaunchDaemons/bootps.plist):

sudo /usr/libexec/bootpd -D -d 

I even tried:

sudo /usr/libexec/bootpd -D -d -i bridge100

I now know of a few people who have updated to the latest macOS 13.x and are seeing the issues discussed here.

To add: you can use the command below to see what happens in bootpd. When things work, you can see the DHCP requests; otherwise there is nothing.

sudo log stream --process bootpd --info --debug

Some messages from a working setup, when I start an environment that requests DHCP via socket_vmnet / bridge100:

$ sudo log stream --process bootpd --info --debug
...
...
...
2023-04-15 10:07:23.040348+1200 0x22899    Info        0x0                  13018  0    bootpd: interface <private>: ip <private> mask <private>
2023-04-15 10:07:23.040371+1200 0x22899    Info        0x0                  13018  0    bootpd: interface <private>: ip <private> mask <private>
2023-04-15 10:07:23.040837+1200 0x22899    Info        0x0                  13018  0    bootpd: use_open_directory is FALSE
2023-04-15 10:07:23.041370+1200 0x22899    Info        0x0                  13018  0    bootpd: DHCP DISCOVER [<private>]: <private> <<private>>
2023-04-15 10:07:23.041400+1200 0x22899    Debug       0x0                  13018  0    bootpd: default domain name added
2023-04-15 10:07:23.041421+1200 0x22899    Debug       0x0                  13018  0    bootpd: replying to <private>
2023-04-15 10:07:23.041747+1200 0x22899    Info        0x0                  13018  0    bootpd: OFFER sent <private> <private> pktsize 300
2023-04-15 10:07:23.077748+1200 0x2289a    Info        0x0                  13018  0    bootpd: DHCP REQUEST [<private>]: <private> <<private>>
2023-04-15 10:07:23.078603+1200 0x2289a    Debug       0x0                  13018  0    bootpd: default domain name added
2023-04-15 10:07:23.078634+1200 0x2289a    Debug       0x0                  13018  0    bootpd: replying to <private>
2023-04-15 10:07:23.078864+1200 0x2289a    Info        0x0                  13018  0    bootpd: ACK sent <private> <private> pktsize 300
2023-04-15 10:07:25.353654+1200 0x2289a    Info        0x0                  13018  0    bootpd: DHCP RELEASE [<private>]: <private>
2023-04-15 10:07:28.467289+1200 0x2289a    Info        0x0                  13018  0    bootpd: DHCP DISCOVER [<private>]: <private> <<private>>
2023-04-15 10:07:28.467444+1200 0x2289a    Debug       0x0                  13018  0    bootpd: default domain name added
2023-04-15 10:07:28.467498+1200 0x2289a    Debug       0x0                  13018  0    bootpd: replying to <private>
2023-04-15 10:07:28.470553+1200 0x2289a    Info        0x0                  13018  0    bootpd: OFFER sent <private> <private> pktsize 300
2023-04-15 10:07:28.568087+1200 0x2289a    Info        0x0                  13018  0    bootpd: DHCP REQUEST [<private>]: <private> <<private>>
2023-04-15 10:07:28.579613+1200 0x2289a    Debug       0x0                  13018  0    bootpd: default domain name added
2023-04-15 10:07:28.579838+1200 0x2289a    Debug       0x0                  13018  0    bootpd: replying to <private>
2023-04-15 10:07:28.580281+1200 0x2289a    Info        0x0                  13018  0    bootpd: ACK sent <private> <private> pktsize 300
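A healthy exchange in this log is a DISCOVER, an OFFER, a REQUEST, and an ACK. The following sketch counts those four message types in bootpd log text, assuming the message wording shown above; the function name bootpd_log_summary is hypothetical.

```shell
#!/bin/sh
# Sketch: count DHCP handshake messages in bootpd log text read from stdin.
# bootpd_log_summary is an illustrative name; the patterns follow the
# log lines quoted in this thread.

bootpd_log_summary() {
    awk '
        /bootpd: DHCP DISCOVER/ { d++ }
        /bootpd: OFFER sent/    { o++ }
        /bootpd: DHCP REQUEST/  { r++ }
        /bootpd: ACK sent/      { a++ }
        END { printf "discover=%d offer=%d request=%d ack=%d\n", d, o, r, a }
    '
}
```

Because `log stream` never exits, a bounded query is easier to pipe, for example `sudo log show --last 10m --info --debug --predicate 'process == "bootpd"' | bootpd_log_summary`. All zeros after starting a VM would suggest the requests never reach bootpd.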

AravindGopala commented 1 year ago

@aelsnz thank you for the debug commands. Did you find any permanent resolution for this? Your steps did help: I set IPv6 to Link-Local Only and it seems to work after a reboot.

aelsnz commented 1 year ago

@AravindGopala unfortunately the issue seems to be that bootpd does not provide the DHCP address.
This might also relate to - https://github.com/canonical/multipass/issues/2387

As a workaround in colima (latest 0.5.5), you can now pass in the environment variable COLIMA_IP to set a fixed IP in the 192.168.106.0/24 subnet (ideally above 200). It is at least an option to get past this if the IPv6 change does not work for you. I know that for some it does not, and the only option then is to use this environment variable to get a fixed IP. Hope this helps.

colima start -c 1 -d 10 -m 2 --network-address --env COLIMA_IP=192.168.106.201

INFO[0000] starting colima
INFO[0000] runtime: docker
INFO[0000] preparing network ...                         context=vm
INFO[0001] creating and starting ...                     context=vm
INFO[0032] provisioning ...                              context=docker
INFO[0032] starting ...                                  context=docker
INFO[0038] done

$ colima list
PROFILE    STATUS     ARCH       CPUS    MEMORY    DISK     RUNTIME    ADDRESS
default    Running    aarch64    1       2GiB      10GiB    docker     192.168.106.201

AravindGopala commented 1 year ago

@aelsnz Finally this worked for me. I have to run the command below once after every boot; it looks like bootp is being blocked by the firewall.

sudo /usr/libexec/ApplicationFirewall/socketfilterfw --unblock /usr/libexec/bootpd

akudiyar commented 1 year ago

For others who see the above command fail with "The application is not part of the firewall", first execute:

sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /usr/libexec/bootpd

mprimeaux commented 1 year ago

After a reboot, I run the following:

sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /usr/libexec/bootpd
sudo /usr/libexec/ApplicationFirewall/socketfilterfw --unblock /usr/libexec/bootpd

Please refer to socket_vmnet issue 18.

AkihiroSuda commented 1 year ago

Is this still an issue with macOS 14?

mprimeaux commented 1 year ago

Seems the answer is yes.

mprimeaux@lima-default:/Users/mprimeaux/source$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.5.15  netmask 255.255.255.0  broadcast 192.168.5.255
        inet6 fec0::5055:55ff:fef9:d374  prefixlen 64  scopeid 0x40<site>
        inet6 fe80::5055:55ff:fef9:d374  prefixlen 64  scopeid 0x20<link>
        ether 52:55:55:f9:d3:74  txqueuelen 1000  (Ethernet)
        RX packets 3723886  bytes 4536204731 (4.5 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 730753  bytes 426101191 (426.1 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 947  bytes 82721 (82.7 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 947  bytes 82721 (82.7 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

nerdctl0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 10.4.0.1  netmask 255.255.255.0  broadcast 10.4.0.255
        inet6 fe80::8865:18ff:fe2d:e4d5  prefixlen 64  scopeid 0x20<link>
        ether 8a:65:18:2d:e4:d5  txqueuelen 1000  (Ethernet)
        RX packets 19  bytes 1344 (1.3 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 16  bytes 1680 (1.6 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

mprimeaux@lima-default:/Users/mprimeaux/source$ exit
logout

❯ ping 192.168.5.15
PING 192.168.5.15 (192.168.5.15): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
^C
--- 192.168.5.15 ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss

This is on macOS 14.0 on an M2 Ultra. I see this open issue is on the M1 but the fact I'm on an M2 is unlikely a factor. Turning the macOS firewall off has no effect on ping absorption and stealth mode is not enabled.

AkihiroSuda commented 1 year ago

Thanks

I see this open issue is on the M1 but the fact I'm on an M2 is unlikely a factor.

This issue might still be specific to ARM Macs? I can never reproduce the issue, maybe because I’m using Intel?

AkihiroSuda commented 1 year ago

ping 192.168.5.15

This is invalid anyway. This IP is never reachable from the host.

Please make sure you have “lima:shared” network in the YAML. If it is working properly, you’ll get an IP like 192.168.105.2 associated with “lima0” interface.
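To check this without reading the whole ifconfig listing, the IPv4 address of a single interface can be pulled out of the output. This is a sketch that assumes the net-tools ifconfig format shown in this thread (stanza header like "lima0: flags=..." followed by indented "inet ADDRESS" lines); the function name iface_ipv4 is hypothetical.

```shell
#!/bin/sh
# Sketch: print the IPv4 address(es) of one interface from net-tools
# ifconfig output read from stdin; exit non-zero if there is none.
# iface_ipv4 is an illustrative name.

iface_ipv4() {
    # $1: interface name, for example lima0.
    awk -v ifname="$1" '
        # An unindented line starts a new interface stanza ("lima0: flags=...").
        /^[^ \t]/ { current = $1; sub(/:$/, "", current) }
        # Inside the wanted stanza, "inet" (not "inet6") carries the IPv4 address.
        current == ifname && $1 == "inet" { print $2; found = 1 }
        END { exit !found }
    '
}
```

For example, `limactl shell default ifconfig | iface_ipv4 lima0` would print something like 192.168.105.2 when the shared network is working, and print nothing (with a non-zero exit status) when lima0 has only IPv6 addresses, as in the original report.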

mprimeaux commented 1 year ago

I can't really comment on this issue since I don't really have a need to ping the NAT'd address. I just access the loopback on a specific port if needed.

This issue might still be specific to ARM Macs? I can never reproduce the issue, maybe because I’m using Intel?

Now as to this point, there most definitely are behavioral differences with Lima on Intel versus ARM. As an example, Lima freezes indefinitely at times when I build container images. The only way to recover is limactl stop --force && limactl rm default && limactl start --name=default --tty=false. This happens once or so every few days on my ARM machines but has never occurred on Intel.

mprimeaux commented 1 year ago

ping 192.168.5.15

This is invalid anyway. This IP is never reachable from the host.

Please make sure you have “lima:shared” network in the YAML. If it is working properly, you’ll get an IP like 192.168.105.2 associated with “lima0” interface.

Ah. Right. Good catch. Let me try that now.

mprimeaux commented 1 year ago

I started with limactl start --name=default template://vmnet and do see an address resembling 192.168.105.x.

mprimeaux@lima-default:/Users/mprimeaux/source$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.5.15  netmask 255.255.255.0  broadcast 192.168.5.255
        inet6 fec0::5055:55ff:fef9:d374  prefixlen 64  scopeid 0x40<site>
        inet6 fe80::5055:55ff:fef9:d374  prefixlen 64  scopeid 0x20<link>
        ether 52:55:55:f9:d3:74  txqueuelen 1000  (Ethernet)
        RX packets 20613  bytes 25957347 (25.9 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3422  bytes 297873 (297.8 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lima0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.105.116  netmask 255.255.255.0  broadcast 192.168.105.255
        inet6 fd05:f750:89b5:cf4b:5055:55ff:fe35:e8c3  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::5055:55ff:fe35:e8c3  prefixlen 64  scopeid 0x20<link>
        ether 52:55:55:35:e8:c3  txqueuelen 1000  (Ethernet)
        RX packets 84  bytes 12799 (12.7 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 92  bytes 8200 (8.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 236  bytes 22474 (22.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 236  bytes 22474 (22.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

...and I am able to ping it from my host machine...

❯ ping  192.168.105.116
PING 192.168.105.116 (192.168.105.116): 56 data bytes
64 bytes from 192.168.105.116: icmp_seq=0 ttl=64 time=0.282 ms
64 bytes from 192.168.105.116: icmp_seq=1 ttl=64 time=0.271 ms
64 bytes from 192.168.105.116: icmp_seq=2 ttl=64 time=0.247 ms
^C
--- 192.168.105.116 ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.247/0.267/0.282/0.015 ms

AkihiroSuda commented 1 year ago

Thanks, so the issue seems resolved with the recent macOS? The socketfilterfw workaround is no longer needed?

mprimeaux commented 1 year ago

To test that hypothesis, I'll need to fully reboot and run a few tests. Let me try that next.

I also no longer have a machine with an M1; only M2 machines and an Intel-based Mac in my home lab, so I'm unsure how conclusive the test results will be, but they will at least provide a data point.

mprimeaux commented 1 year ago

Actually, I do have an M1 machine. I'll test on both.

mprimeaux commented 1 year ago

Unfortunately, the socketfilterfw workaround is still required.

😄  minikube v1.31.2 on Darwin 14.0 (arm64)
✨  Using the qemu2 driver based on user configuration
🌐  Automatically selected the socket_vmnet network
👍  Starting control plane node minikube in cluster minikube
🔥  Creating qemu2 VM (CPUs=6, Memory=32768MB, Disk=81920MB) ...
🔑  Your firewall is blocking bootpd which is required for socket_vmnet. The following commands will be executed to unblock bootpd:

    $ sudo /usr/libexec/ApplicationFirewall/socketfilterfw --add /usr/libexec/bootpd
    $ sudo /usr/libexec/ApplicationFirewall/socketfilterfw --unblock /usr/libexec/bootpd

Password:
🔄  Successfully unblocked bootpd process from firewall, retrying
🔥  Deleting "minikube" in qemu2 ...
🤦  StartHost failed, but will try again: creating host: create: creating: ip not found: failed to get IP address: could not find an IP address for a6:8:57:8b:65:a8
🔥  Creating qemu2 VM (CPUs=6, Memory=32768MB, Disk=81920MB) ...
📦  Preparing Kubernetes v1.27.4 on containerd 1.7.2 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🔎  Verifying Kubernetes components...
🌟  Enabled addons: default-storageclass, storage-provisioner
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
💡  metrics-server is an addon maintained by Kubernetes. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
    ▪ Using image registry.k8s.io/metrics-server/metrics-server:v0.6.4
🌟  The 'metrics-server' addon is enabled
💡  registry is an addon maintained by minikube. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
    ▪ Using image gcr.io/k8s-minikube/kube-registry-proxy:0.0.5
    ▪ Using image docker.io/registry:2.8.1
🔎  Verifying registry addon...
🌟  The 'registry' addon is enabled

Same failure on both the M1 and the M2 machines, which are running macOS 14.0 (23A344).

AkihiroSuda commented 1 year ago

This doesn't seem to be Lima, though

mprimeaux commented 1 year ago

I was a bit overzealous :) My apologies.

For lima, specifically, I started with limactl start --name=default template://vmnet.

? Creating an instance "default" Proceed with the current configuration
INFO[0002] Starting socket_vmnet daemon for "shared" network
INFO[0002] QEMU binary "/opt/homebrew/bin/qemu-system-aarch64" seems properly signed with the "com.apple.security.hypervisor" entitlement
INFO[0002] Attempting to download the image              arch=aarch64 digest="sha256:af62ca6ba307388f7e0a8ad1c46103e6aea0130a09122e818df8d711637bf998" location="https://cloud-images.ubuntu.com/releases/23.04/release-20230810/ubuntu-23.04-server-cloudimg-arm64.img"
INFO[0002] Using cache "/Users/mprimeaux/Library/Caches/lima/download/by-url-sha256/5a75d2d43280fdaaa39f811921fa8e5906da4945667788f75460b6fc69dbf90d/data"
INFO[0003] Attempting to download the nerdctl archive    arch=aarch64 digest="sha256:32a2537e0a80e1493b5934ca56c3e237466606a1b720aef23b9c0a7fc3303bdb" location="https://github.com/containerd/nerdctl/releases/download/v1.5.0/nerdctl-full-1.5.0-linux-arm64.tar.gz"
INFO[0003] Using cache "/Users/mprimeaux/Library/Caches/lima/download/by-url-sha256/2e9505df478bbb8427823380c3ab4ef36836e7c2c9317f6c885d39be36546b19/data"
INFO[0004] [hostagent] Starting QEMU (hint: to watch the boot progress, see "/Users/mprimeaux/.lima/default/serial*.log")
INFO[0004] SSH Local Port: 60022
INFO[0004] [hostagent] Waiting for the essential requirement 1 of 5: "ssh"
INFO[0025] [hostagent] The essential requirement 1 of 5 is satisfied
INFO[0025] [hostagent] Waiting for the essential requirement 2 of 5: "user session is ready for ssh"
INFO[0025] [hostagent] The essential requirement 2 of 5 is satisfied
INFO[0025] [hostagent] Waiting for the essential requirement 3 of 5: "sshfs binary to be installed"
INFO[0034] [hostagent] The essential requirement 3 of 5 is satisfied
INFO[0034] [hostagent] Waiting for the essential requirement 4 of 5: "/etc/fuse.conf (/etc/fuse3.conf) to contain \"user_allow_other\""
INFO[0037] [hostagent] The essential requirement 4 of 5 is satisfied
INFO[0037] [hostagent] Waiting for the essential requirement 5 of 5: "the guest agent to be running"
INFO[0038] [hostagent] The essential requirement 5 of 5 is satisfied
INFO[0038] [hostagent] Mounting "/Users/mprimeaux" on "/Users/mprimeaux"
INFO[0038] [hostagent] Mounting "/tmp/lima" on "/tmp/lima"
INFO[0038] [hostagent] Waiting for the optional requirement 1 of 2: "systemd must be available"
INFO[0038] [hostagent] Forwarding "/run/lima-guestagent.sock" (guest) to "/Users/mprimeaux/.lima/default/ga.sock" (host)
INFO[0038] [hostagent] The optional requirement 1 of 2 is satisfied
INFO[0038] [hostagent] Waiting for the optional requirement 2 of 2: "containerd binaries to be installed"
INFO[0038] [hostagent] Not forwarding TCP 127.0.0.53:53
INFO[0038] [hostagent] Not forwarding TCP 127.0.0.54:53
INFO[0038] [hostagent] Not forwarding TCP [::]:22
INFO[0044] [hostagent] The optional requirement 2 of 2 is satisfied
INFO[0044] [hostagent] Waiting for the final requirement 1 of 1: "boot scripts must have finished"
INFO[0053] [hostagent] The final requirement 1 of 1 is satisfied
INFO[0053] READY. Run `lima` to open the shell.

However, I don't see an address resembling 192.168.105.x from ifconfig in the lima VM

mprimeaux@lima-default:/Users/mprimeaux$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.5.15  netmask 255.255.255.0  broadcast 192.168.5.255
        inet6 fe80::5055:55ff:fef9:d374  prefixlen 64  scopeid 0x20<link>
        inet6 fec0::5055:55ff:fef9:d374  prefixlen 64  scopeid 0x40<site>
        ether 52:55:55:f9:d3:74  txqueuelen 1000  (Ethernet)
        RX packets 20888  bytes 25972780 (25.9 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3705  bytes 345293 (345.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lima0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fd05:f750:89b5:cf4b:5055:55ff:fe35:e8c3  prefixlen 64  scopeid 0x0<global>
        inet6 fe80::5055:55ff:fe35:e8c3  prefixlen 64  scopeid 0x20<link>
        ether 52:55:55:35:e8:c3  txqueuelen 1000  (Ethernet)
        RX packets 46  bytes 6940 (6.9 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 53  bytes 7421 (7.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 180  bytes 16116 (16.1 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 180  bytes 16116 (16.1 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Same behavior on both the M1 and M2.

mprimeaux commented 1 year ago

...applying the socketfilterfw workaround does resolve the issue:

mprimeaux@lima-default:/Users/mprimeaux/source/go-scriptures/platform$ ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:55:55:f9:d3:74 brd ff:ff:ff:ff:ff:ff
    altname enp0s2
    inet 192.168.5.15/24 metric 100 brd 192.168.5.255 scope global dynamic eth0
       valid_lft 86357sec preferred_lft 86357sec
    inet6 fec0::5055:55ff:fef9:d374/64 scope site dynamic mngtmpaddr noprefixroute
       valid_lft 86359sec preferred_lft 14359sec
    inet6 fe80::5055:55ff:fef9:d374/64 scope link
       valid_lft forever preferred_lft forever
3: lima0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 52:55:55:35:e8:c3 brd ff:ff:ff:ff:ff:ff
    altname enp0s3
    inet 192.168.105.116/24 metric 100 brd 192.168.105.255 scope global dynamic lima0
       valid_lft 86357sec preferred_lft 86357sec
    inet6 fd05:f750:89b5:cf4b:5055:55ff:fe35:e8c3/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 2591960sec preferred_lft 604760sec
    inet6 fe80::5055:55ff:fe35:e8c3/64 scope link
       valid_lft forever preferred_lft forever

Same behavior on both the M1 and M2. Basically, upon reboot, the workaround seems to still be required.

AkihiroSuda commented 1 year ago

Thank you, looks like socketfilterfw is still needed for your machine.

A weird thing is that the socketfilterfw command doesn't seem to work at all for my Intel Mac with macOS 14.0:

$ sudo /usr/libexec/ApplicationFirewall/socketfilterfw  --add /usr/libexec/bootpd 
The file path you specified does not exist

$ file /usr/libexec/bootpd 
/usr/libexec/bootpd: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
/usr/libexec/bootpd (for architecture x86_64):  Mach-O 64-bit executable x86_64
/usr/libexec/bootpd (for architecture arm64e):  Mach-O 64-bit executable arm64e

AkihiroSuda commented 1 year ago

A rumor is that socketfilterfw doesn't work on Japanese macOS: https://gist.github.com/techraf/ef5a6aae636f52eec09b?permalink_comment_id=2974356#gistcomment-2974356

A customer has the same effect "The file path you specified does not exist", in all cases, with or without sudo, but only when macOS is in Japanese.

mprimeaux commented 1 year ago

The gist is interesting and caused me to run the same socketfilterfw workaround on my Intel-based Mac with macOS 14; socketfilterfw is indeed present on the US English version of the OS.

file /usr/libexec/ApplicationFirewall/socketfilterfw
/usr/libexec/ApplicationFirewall/socketfilterfw: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
/usr/libexec/ApplicationFirewall/socketfilterfw (for architecture x86_64):  Mach-O 64-bit executable x86_64
/usr/libexec/ApplicationFirewall/socketfilterfw (for architecture arm64e):  Mach-O 64-bit executable arm64e

file /usr/libexec/bootpd
/usr/libexec/bootpd: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
/usr/libexec/bootpd (for architecture x86_64):  Mach-O 64-bit executable x86_64
/usr/libexec/bootpd (for architecture arm64e):  Mach-O 64-bit executable arm64e

I had the pleasure of living in Tokyo for a few years and still keep in contact with my friends there; I'm sure they will be interested in the gist. Thanks for sharing!

Of course, the big question is "why does socketfilterfw not exist on the Japanese macOS?".

mprimeaux commented 1 year ago

It would be interesting to know if socketfilterfw exists on an M1 or M2 with Japanese macOS.

AkihiroSuda commented 1 year ago

/usr/libexec/ApplicationFirewall/socketfilterfw exists, and /usr/libexec/bootpd exists too, but /usr/libexec/ApplicationFirewall/socketfilterfw thinks that /usr/libexec/bootpd does not exist 🤔

mprimeaux commented 1 year ago

Ah. I thought the exception "The file path you specified does not exist" was from file not finding /usr/libexec/ApplicationFirewall/socketfilterfw. 🤔 is right

madalinignisca commented 5 months ago

I have the same problem on Intel mac. Using macOS 14.5 and lima 0.22.0.

Neither lima:shared nor vzNAT gets an IP on the second interface.

Which service is expected to provide DHCP, and is there maybe some one-time setup that must be done before creating VMs?

I'm posting this comment about an hour after trying Lima for the first time. Apart from not getting an IP, after following everything in the docs, everything else worked as expected.

Inspecting the network after getting started with Lima, I noticed a new bridge100 interface with 2 member interfaces, vmenet0 and vmenet2. The bridge has 192.168.105.1 set up, but if Lima was supposed to configure some DHCP service on the host, or instruct the virtualization framework to do so, it didn't work.