abiosoft / colima

Container runtimes on macOS (and Linux) with minimal setup
MIT License

Unable to connect to hosts through VPN interface from colima VM #392

Open chriscasola opened 2 years ago

chriscasola commented 2 years ago

Description

Right after creating a colima VM, I can run colima ssh and then curl -v internal.corporate-domain.com, and I successfully connect and get back a response.

But at a later time, sometimes weeks/months later, all networking breaks between the VM (and any docker containers running on the VM) and the internal corporate network. Restarting the computer and/or the colima VM does not resolve the issue. The only resolution is to tear down the colima VM and create a new one.

The VPN on the Mac creates an interface like this:

inet 172.19.21.19 --> 172.19.21.19 netmask 0xffffffff

There is also a physical network interface on the Mac:

inet 192.168.86.42 netmask 0xffffff00 broadcast 192.168.86.255

Here is some debugging output from the colima VM:

successful nslookup

colima:/Users/ccasola$ nslookup xxx-staging.xxx.com
Server:     192.168.107.1
Address:    192.168.107.1:53

Non-authoritative answer:
Can't find xxx-staging.xxx.com: No answer

Non-authoritative answer:
Name:   xxx-staging.xxx.com
Address: 172.21.132.52

curl fails to connect

colima:/Users/ccasola$ curl -v http://xxx-staging.xxx.com/
*   Trying 172.21.132.52:80...
* connect to 172.21.132.52 port 80 failed: Host is unreachable
* Failed to connect to xxx-staging.xxx.com port 80 after 3097 ms: Host is unreachable
* Closing connection 0
curl: (7) Failed to connect to xxx-staging.xxx.com port 80 after 3097 ms: Host is unreachable

ifconfig on colima vm

```bash
colima:/Users/ccasola$ ifconfig
br-861424ecd59b Link encap:Ethernet  HWaddr 02:42:52:40:74:7C
          inet addr:172.20.0.1  Bcast:172.20.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

br-aa39713c361f Link encap:Ethernet  HWaddr 02:42:68:34:DE:D0
          inet addr:172.21.0.1  Bcast:172.21.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

br-cf68d4a5c6cf Link encap:Ethernet  HWaddr 02:42:1A:D5:88:4E
          inet addr:172.18.0.1  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

docker0   Link encap:Ethernet  HWaddr 02:42:93:3F:AD:47
          inet addr:172.17.0.1  Bcast:172.17.255.255  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth0      Link encap:Ethernet  HWaddr 52:55:55:83:2B:8F
          inet addr:192.168.5.15  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::5055:55ff:fe83:2b8f/64 Scope:Link
          inet6 addr: fec0::5055:55ff:fe83:2b8f/64 Scope:Site
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:405 errors:0 dropped:0 overruns:0 frame:0
          TX packets:354 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:41947 (40.9 KiB)  TX bytes:43449 (42.4 KiB)

eth1      Link encap:Ethernet  HWaddr 5A:94:EF:B8:ED:B2
          inet addr:192.168.107.2  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::5894:efff:feb8:edb2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:22 errors:0 dropped:0 overruns:0 frame:0
          TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1668 (1.6 KiB)  TX bytes:2434 (2.3 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:9 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:792 (792.0 B)  TX bytes:792 (792.0 B)
```
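
One detail in the ifconfig output may be worth checking: the bridge br-aa39713c361f holds 172.21.0.1 with mask 255.255.0.0, and the address curl tries to reach (172.21.132.52) falls inside that range. Whether this overlap is the actual cause of the failure is an assumption, not something the thread confirms at this point. The Python sketch below only verifies the overlap from the values posted above; the function name `overlapping_bridges` is made up for illustration and is not part of colima or Docker.

```python
import ipaddress

# Bridge addresses and netmasks copied from the ifconfig output in this issue.
BRIDGES = {
    "br-861424ecd59b": ("172.20.0.1", "255.255.0.0"),
    "br-aa39713c361f": ("172.21.0.1", "255.255.0.0"),
    "br-cf68d4a5c6cf": ("172.18.0.1", "255.255.0.0"),
    "docker0": ("172.17.0.1", "255.255.0.0"),
}


def overlapping_bridges(target, bridges):
    """Return the names of bridges whose subnet contains the target address."""
    addr = ipaddress.ip_address(target)
    hits = []
    for name, (ip, mask) in bridges.items():
        # strict=False accepts a host address plus netmask and yields the network.
        subnet = ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        if addr in subnet:
            hits.append(name)
    return hits


if __name__ == "__main__":
    # 172.21.132.52 is the address nslookup returned for the internal host.
    print(overlapping_bridges("172.21.132.52", BRIDGES))
```

Run against the posted values, this prints `['br-aa39713c361f']`. If a bridge matches, traffic for that address would be expected to go to the local bridge instead of leaving through eth0 toward the VPN.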

Version

Colima Version:

colima version 0.4.4
git commit: 8bb1101a861a8b6d2ef6e16aca97a835f65c4f8f

runtime: docker
arch: aarch64
client: v20.10.17
server: v20.10.11

Lima Version:

limactl version 0.11.3

Qemu Version:

qemu-img version 7.0.0
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

Operating System

Reproduction Steps

  1. colima start
  2. colima ssh
  3. curl -v corporate-app.corporation.com

Expected behaviour

curl should be able to connect to the host.

Additional context

This is not isolated to a single person; multiple people at our company have been experiencing this. Thanks in advance for any help you can provide!

abiosoft commented 2 years ago

@chriscasola any idea in what version you started noticing this?

chriscasola commented 2 years ago

We're not sure because it has occurred very sporadically. But it's been happening for at least a couple months.

abiosoft commented 2 years ago

> The only resolution is to tear down the colima VM and create a new one.

@chriscasola you mean that it works fine on a newly created VM but starts to malfunction after a while, until you tear it down and recreate it?

chriscasola commented 2 years ago

> you mean that it works fine on a newly created VM but starts to malfunction after a while, until you teardown and recreate it again?

@abiosoft correct.

chriscasola commented 2 years ago

@abiosoft this happens daily for us. If there are any steps we can take to debug while the VM is in a bad state, let me know.

abiosoft commented 2 years ago

@chriscasola did you install colima via brew?

chriscasola commented 2 years ago

Yes, I did.

abiosoft commented 2 years ago

@chriscasola This PR enables customising the network driver https://github.com/abiosoft/colima/pull/399. ~Once the PR is merged,~ can you try starting afresh with slirp network?

brew install --HEAD colima # install development version
colima delete # delete existing instance for clean behaviour
colima start --network-driver slirp

chriscasola commented 2 years ago

@abiosoft I tried the slirp driver today but I ran into the same issue after about 6 hours of working.

abiosoft commented 2 years ago

> @abiosoft I tried the slirp driver today but I ran into the same issue after about 6 hours of working.

@chriscasola any idea if this happens after resuming your Mac from sleep? And a simple restart does not fix it?

chriscasola commented 2 years ago

Restarting the VM usually works but sometimes it does not and I have to delete the VM.

No, my Mac did not sleep at all today between creating the new VM and when it hit the issue.

abiosoft commented 2 years ago

@chriscasola can you kindly share the output of the following in both scenarios, i.e. when it is working fine and when it stops working?

colima ssh -- ip route

chriscasola commented 2 years ago

This may be a separate but related issue, but I seem to be hitting connection errors when my Mac switches from ethernet to WiFi. The other day when I initially tried the slirp driver I still eventually had connection issues even though I was on ethernet only the entire time. Here is the output from the command you requested today:

Initial Start (on Ethernet)

$ colima -p slirp ssh -- ip route
default via 192.168.5.2 dev eth0  metric 202 
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1 
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1 
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15

After getting connection errors (on WiFi)

$ colima -p slirp ssh -- ip route
default via 192.168.5.2 dev eth0  metric 202 
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1 
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1 
172.19.0.0/16 dev br-77ac2751083e scope link  src 172.19.0.1 
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15

After restarting the colima VM (still on WiFi, still getting connection errors)

$ colima -p slirp ssh -- ip route
default via 192.168.5.2 dev eth0  metric 202 
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1 
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1 
172.19.0.0/16 dev br-77ac2751083e scope link  src 172.19.0.1 
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15 
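
The three route tables above differ by a single entry. The Python sketch below makes that difference explicit and checks whether the VPN interface address from the original report (172.19.21.19) falls inside the new route. It is only an illustration built from the output posted in this thread; the helper names `parse_routes` and `added_routes` are hypothetical, and the thread does not state whether internal hosts also live in that range.

```python
import ipaddress

# `ip route` output copied from the comment above.
ROUTES_ON_ETHERNET = """\
default via 192.168.5.2 dev eth0  metric 202
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15
"""

ROUTES_AFTER_ERRORS = """\
default via 192.168.5.2 dev eth0  metric 202
172.17.0.0/16 dev docker0 scope link  src 172.17.0.1
172.18.0.0/16 dev br-1331da1855ba scope link  src 172.18.0.1
172.19.0.0/16 dev br-77ac2751083e scope link  src 172.19.0.1
192.168.5.0/24 dev eth0 scope link  src 192.168.5.15
"""


def parse_routes(text):
    """Map each destination in `ip route` output to its outgoing device."""
    routes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields or "dev" not in fields:
            continue
        routes[fields[0]] = fields[fields.index("dev") + 1]
    return routes


def added_routes(before, after):
    """Return routes present in `after` but missing from `before`."""
    old = parse_routes(before)
    return {d: dev for d, dev in parse_routes(after).items() if d not in old}


if __name__ == "__main__":
    new = added_routes(ROUTES_ON_ETHERNET, ROUTES_AFTER_ERRORS)
    print(new)
    vpn_addr = ipaddress.ip_address("172.19.21.19")  # VPN interface on the Mac
    for dest, dev in new.items():
        if dest != "default" and vpn_addr in ipaddress.ip_network(dest):
            print(f"{dest} on {dev} covers the VPN address {vpn_addr}")
```

With the posted tables, the only added route is 172.19.0.0/16 on br-77ac2751083e, a Docker bridge network, and it does cover the VPN interface address.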

chriscasola commented 2 years ago

Still digging, but it seems like deleting and recreating the docker network I'm using resolves the issue.
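
When recreating a Docker network, one option is to pick its subnet explicitly so that it cannot collide with ranges already in use. The Python sketch below shows one way to choose such a subnet. The 192.168.x.x ranges come from the output posted in this thread, while treating all of 172.16.0.0/12 as VPN territory is an assumption made for illustration; `pick_free_subnet` is a hypothetical helper, not a Docker or colima API.

```python
import ipaddress

# Ranges to avoid. The 192.168.x.x entries appear in this thread;
# 172.16.0.0/12 is an ASSUMED VPN range (the VPN addresses seen here fall in it).
IN_USE = [
    "172.16.0.0/12",     # assumed corporate VPN range
    "192.168.5.0/24",    # VM eth0
    "192.168.107.0/24",  # VM eth1
    "192.168.86.0/24",   # Mac physical interface
]


def pick_free_subnet(pool, new_prefix, in_use):
    """Return the first subnet of the given prefix length inside `pool`
    that overlaps none of the `in_use` networks, or None if none is free."""
    taken = [ipaddress.ip_network(n) for n in in_use]
    for candidate in ipaddress.ip_network(pool).subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(t) for t in taken):
            return candidate
    return None


if __name__ == "__main__":
    print(pick_free_subnet("192.168.0.0/16", 24, IN_USE))
```

With these inputs the sketch prints `192.168.0.0/24`. The result could then be passed to `docker network create --subnet` when recreating the network, after checking it against whatever ranges the local network and VPN actually use.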

I also ran into DNS resolution issues, where the DNS server I specified with the start command was being replaced in /etc/resolv.conf with some other IP. That seems to be fixed by doing the following:

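# On the Mac: start colima with an explicit DNS server, then open a shell in the VM
colima start -d <my-dns-ip>
colima ssh
# Inside the VM: become root
sudo -i
# Tell udhcpc not to rewrite /etc/resolv.conf, then pin the nameserver
mkdir -p /etc/udhcpc/
echo 'RESOLV_CONF="no"' > /etc/udhcpc/udhcpc.conf
echo "nameserver <my-dns-ip>" > /etc/resolv.conf

abiosoft commented 2 years ago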

@chriscasola thanks for the information. I will dig a bit more on my end.

chriscasola commented 2 years ago

Just confirming that the two workarounds in my earlier comment (recreating the Docker network and pinning the DNS server) do resolve this issue for me.

lracicot commented 1 year ago

I have the same issue on Apple M1, and the problem is not DNS. Reaching the IP directly works from my host but not from the VM.
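
To tell a DNS failure apart from a routing failure, as the comment above does by hand, name resolution and the TCP connection can be probed separately. The Python sketch below is one way to do that, assuming Python is available wherever it is run. The function name `diagnose` and its return labels are made up for illustration, and the `resolve` and `connect` parameters exist only so the logic can be exercised without a network.

```python
import socket


def diagnose(host, port, timeout=3.0,
             resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Classify a connection attempt as 'dns_failure', 'unreachable', or 'ok'."""
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name could not be resolved at all.
        return "dns_failure"
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        try:
            # sockaddr[:2] is (address, port) for both IPv4 and IPv6 results.
            conn = connect(sockaddr[:2], timeout=timeout)
        except OSError:
            # Covers "Host is unreachable", refused connections, and timeouts.
            continue
        conn.close()
        return "ok"
    # The name resolved, but no resolved address accepted a connection.
    return "unreachable"


# Example (run once on the Mac and once inside the VM, then compare):
# print(diagnose("xxx-staging.xxx.com", 80))
```

A result of `unreachable` inside the VM alongside `ok` on the host would match the symptom described in this issue: the name resolves, but the resolved address cannot be reached from the VM.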