lima-vm / lima

Linux virtual machines, with a focus on running containers
https://lima-vm.io/
Apache License 2.0

experiencing bizarre network-related issues #537

Closed leifliddy closed 2 years ago

leifliddy commented 2 years ago

I'm experiencing some pretty bizarre network-related issues with Lima.

  1. Duplicate ICMP packet issue:
[lima@lima-default ~]$ ping 1.1.1.1 -s 16
PING 1.1.1.1 (1.1.1.1) 16(44) bytes of data.
ping: Warning: time of day goes back (-3352826442367348081us), taking countermeasures
ping: Warning: time of day goes back (-3352826442367347601us), taking countermeasures
24 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=0.000 ms
ping: Warning: time of day goes back (-161817004574526878us), taking countermeasures
24 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=0.000 ms (DUP!)
24 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=887964419286835 ms (DUP!)
24 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=1865846997829180 ms (DUP!)
24 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=2910382233647161 ms (DUP!)
24 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=5167524957939631 ms (DUP!)
  2. ICMP packet size issue: I see this when the ICMP packets are over 16 bytes in size.

    [lima@lima-default ~]$ ping 1.1.1.1
    PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
    64 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=6493448579855972 ms
    wrong data byte #16 should be 0x10 but was 0x0
    #16 0 17 0 1 84 17 db 61 0 0 0 0 8a 87 c 0 0 0 0 0 10 11 12 13 14 15 16 17 18 19 1a 1b 
    #48 1c 1d 1e 1f 20 21 22 23 
    ping: Warning: time of day goes back (-8723561754788627594us), taking countermeasures
    ping: Warning: time of day goes back (-8723561754788627299us), taking countermeasures
    64 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=0.000 ms (DUP!)
    wrong data byte #16 should be 0x10 but was 0x0
    #16 0 17 0 2 85 17 db 61 0 0 0 0 57 b3 c 0 0 0 0 0 10 11 12 13 14 15 16 17 18 19 1a 1b 
    #48 1c 1d 1e 1f 20 21 22 23 
    ping: Warning: time of day goes back (-6579073383962294186us), taking countermeasures
    64 bytes from 1.1.1.1: icmp_seq=0 ttl=255 time=0.000 ms (DUP!)
    wrong data byte #16 should be 0x10 but was 0x0
    #16 0 17 0 3 86 17 db 61 0 0 0 0 1d d0 c 0 0 0 0 0 10 11 12 13 14 15 16 17 18 19 1a 1b 
    #48 1c 1d 1e 1f 20 21 22 23
  3. Various other issues: It's pretty common to see TLS timeout errors like this

    Error: error creating build container: parsing image configuration: Get "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/ee/eeb6ee3f44bd0b5103bb561b4c16bcb82328cfe5809ab675bb17ab3a16c517c9/data?verify=1641747317-BL53mt2Tv8RkI2dPZh8AAv7YtEM%3D": net/http: TLS handshake timeout

    --although it normally works on the 2nd or 3rd attempt

There are other issues I'm encountering that are likely related to this. For example, when downloading/installing packages from within a container --it can take a while, as the system often hangs anywhere from 30 seconds to a couple of minutes before starting to download the packages.
Also, with a centos7 container, the fastest mirror plugin doesn't work at all --it somehow identifies servers half a world away as being the "fastest mirrors".
I have no idea what could be causing these issues. I've tried changing the MTU sizes on the Lima VM and on the macOS interfaces --to no avail. I've tried various sysctl settings, flushing the iptables nat rules...etc. I'm not running a VPN or anything.
Also, this issue occurs regardless of which Linux distro is used as the base Lima VM. I don't experience this issue with VirtualBox or with Docker Desktop.

I'm just wondering whether anyone else is experiencing the same thing...

afbjorklund commented 2 years ago

See https://wiki.qemu.org/Documentation/Networking:

Note - if you are using the (default) SLiRP user networking, then ping (ICMP) will not work, though TCP and UDP will. Don't try to use ping to test your QEMU network configuration!

If you need advanced networking, you probably need something more than the built-in "user" network?

https://github.com/lima-vm/lima/blob/master/docs/network.md

The basic stuff seemed to work on Linux though, so it could be something specific to QEMU on the Mac.

Note: all other network modes require administrative privileges

leifliddy commented 2 years ago

Thanks for the explanation.

afbjorklund commented 2 years ago

Thanks for the explanation.

I'm not sure it's a good one, and it's not good if it is that unstable. But it has not been my experience when running on Linux.

leifliddy commented 2 years ago

Yeah, something isn't right. If I run this on a Fedora Lima VM (although the distro doesn't really matter), I get this:

[lima@lima-default ~]$ curl 'http://mirrorlist.centos.org/?repo=os&arch=x86_64&release=7'
http://mirrors.greenmountainaccess.net/centos/7.9.2009/os/x86_64/
http://mirrors.seas.harvard.edu/centos/7.9.2009/os/x86_64/
http://mirror.arizona.edu/centos/7.9.2009/os/x86_64/
http://mirror.es.its.nyu.edu/centos/7.9.2009/os/x86_64/
....

This is how CentOS determines what the fastest mirrors are. And this list is totally bogus.

If I run this from a non-virtualized Fedora system, I get a much more accurate list:

[leif.liddy@black ~]$ curl 'http://mirrorlist.centos.org/?repo=os&arch=x86_64&release=7'
http://de.mirrors.clouvider.net/CentOS/7.9.2009/os/x86_64/
http://mirror.netzwerge.de/centos/7.9.2009/os/x86_64/
http://linux.darkpenguin.net/distros/CentOS/7.9.2009/os/x86_64/
http://mirror1.hs-esslingen.de/pub/Mirrors/centos/7.9.2009/os/x86_64/
...

I don't get why these lists vary to such a large degree --I mean, we're not even on the same continent.

So, I ran this Python script to gauge the latency of reaching these US servers, and the results varied widely between Lima/Qemu and non-virtualized Linux.

#!/usr/bin/python3
# Time how long a plain TCP connect() to port 80 of the given host takes.

import sys
import time
import socket

def main():
    if len(sys.argv) == 1:
        print("Usage: %s <webserver>" % sys.argv[0])
        sys.exit(-1)

    # Open a TCP connection to port 80 and report the elapsed time in milliseconds.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    time_before = time.time()
    sock.connect((sys.argv[1], 80))
    result = (time.time() - time_before) * 1000
    sock.close()
    print(sys.argv[1] + ": %f ms" % result)

if __name__ == '__main__':
    main()

Lima/Qemu

[lima@lima-default ~]$ ./script.py mirrors.seas.harvard.edu
mirrors.seas.harvard.edu: 22.561073 ms

[lima@lima-default ~]$ ./script.py mirror.us.oneandone.net
mirror.us.oneandone.net: 22.254229 ms

[lima@lima-default ~]$ ./script.py mirror.arizona.edu
mirror.arizona.edu: 22.699356 ms

[lima@lima-default ~]$ ./script.py repos.lax.layerhost.com
repos.lax.layerhost.com: 23.733139 ms

non-virtualized Linux

[leif.liddy@black ~]$ ./script.py mirrors.seas.harvard.edu
mirrors.seas.harvard.edu: 99.695206 ms

[leif.liddy@black ~]$ ./script.py mirror.us.oneandone.net
mirror.us.oneandone.net: 123.170376 ms

[leif.liddy@black ~]$ ./script.py mirror.arizona.edu
mirror.arizona.edu: 172.051907 ms

[leif.liddy@black ~]$ ./script.py repos.lax.layerhost.com
repos.lax.layerhost.com: 158.929825 ms

Something is seriously off here. To be honest, this is turning me off from using Qemu at all (well...on a mac anyway)
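
For reference, the same check can be run across everything the mirrorlist returns. This is just a sketch that combines the curl call and the script above --nothing beyond the mirrorlist URL and port 80 from the earlier examples is assumed:

#!/usr/bin/python3
# Sketch: fetch the CentOS mirrorlist (same URL as the curl above) and time a
# plain TCP connect to port 80 of each mirror, like the script above does for
# a single host.

import socket
import time
import urllib.request
from urllib.parse import urlparse

MIRRORLIST = 'http://mirrorlist.centos.org/?repo=os&arch=x86_64&release=7'

with urllib.request.urlopen(MIRRORLIST, timeout=10) as response:
    mirrors = response.read().decode().splitlines()

for mirror in mirrors:
    host = urlparse(mirror).hostname
    if not host:
        continue
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(5)
    time_before = time.time()
    try:
        sock.connect((host, 80))
        print("%s: %f ms" % (host, (time.time() - time_before) * 1000))
    except OSError as err:
        print("%s: %s" % (host, err))
    finally:
        sock.close()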

afbjorklund commented 2 years ago

Did it help to run VDE instead?

leifliddy commented 2 years ago

I wouldn't doubt that it would work with VDE, but I'm looking for the easiest way to automate the creation of a podman VM.
There are just too many steps involved when you have to manually compile something. I mean, it would be fine for myself, but when you need to automate something for other people to run, it's best to keep things as simple as possible --simply because there's less that can go wrong. I understand that the Qemu SLiRP network backend has many limitations, a lot of overhead, poor performance...etc. If it was just a little bit better though, it would be perfect. I just need it to accurately gauge the latency of a connection...that's it.
I'll probably automate the creation of a virtual machine running podman with python + vagrant + virtualbox for now.
But I'll keep an eye on Qemu and see how it progresses...
Thanks for your help!

afbjorklund commented 2 years ago

Understand. Can't say that Podman's gvisor-tap-vsock is without issues either, but at least those are new bugs :-)

I was using Parallels on the Mac before I changed over to Linux, but it was neither open source nor without cost.


The old podman/docker vagrant scripts are still up on https://boot2podman.github.io/2020/07/22/machine-replacement.html

Would probably be rather simple to create a similar setup for nerdctl, but it wouldn't have the lima level of nice integration.

leifliddy commented 2 years ago

I was using Parallels on the Mac before I changed over to Linux, but it was neither open source nor without cost.

TBH, I can't stand using a mac. Wish my company would allow me to develop on a Linux machine.

The old podman/docker vagrant scripts are still up on

Thanks for that! I don't really care about the lima level of integration though. Actually, yeah, I would need to mount the current directory within the VM so podman would have access to it (would need to test that out). But I would mainly be interacting with podman via Python + the podman API, like this script for example: https://github.com/leifliddy/podman-clamav-el6/blob/master/script-podman.py

**would need to be modified a bit to run on macOS, but you get the idea...
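
For example, something along these lines --assuming the podman-py client is installed and the API socket is reachable; the socket URI below is just a placeholder:

#!/usr/bin/python3
# Rough sketch of talking to podman over its API from Python, assuming the
# podman-py client is installed; the socket URI is a placeholder and would
# need to point at the socket forwarded out of the VM on macOS.
from podman import PodmanClient

uri = "unix:///run/user/1000/podman/podman.sock"

with PodmanClient(base_url=uri) as client:
    # List images and running containers known to the podman service.
    for image in client.images.list():
        print(image.id, image.tags)
    for container in client.containers.list():
        print(container.name, container.status)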

leifliddy commented 2 years ago

Update: I'm also experiencing this issue running Linux VMs in VirtualBox on a mac. This issue only affects the NAT adapter (bridged adapters work fine). This command sorts out the issue for some reason: VBoxManage modifyvm minikube --natbindip1 192.168.3.35

The 192.x address is just the main IP of my host system, i.e.

[leif.liddy@pro.example.com ~]$ netstat -rn | grep ^default | grep 'UGSc '
default            192.168.3.1        UGSc           en5  

[leif.liddy@pro.example.com ~]$ ipconfig getifaddr en5
192.168.3.35

The latency can be measured perfectly now. The fastest mirror plugin works...etc. But I need to do some more troubleshooting to sort out why that worked.

According to the VirtualBox documentation:

By default, Oracle VM VirtualBox's NAT engine will route TCP/IP packets through the default interface assigned by the host's TCP/IP stack. The technical reason for this is that the NAT engine uses sockets for communication. If you want to change this behavior, you can tell the NAT engine to bind to a particular IP address instead. For example, use the following command:

$ VBoxManage modifyvm VM-name \
--natbindip1 "10.45.0.2"

Hmm...I wonder what "default interface" it's using. I'm willing to bet that the root cause of this issue is the same for Qemu and VirtualBox.

Update: I found the culprit. So all traffic from the NAT adapters is being routed out through a utun interface, which is created + used by a "security" web proxy app/service called Zscaler. I'm based in Europe, and this proxy service is routing all web traffic to US-based servers...which seems to be causing a whole host of issues.
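
A quick way to check which local address (and therefore which interface) outbound traffic actually leaves from, using nothing but the standard library:

#!/usr/bin/python3
# Check which local address the routing table picks for Internet-bound traffic;
# an address belonging to the utun interface means the proxy is in the path.
# connect() on a UDP socket sends no packets, it only resolves the route.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.connect(("1.1.1.1", 53))
print("outbound traffic uses local address:", sock.getsockname()[0])
sock.close()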

Ok, I think I got this now. Sorry for wasting everyone's time with this.