LabNConsulting / munet

Create network topologies, running programs and containers, using namespaces, podman containers and qemu virtual machines.
GNU General Public License v2.0

munet does not support qemu > 4.2.1 #38

Open adudek16 opened 10 hours ago

adudek16 commented 10 hours ago

Munet seems to have issues connecting to VMs under qemu 6.2.0 (default version with 22.04) and later.

Testing Setup

Ubuntu 22.04.4
Linux 8kubu-22 5.15.0-118-generic #128-Ubuntu SMP Fri Jul 5 09:28:59 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

$ qemu-system-x86_64 --version
QEMU emulator version 6.2.0 (Debian 1:6.2+dfsg-2ubuntu6.22)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

$ sudo apparmor_status
apparmor module is loaded.
1 profiles are loaded.
1 profiles are in enforce mode.
   docker-default
0 profiles are in complain mode.
0 profiles are in kill mode.
0 profiles are in unconfined mode.
0 processes have profiles defined.
0 processes are in enforce mode.
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
0 processes are in mixed mode.
0 processes are in kill mode.

$ sudo -E munet
2024-10-02 16:26:36,144: INFO: Loaded logging config /usr/local/lib/python3.10/dist-packages/munet/logconf.yaml
2024-10-02 16:26:36,151: INFO: Loaded config from /home/adudek/1node_8k/munet.yaml
2024-10-02 16:26:36,168: INFO: Loading kinds config from /usr/local/lib/python3.10/dist-packages/munet/kinds.yaml
2024-10-02 16:26:36,480: INFO: Munet(munet): created
2024-10-02 16:26:36,957: INFO: L3QemuVM(r1): created
2024-10-02 16:26:37,155: INFO: Topology up: rundir: /tmp/munet
2024-10-02 16:26:37,156: INFO: L3QemuVM(r1): Launch Qemu
2024-10-02 16:26:37,325: INFO: Create disk '/tmp/munet/r1/../r1-disk.qcow2' from template '/home/adudek/1node_8k/c8000v-universalk9_8G_serial.17.06.06a.qcow2'
2024-10-02 16:26:37,737: INFO: Exiting, unexpected exception [Errno 104] Connection reset by peer
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/munet/main.py", line 74, in async_main
    status = await run_and_wait(args, unet)
  File "/usr/local/lib/python3.10/dist-packages/munet/main.py", line 39, in run_and_wait
    tasks += await unet.run()
  File "/usr/local/lib/python3.10/dist-packages/munet/native.py", line 3104, in run
    await asyncio.gather(*[x.launch() for x in launch_nodes])
  File "/usr/local/lib/python3.10/dist-packages/munet/native.py", line 2495, in launch
    cons = await self._opencons(
  File "/usr/local/lib/python3.10/dist-packages/munet/native.py", line 2218, in _opencons
    await self.console(
  File "/usr/local/lib/python3.10/dist-packages/munet/native.py", line 867, in console
    repl = await self.shell_spawn(
  File "/usr/local/lib/python3.10/dist-packages/munet/base.py", line 760, in shell_spawn
    p = self.spawn(
  File "/usr/local/lib/python3.10/dist-packages/munet/base.py", line 663, in spawn
    index = p.expect([spawned_re, pexpect.TIMEOUT, pexpect.EOF], timeout=0.1)
  File "/usr/lib/python3/dist-packages/pexpect/spawnbase.py", line 343, in expect
    return self.expect_list(compiled_pattern_list,
  File "/usr/lib/python3/dist-packages/pexpect/spawnbase.py", line 372, in expect_list
    return exp.expect_loop(timeout)
  File "/usr/lib/python3/dist-packages/pexpect/expect.py", line 169, in expect_loop
    incoming = spawn.read_nonblocking(spawn.maxread, timeout)
  File "/usr/lib/python3/dist-packages/pexpect/fdpexpect.py", line 148, in read_nonblocking
    return super(fdspawn, self).read_nonblocking(size)
  File "/usr/lib/python3/dist-packages/pexpect/spawnbase.py", line 169, in read_nonblocking
    s = os.read(self.child_fd, size)
ConnectionResetError: [Errno 104] Connection reset by peer
2024-10-02 16:26:37,880: WARNING: L3QemuVM(r1): [cleanup_proc] kill unexpected exception: [Errno 3] No such process
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/munet/base.py", line 892, in async_cleanup_proc
    await self.cleanup_pid(p.pid, pid)
  File "/usr/local/lib/python3.10/dist-packages/munet/base.py", line 432, in cleanup_pid
    os.kill(kill_pid, sn)
ProcessLookupError: [Errno 3] No such process
2024-10-02 16:26:37,983: INFO: L3QemuVM(r1): deleted
2024-10-02 16:26:38,099: INFO: L3Bridge(net0): deleted
2024-10-02 16:26:38,127: INFO: L3Bridge(mgmt0): deleted
2024-10-02 16:26:38,229: INFO: Munet(munet): deleted

On a subsequent run:

$ sudo -E munet
2024-10-02 16:27:02,669: INFO: Loaded logging config /usr/local/lib/python3.10/dist-packages/munet/logconf.yaml
2024-10-02 16:27:02,676: INFO: Loaded config from /home/adudek/1node_8k/munet.yaml
2024-10-02 16:27:02,690: INFO: Loading kinds config from /usr/local/lib/python3.10/dist-packages/munet/kinds.yaml
2024-10-02 16:27:02,876: INFO: Munet(munet): created
2024-10-02 16:27:03,344: INFO: L3QemuVM(r1): created
2024-10-02 16:27:03,528: INFO: Topology up: rundir: /tmp/munet
2024-10-02 16:27:03,529: INFO: L3QemuVM(r1): Launch Qemu
2024-10-02 16:27:03,959: WARNING: can't open console socket: Connection refused
2024-10-02 16:27:03,959: INFO: Exiting, unexpected exception [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/munet/main.py", line 74, in async_main
    status = await run_and_wait(args, unet)
  File "/usr/local/lib/python3.10/dist-packages/munet/main.py", line 39, in run_and_wait
    tasks += await unet.run()
  File "/usr/local/lib/python3.10/dist-packages/munet/native.py", line 3104, in run
    await asyncio.gather(*[x.launch() for x in launch_nodes])
  File "/usr/local/lib/python3.10/dist-packages/munet/native.py", line 2495, in launch
    cons = await self._opencons(
  File "/usr/local/lib/python3.10/dist-packages/munet/native.py", line 2194, in _opencons
    sock.connect(sockpath)
ConnectionRefusedError: [Errno 111] Connection refused
2024-10-02 16:27:04,183: INFO: L3QemuVM(r1): deleted
2024-10-02 16:27:04,311: INFO: L3Bridge(mgmt0): deleted
2024-10-02 16:27:04,331: INFO: L3Bridge(net0): deleted
2024-10-02 16:27:04,433: INFO: Munet(munet): deleted

adudek16 commented 10 hours ago

$ more r1-mutini.log
2024-10-02 16:27:03,179 mutini: DEBUG: unshareing with flags: CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWNET
2024-10-02 16:27:03,184 mutini: DEBUG: remount / recursive private
2024-10-02 16:27:03,184 mutini: INFO: holding namespace waiting to be signaled to exit

/tmp/munet/r1$ more qemu.out
/tmp/munet/r1$ more qemu.err
qemu-system-x86_64: error: failed to set MSR 0x345 to 0x2000
qemu-system-x86_64: ../../target/i386/kvm/kvm.c:2893: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

adudek16 commented 10 hours ago

munet-exec.log munet-mutini.log r1-mutini.log

liambrady commented 7 hours ago

From what I can gather, this may be an issue with qemu rather than munet.

/tmp/munet/r1$ more qemu.out
/tmp/munet/r1$ more qemu.err
qemu-system-x86_64: error: failed to set MSR 0x345 to 0x2000
qemu-system-x86_64: ../../target/i386/kvm/kvm.c:2893: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

This output specifically suggests an issue with a model-specific register (MSR), likely Intel-related given the target/i386/kvm path in the assertion, that newer versions of qemu are accessing incorrectly. That would make sense, since I recall you have some sort of Mac (possibly with an M-series chip, which are notorious for causing problems). I am unsure whether there is a solution or workaround at the moment.
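If it does turn out to be this MSR path, a couple of generic knobs get suggested for this class of error. Both are untested guesses on my part, not munet-specific, and I have not confirmed either against your setup; if I am reading it right, MSR 0x345 is IA32_PERF_CAPABILITIES, which is tied to the virtual PMU.

# Guess 1: tell KVM (on the machine where qemu-system-x86_64 is being run) to
# ignore writes to MSRs it does not handle instead of failing them:
echo 1 | sudo tee /sys/module/kvm/parameters/ignore_msrs

# Guess 2: disable the virtual PMU for the guest CPU. Shown on a raw qemu
# command line only to illustrate the knob; I am not sure offhand whether
# munet's node config lets you pass extra CPU flags like this.
qemu-system-x86_64 -enable-kvm -cpu host,pmu=off   # plus the usual disk/serial args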

For reference, I am able to successfully start a VM on Ubuntu 22.04 using munet and qemu 6.2.0 (even with apparmor enabled). My CPU is an Intel i9-13900K and my kernel is 5.15.0-122-generic.

adudek16 commented 6 hours ago

This is not being run on a Mac, but on an Intel server.

8kubu-22:~$ lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          40 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   8
  On-line CPU(s) list:    0-7
Vendor ID:                GenuineIntel
  Model name:             Intel Core Processor (Broadwell, IBRS)
    CPU family:           6
    Model:                61
    Thread(s) per core:   1
    Core(s) per socket:   1
    Socket(s):            8
    Stepping:             2
    BogoMIPS:             4389.40
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss sy
                          scall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx
                          ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
                           hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vn
                          mi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx
                           smap xsaveopt arat umip md_clear arch_capabilities
Virtualization features:
  Virtualization:         VT-x
  Hypervisor vendor:      KVM
  Virtualization type:    full
Caches (sum of all):
  L1d:                    256 KiB (8 instances)
  L1i:                    256 KiB (8 instances)
  L2:                     32 MiB (8 instances)
  L3:                     128 MiB (8 instances)
NUMA:
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-7

8kubu-22:~$

This is a clean install, so I am curious what differences there are between your setup and mine.
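In case it helps narrow things down, here are a few generic commands (nothing munet-specific) whose output could be compared between the two machines; I am assuming kvm and kvm_intel are loaded on both:

# Is the machine itself a VM, and under which hypervisor?
systemd-detect-virt

# qemu and KVM details on the machine that runs munet:
qemu-system-x86_64 --version
cat /sys/module/kvm_intel/parameters/nested
cat /sys/module/kvm/parameters/ignore_msrs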

liambrady commented 4 hours ago

Very interesting. Here are the results of my lscpu:

lscpu
Architecture:             x86_64
  CPU op-mode(s):         32-bit, 64-bit
  Address sizes:          46 bits physical, 48 bits virtual
  Byte Order:             Little Endian
CPU(s):                   8
  On-line CPU(s) list:    0-7
Vendor ID:                GenuineIntel
  Model name:             13th Gen Intel(R) Core(TM) i9-13900K
    CPU family:           6
    Model:                183
    Thread(s) per core:   2
    Core(s) per socket:   4
    Socket(s):            1
    Stepping:             1
    BogoMIPS:             5990.40
    Flags:                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reli
                          able nonstop_tsc cpuid aperfmperf pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dno
                          wprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb s
                          ha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni umip waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Virtualization features:  
  Virtualization:         VT-x
  Hypervisor vendor:      Microsoft
  Virtualization type:    full
Caches (sum of all):      
  L1d:                    192 KiB (4 instances)
  L1i:                    128 KiB (4 instances)
  L2:                     8 MiB (4 instances)
  L3:                     36 MiB (1 instance)
NUMA:                     
  NUMA node(s):           1
  NUMA node0 CPU(s):      0-7
Vulnerabilities:          
  Gather data sampling:   Not affected
  Itlb multihit:          Not affected
  L1tf:                   Not affected
  Mds:                    Not affected
  Meltdown:               Not affected
  Mmio stale data:        Not affected
  Reg file data sampling: Mitigation; Clear Register File
  Retbleed:               Mitigation; Enhanced IBRS
  Spec rstack overflow:   Not affected
  Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:             Mitigation; Enhanced / Automatic IBRS; IBPB conditional; RSB filling; PBRSB-eIBRS SW sequence; BHI BHI_DIS_S
  Srbds:                  Not affected
  Tsx async abort:        Not affected

It is also a clean install...