Open cfergeau opened 1 year ago
Regarding #20639, I asked Florent to collect traces with `delve`:

- `delve` does not support universal macos binaries, so extract the arm64 slice first: `lipo -extract arm64 -output gvproxy-darwin-arm64 ./gvproxy-darwin`
- use the resulting `gvproxy-darwin-arm64` binary
- `brew install delve`
- `$ dlv attach $(pgrep gvproxy)`
- `(dlv) trace /github.com\/containers\/gvisor-tap-vsock\/*/`

When the tracing is done, it's possible to detach `dlv` from the process by pressing `ctrl+c` and answering 'no' when delve asks if the process should be killed.
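
The command-line part of the steps above could be wrapped in a small helper. This is only a sketch. The function names `run`, `thin_gvproxy` and `attach_delve` and the `DRY_RUN` switch are illustrative and do not come from podman or gvisor-tap-vsock. Swapping the extracted binary in for the one actually used by the machine is not covered, and the `trace` command still has to be typed at the `(dlv)` prompt.

```shell
#!/bin/sh
# Sketch only: helper names and the DRY_RUN switch are hypothetical.

# run CMD [ARGS...]: print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# thin_gvproxy SRC DST: extract the arm64 slice of a universal binary,
# using the same lipo invocation as in the steps above.
thin_gvproxy() {
    run lipo -extract arm64 -output "$2" "$1"
}

# attach_delve PID: attach delve to an already running process.
attach_delve() {
    run dlv attach "$1"
}
```

With `DRY_RUN=1` set, `thin_gvproxy ./gvproxy-darwin gvproxy-darwin-arm64` only prints the `lipo` command, which is a cheap way to check the arguments before running it. On the affected machine, `attach_delve "$(pgrep gvproxy)"` then drops into the `(dlv)` prompt.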
Regarding https://github.com/containers/podman/issues/20639, one suggestion from @n1hility was to try to use `vm`/`gvforwarder` in the VM, which sends the network traffic over vsock rather than directly over virtio-net, to see if the bug can still be reproduced.
Sometimes, after a while, podman machine networking or crc networking stops working. There is no clear reproducer, but it has been hit by people working on podman-desktop, by some crc users, ... The latest such issue is https://github.com/containers/podman/issues/20639. The common symptom is that ssh access to the VM does not work.
`modprobe -r virtio-net && modprobe virtio-net` gets the network back up in #20639.

Currently working with Florent, who filed #20639 and can reproduce it several times per week, to get some traces through `dlv` to see if this gives a hint as to what's going on. This could be a gvproxy bug as much as a kernel or qemu bug.

Regarding the other similar bugs which have been filed/mentioned in the past, they may have the same root cause, or not. They happened on Windows + hyperv, on macos + vfkit, and I think even on linux + libvirt/qemu.
#20639 was macos + qemu. This means this happens both with `gvproxy`, and with `crc daemon` + `vm` process running in the VM.

There were hints of a crc daemon crash/restart in the linux + qemu case, but not in #20639, which is why I'm thinking there could be different issues.