I run into this issue pretty often with some pods that use a macvlan network:
Jun 21 04:28:07 test netavark[4665]: [ERROR mozim::msg] InvalidDhcpServerReply: Failed to parse DHCP message from payload of pkg []: parser ran out of data-- not enough bytes
Jun 21 04:28:07 test kernel: virtio_net virtio1 eth0: left promiscuous mode
Jun 21 04:28:07 test iot-sample[6614]: time="2024-06-21T04:28:07Z" level=error msg="Starting some container dependencies"
Jun 21 04:28:07 test iot-sample[6614]: time="2024-06-21T04:28:07Z" level=error msg="\"netavark: unable to obtain lease: dhcp proxy error: status: Internal, message: \\\"InvalidDhcpServerReply: Failed to parse DHCP message from payload of pkg []: parser ran out of data-- not enough bytes\\\", details: [], metadata: MetadataMap { headers: {\\\"content-type\\\": \\\"application/grpc\\\", \\\"date\\\": \\\"Fri, 21 Jun 2024 04:28:07 GMT\\\", \\\"content-length\\\": \\\"0\\\"} }\""
Jun 21 04:28:07 test iot-sample[6614]: Error: starting some containers: internal libpod error
Jun 21 04:28:07 test systemd[1]: iot-sample.service: Main process exited, code=exited, status=125/n/a
Jun 21 04:28:07 test systemd[1]: iot-sample.service: Killing process 6652 (conmon) with signal SIGKILL.
Even with the latest netavark version:
$ podman version
Client: Podman Engine
Version: 5.1.1
API Version: 5.1.1
Go Version: go1.22.3
Built: Tue Jun 4 00:00:00 2024
OS/Arch: linux/amd64
$ podman info
...
networkBackend: netavark
networkBackendInfo:
  backend: netavark
  dns:
    package: aardvark-dns-1.11.0-1.fc40.x86_64
    path: /usr/libexec/podman/aardvark-dns
    version: aardvark-dns 1.11.0
  package: netavark-1.11.0-1.fc40.x86_64
  path: /usr/libexec/podman/netavark
  version: netavark 1.11.0
...
The DHCP server is dnsmasq in all cases, as far as I know.
Are you able to set up packet captures when that happens? It would be good to know whether we actually get the DHCP packets in time. I would think so, given that it flakes in our CI and we only test locally, so the chances of packet loss are very low. I guess it is some bug in the DHCP client parser, but unfortunately I don't really have time to debug that at the moment.
I have a minimal PoC for this.
Steps to reproduce:
sudo dnf install podman
sudo systemctl enable --now netavark-dhcp-proxy.socket
# /etc/containers/systemd/macvlan.network
[Network]
NetworkName=macvlan
IPAMDriver=dhcp
Driver=macvlan
Options=parent=eth0
# /etc/containers/systemd/sample.container
[Unit]
Description=A sample container
[Container]
ContainerName=sample
Image=docker.io/library/nginx:latest
Network=macvlan.network
PublishPort=8080:80
[Install]
WantedBy=multi-user.target default.target
sudo systemctl daemon-reload
sudo systemctl start sample
sudo systemctl restart sample
The service restart should fail, and the logs show the below error:
Jun 30 08:14:25 test sample[2363]: Error: netavark: unable to obtain lease: dhcp proxy error: status: Internal, message: "InvalidDhcpServerReply: Failed to parse DHCP message from payload of pkg []: parser ran out of data-- not enough bytes", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Sun, 30 Jun 2024 08:14:25 GMT", "content-length": "0"} }
The issue can then be reproduced easily with quick subsequent service restarts using sudo systemctl restart sample; sleep 2; sudo systemctl restart sample
(you may also need to wait a little after the last service start before trying this)
Some further observations on this after spending half a day on it:
In the attached packet capture, the green section is a successful DHCP DORA (Discover, Offer, Request, Acknowledge) exchange after a sudo systemctl restart sample. Immediately after that, a second sudo systemctl restart sample generates a new DHCP Discover packet (highlighted in red) and the error happens immediately, even though the relevant DHCP Offer packet (highlighted in grey) arrives a few seconds later, after the same delay as in the successful case.
The error message can be traced to the mozim DHCP client library here:
https://github.com/nispor/mozim/blob/v0.2.3/src/msg.rs#L212
Given the empty payload in the error message, this could originate from this recv buffer https://github.com/nispor/mozim/blob/v0.2.3/src/socket.rs#L128 being passed empty to the relevant DHCP decoder above.
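To illustrate the hypothesis, a minimal sketch (hypothetical code, not mozim's actual implementation) of how a payload carved out of a raw frame at fixed IPv4/UDP offsets ends up empty when the frame belongs to a different protocol:

```rust
const ETH_HDR_LEN: usize = 14; // dst MAC + src MAC + EtherType
const IPV4_HDR_MIN_LEN: usize = 20; // IPv4 header without options
const UDP_HDR_LEN: usize = 8;

// The DHCP payload is carved out of the raw Ethernet frame at fixed
// IPv4/UDP offsets; a frame of another protocol, or one shorter than
// the headers, yields an empty slice that the DHCP parser then rejects
// with "not enough bytes".
fn dhcp_payload(frame: &[u8]) -> &[u8] {
    frame
        .get(ETH_HDR_LEN + IPV4_HDR_MIN_LEN + UDP_HDR_LEN..)
        .unwrap_or(&[])
}

fn main() {
    let foreign_frame = [0x33u8, 0x33, 0, 0, 0, 0x16]; // IPv6 multicast MAC, too short
    assert!(dhcp_payload(&foreign_frame).is_empty());
}
```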
After enabling debug logs for the netavark-dhcp-proxy service, the error can be traced in mozim:
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Adding timer 8 seconds with event Timeout to event pool
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::time] TimerFd created TimerFd { fd: OwnedFd { fd: 13 } } with 8 seconds
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Adding fd 13 to Epoll 12, event Timeout
Jul 01 08:31:59 test audit: ANOM_PROMISCUOUS dev=eth0 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
Jul 01 08:31:59 test kernel: virtio_net virtio1 eth0: entered promiscuous mode
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x28, 0, 0, 0x0000000c
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x15, 0, 8, 0x00000800
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x30, 0, 0, 0x00000017
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x15, 0, 6, 0x00000011
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x28, 0, 0, 0x00000014
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x45, 4, 0, 0x00001fff
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0xb1, 0, 0, 0x0000000e
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x48, 0, 0, 0x00000010
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x15, 0, 1, 0x00000044
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x06, 0, 0, 0xffffffff
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::bpf] Registering BPF filter 0x06, 0, 0, 0x00000000
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::socket] Raw socket created 14
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Adding socket 14 with event RawPackageIn to event pool
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Adding fd 14 to Epoll 12, event RawPackageIn
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Adding timer 3 seconds with event DiscoveryTimeout to event pool
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::time] TimerFd created TimerFd { fd: OwnedFd { fd: 15 } } with 3 seconds
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Adding fd 15 to Epoll 12, event DiscoveryTimeout
Jul 01 08:31:59 test audit: ANOM_PROMISCUOUS dev=eth0 prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::msg] DHCP message Message { opcode: BootRequest, htype: Eth, hlen: 6, hops: 0, xid: 1836246904, secs: 0, flags: Flags { broadcast: false }, ciaddr: 0.>
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::socket] Sending raw ethernet package: [255, 255, 255, 255, 255, 255, 62, 98, 49, 211, 49, 89, 8, 0, 69, 0, 1, 25, 0, 0, 64, 0, 128, 17, 249, 212, 0, 0>
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::socket] Raw socket sent: 295 bytes
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::client] Processing event RawPackageIn
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::socket] Raw socket receiving
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::socket] Raw socket received [51, 51, 0, 0, 0, 22, 62, 98, 49, 211, 49, 89, 134, 221, 96, 0, 0, 0, 0, 36, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, >
Jul 01 08:31:59 test netavark[2968]: [ERROR mozim::msg] InvalidDhcpServerReply: Failed to parse DHCP message from payload of pkg []: parser ran out of data-- not enough bytes
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Removing fd 15 from Epoll 12
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Removing fd 13 from Epoll 12
Jul 01 08:31:59 test netavark[2968]: [DEBUG mozim::event] Removing fd 14 from Epoll 12
I suppose that comes from here: https://github.com/nispor/mozim/blob/v0.2.3/src/msg.rs#L209 https://github.com/nispor/mozim/blob/v0.2.3/src/msg.rs#L298 https://github.com/nispor/mozim/blob/v0.2.3/src/client.rs#L514
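For reference, my reading of the classic BPF program from the "Registering BPF filter" lines in the log above (each line prints code, jt, jf, k; the offset comments are my interpretation, not mozim documentation). It accepts only unfragmented IPv4/UDP traffic to destination port 68 (bootpc), so my guess is that the stray IPv6 frame was already queued on the socket in the window between socket creation and filter attachment, a known raw-socket pitfall:

```rust
use libc::sock_filter;

// Decoded from the debug log above; offsets assume a plain Ethernet frame.
fn dhcp_client_filter() -> [sock_filter; 11] {
    [
        sock_filter { code: 0x28, jt: 0, jf: 0, k: 0x0000000c }, // ldh [12]: load EtherType
        sock_filter { code: 0x15, jt: 0, jf: 8, k: 0x00000800 }, // not IPv4 -> jump to drop
        sock_filter { code: 0x30, jt: 0, jf: 0, k: 0x00000017 }, // ldb [23]: load IP protocol
        sock_filter { code: 0x15, jt: 0, jf: 6, k: 0x00000011 }, // not UDP (17) -> drop
        sock_filter { code: 0x28, jt: 0, jf: 0, k: 0x00000014 }, // ldh [20]: fragment field
        sock_filter { code: 0x45, jt: 4, jf: 0, k: 0x00001fff }, // fragment -> drop
        sock_filter { code: 0xb1, jt: 0, jf: 0, k: 0x0000000e }, // ldxb 4*([14]&0xf): IP header len
        sock_filter { code: 0x48, jt: 0, jf: 0, k: 0x00000010 }, // ldh [x+16]: UDP dest port
        sock_filter { code: 0x15, jt: 0, jf: 1, k: 0x00000044 }, // not port 68 (bootpc) -> drop
        sock_filter { code: 0x06, jt: 0, jf: 0, k: 0xffffffff }, // ret 0xffffffff: accept
        sock_filter { code: 0x06, jt: 0, jf: 0, k: 0x00000000 }, // ret 0: drop
    ]
}
```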
Thanks for the details. I don't have much time to run this down currently. Are you interested in working on a fix?
I'm still tracing it; I'll try to make a fix if I find the problem.
After analyzing the payloads in the full mozim debug logs:
Jul 01 10:27:08 test netavark[3577]: [DEBUG mozim::msg] DHCP message Message { opcode: BootRequest, htype: Eth, hlen: 6, hops: 0, xid: 492450829, secs: 0, flags: Flags { broadcast: false }, ciaddr: 0.0.0.0, yiaddr: 0.0.0.0, siaddr: 0.0.0.0, giaddr: 0.0.0.0, chaddr: [118, 255, 172, 65, 53, 143, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], sname: None, fname: None, magic: [99, 130, 83, 99], opts: DhcpOptions({ClientIdentifier: ClientIdentifier([]), MessageType: MessageType(Discover), ParameterRequestList: ParameterRequestList([Hostname, SubnetMask, Router, DomainNameServer, DomainName, InterfaceMtu, NTPServers])}) }
Jul 01 10:27:08 test netavark[3577]: [DEBUG mozim::socket] Sending raw ethernet package: [255, 255, 255, 255, 255, 255, 118, 255, 172, 65, 53, 143, 8, 0, 69, 0, 1, 25, 0, 0, 64, 0, 128, 17, 249, 212, 0, 0, 0, 0, 255, 255, 255, 255, 0, 68, 0, 67, 1, 5, 55, 190, 1, 1, 6, 0, 29, 90, 52, 13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 118, 255, 172, 65, 53, 143, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 99, 130, 83, 99, 53, 1, 1, 55, 7, 12, 1, 3, 6, 15, 26, 42, 255]
Jul 01 10:27:08 test netavark[3577]: [DEBUG mozim::socket] Raw socket sent: 295 bytes
Jul 01 10:27:08 test netavark[3577]: [DEBUG mozim::client] Processing event RawPackageIn
Jul 01 10:27:08 test netavark[3577]: [DEBUG mozim::socket] Raw socket receiving
Jul 01 10:27:08 test netavark[3577]: [DEBUG mozim::socket] Raw socket received [51, 51, 0, 0, 0, 22, 118, 255, 172, 65, 53, 143, 134, 221, 96, 0, 0, 0, 0, 36, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 22, 58, 0, 5, 2, 0, 0, 1, 0, 143, 0, 57, 186, 0, 0, 0, 1, 4, 0, 0, 0, 255, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 255, 65, 53, 143]
Jul 01 10:27:08 test netavark[3577]: [ERROR mozim::msg] InvalidDhcpServerReply: Failed to parse DHCP message from payload of pkg []: parser ran out of data-- not enough bytes
I converted the packet that the raw socket received (starting with [51, 51, ...]) to a hex dump, compared it with nearby packets in the sequence, and it appears to be an IPv6 Multicast Listener Report Message v2!
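A minimal check of the leading bytes from the log confirms this (values copied from the capture above):

```rust
fn main() {
    // First 14 bytes of the received frame from the log above.
    let frame: [u8; 14] = [51, 51, 0, 0, 0, 22, 118, 255, 172, 65, 53, 143, 134, 221];
    // [51, 51, 0, 0, 0, 22] is the multicast MAC 33:33:00:00:00:16,
    // which maps to ff02::16 (all MLDv2-capable routers).
    println!("dst mac: {:02x?}", &frame[0..6]);
    // Bytes 12..14 are [134, 221] = 0x86DD, the IPv6 EtherType,
    // not 0x0800 (IPv4) as a DHCPv4 reply would have.
    let ethertype = u16::from_be_bytes([frame[12], frame[13]]);
    assert_eq!(ethertype, 0x86DD); // IPv6, so this is not a DHCP reply
}
```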
In the following capture, the crash happens after netavark/mozim sends the packet highlighted in red, because it expects the packet highlighted in green, but recv's first packet is the one highlighted in yellow instead.
I suppose the fix should be in mozim here: https://github.com/nispor/mozim/blob/v0.2.3/src/client.rs#L512. Instead of blindly processing the first packet it receives, it should inspect the packet, check whether it is a valid DHCP packet, and skip it if it isn't.
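A minimal self-contained sketch of that idea (the helper looks_like_dhcpv4 and the example frames are illustrative, not mozim's actual API):

```rust
// A DHCPv4 reply arrives as IPv4 (EtherType 0x0800) UDP to port 68;
// anything else (like the IPv6 MLDv2 report above) is not for us.
fn looks_like_dhcpv4(frame: &[u8]) -> bool {
    frame.len() > 13 && u16::from_be_bytes([frame[12], frame[13]]) == 0x0800
}

fn main() {
    // Frames in the order recv() returned them in the failing capture:
    let mldv2 = [0x33u8, 0x33, 0, 0, 0, 0x16, 0, 0, 0, 0, 0, 0, 0x86, 0xdd];
    let mut offer = [0u8; 300]; // stand-in for the real DHCP Offer frame
    offer[12] = 0x08; // EtherType 0x0800 = IPv4, as a DHCPv4 reply would have

    for frame in [&mldv2[..], &offer[..]] {
        if !looks_like_dhcpv4(frame) {
            // The old behavior handed this frame straight to the DHCP
            // parser and errored out; the fix skips it and keeps waiting.
            println!("skipping non-DHCP frame");
            continue;
        }
        println!("handing frame to the DHCP parser");
    }
}
```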
That sounds logical
Created a PR in mozim that should fix the above: https://github.com/nispor/mozim/pull/33. As soon as it passes, I'll ask for a release and bump the dependency here too to fix this.
mozim 0.2.4 published. cargo update should do the magic to pull it in.
@cathay4t Thank you!
I'm getting this behavior in a production situation - the way it manifests is that containers just won't start, with that same error message. I have macvlan DHCP networks to a couple of real VLANs (also dnsmasq).
I also noticed that the dhcp-proxy process starts using an excessive amount of CPU after some period of time, and when it's in that state it consistently fails all requests - if I kill the proxy, then most (not all) containers will start again.
I can consistently reproduce at least a couple of failures when I nuke and repave all of my containers on that box. I'll build and test in the next few days to see if this fixes it.
thanks @agorgl!
I'm running the rust-nightly release flavor now. I'm definitely suffering from #811, as there were 13708 threads created to manage ~15 container leases over 3 days.
I was able to restart all of my containers without any crashes after this patch (for the first time in netavark), so thanks again @agorgl!
I also have about a tenth as many threads (~1300) after twice the time.
new flake:
https://cirrus-ci.com/task/5066529801240576