nispor / mozim

DHCP Client Daemon
Apache License 2.0

Error when receiving IPv6 multicast packets in raw socket recv #32

Closed agorgl closed 4 months ago

agorgl commented 4 months ago

After a DHCP (v4) discover packet is sent, the raw socket recv here: https://github.com/nispor/mozim/blob/v0.2.3/src/client.rs#L512 expects a DHCP offer packet. When it instead receives an IPv6 multicast packet, the following error occurs:

InvalidDhcpServerReply: Failed to parse DHCP message from payload of pkg []: parser ran out of data-- not enough bytes

This race condition is pretty frequent depending on the existence of such packets. A more thorough analysis can be found here: https://github.com/containers/netavark/issues/618

cathay4t commented 4 months ago

Investigation required to understand why this packet passed the BPF filter.

0x33 0x33 0x00 0x00 0x00 0x16 0x76 0xff
0xac 0x41 0x35 0x8f 0x86 0xdd 0x60 0x00
0x00 0x00 0x00 0x24 0x00 0x01 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x00 0xff 0x02
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x16 0x3a 0x00
0x05 0x02 0x00 0x00 0x01 0x00 0x8f 0x00
0x39 0xba 0x00 0x00 0x00 0x01 0x04 0x00
0x00 0x00 0xff 0x02 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x00 0x01 0xff 0x41
0x35 0x8f

cathay4t commented 4 months ago

@agorgl Could you share the tcpdump/wireshark captured data?

cathay4t commented 4 months ago

The above packet has EtherType 0x86dd (IPv6), so it should have been dropped by the BPF filter (BPF_JMP | BPF_JEQ | BPF_K, 0, 8, ETHERTYPE_IP).

Pending more investigation.
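
To make the EtherType observation concrete, here is a minimal sketch in Rust that reads the EtherType out of the first 14 bytes of the frame quoted above. The names `ethertype`, `passes_ipv4_check` and `CAPTURED_HEADER` are hypothetical and are not part of mozim. The sketch assumes an untagged Ethernet II frame (no 802.1Q VLAN tag).

```rust
/// Standard EtherType values for IPv4 and IPv6.
const ETHERTYPE_IPV4: u16 = 0x0800;
const ETHERTYPE_IPV6: u16 = 0x86DD;

/// Returns the EtherType of an untagged Ethernet II frame, or `None` if the
/// buffer is shorter than the 14-byte Ethernet header.
fn ethertype(frame: &[u8]) -> Option<u16> {
    let bytes = frame.get(12..14)?;
    Some(u16::from_be_bytes([bytes[0], bytes[1]]))
}

/// User-space equivalent of the "EtherType must be IPv4" comparison that the
/// BPF instruction quoted above is expected to perform (hypothetical helper).
fn passes_ipv4_check(frame: &[u8]) -> bool {
    ethertype(frame) == Some(ETHERTYPE_IPV4)
}

/// Ethernet header (first 14 bytes) of the frame from the hex dump above.
const CAPTURED_HEADER: [u8; 14] = [
    0x33, 0x33, 0x00, 0x00, 0x00, 0x16, // destination MAC (IPv6 multicast range 33:33:xx:xx:xx:xx)
    0x76, 0xff, 0xac, 0x41, 0x35, 0x8f, // source MAC
    0x86, 0xdd, // EtherType: IPv6
];
```

Calling `ethertype(&CAPTURED_HEADER)` yields `Some(0x86DD)`, so `passes_ipv4_check` returns `false` for this frame. A filter comparing the EtherType against ETHERTYPE_IP would therefore be expected to reject it, provided the filter is already attached when the frame arrives.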

agorgl commented 4 months ago

The problem can be reproduced consistently in some boots, but it seems to be impossible to reproduce in others, even with lots of retries. I'm thinking that the problem isn't the packet payload per se, but maybe in some mozim initializations the BPF filter is not applied at all?

cathay4t commented 4 months ago

Aha, the packet was queued by the kernel before we applied the BPF filter. Let me prepare another patch.
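
One way to tolerate frames that were queued on the socket before the filter took effect is to re-check each received frame in user space and skip anything that is not a plausible DHCP reply. The following Rust sketch illustrates that idea only. It is an assumption about a possible approach, not the code in mozim or in the patch mentioned here, and the names `looks_like_dhcp_reply` and `recv_dhcp_frame` are hypothetical. It assumes untagged Ethernet II frames.

```rust
const ETHERTYPE_IPV4: u16 = 0x0800;
const IPPROTO_UDP: u8 = 17;
const DHCP_CLIENT_PORT: u16 = 68;

/// Reads a big-endian u16 at `off`, or `None` if the buffer is too short.
fn be16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off + 2)?;
    Some(u16::from_be_bytes([b[0], b[1]]))
}

/// True if the frame is Ethernet II / IPv4 / unfragmented UDP addressed to
/// the DHCP client port. Truncated frames are treated as non-matching.
fn looks_like_dhcp_reply(frame: &[u8]) -> bool {
    let check = || -> Option<bool> {
        if be16(frame, 12)? != ETHERTYPE_IPV4 {
            return Some(false);
        }
        let ip = frame.get(14..)?;
        // Lower nibble of the first IPv4 byte is the header length in 32-bit words.
        let ihl = usize::from(*ip.first()? & 0x0f) * 4;
        if ihl < 20 || *ip.get(9)? != IPPROTO_UDP {
            return Some(false);
        }
        // Reject fragments: "more fragments" flag set or non-zero fragment offset.
        if be16(ip, 6)? & 0x3fff != 0 {
            return Some(false);
        }
        // UDP destination port sits 2 bytes into the UDP header.
        Some(be16(ip, ihl + 2)? == DHCP_CLIENT_PORT)
    };
    check().unwrap_or(false)
}

/// Pulls frames from `recv` until one passes the check, skipping stale or
/// unrelated frames. Gives up after `max_frames` frames or when `recv`
/// returns `None`.
fn recv_dhcp_frame<F>(mut recv: F, max_frames: usize) -> Option<Vec<u8>>
where
    F: FnMut() -> Option<Vec<u8>>,
{
    for _ in 0..max_frames {
        let frame = recv()?;
        if looks_like_dhcp_reply(&frame) {
            return Some(frame);
        }
        // Frame did not match (for example an IPv6 multicast frame): ignore it.
    }
    None
}
```

Here `recv` stands in for whatever reads one frame from the raw socket. With a queue holding an IPv6 frame followed by a DHCP offer, `recv_dhcp_frame` skips the first and returns the second instead of failing on the stale frame.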

cathay4t commented 4 months ago

@agorgl Can you try again with https://github.com/nispor/mozim/pull/35 ?

agorgl commented 4 months ago

I'll try to repeat it multiple times and check whether "Ignoring invalid DHCP package" shows up. I'll report back soon.

agorgl commented 4 months ago

I tried this:

for i in $(seq 1 10); do
    echo "$(date): ROUND $i"
    sudo systemctl restart netavark-dhcp-proxy.{socket,service}
    sleep 1
    for j in $(seq 1 10); do
        sleep 3
        echo "$(date): retry $j"
        sudo systemctl reset-failed sample
        # Stop retrying this round as soon as a restart fails.
        sudo systemctl restart sample || break
    done
done

across 3 VM reboots, and no "Ignoring invalid DHCP package" message showed up, so I believe with #35 we are good to go.

cathay4t commented 4 months ago

mozim 0.2.4 has been published. Running cargo update should pull it in.