Closed by acheronfail 1 year ago
Just used strace and I can see that messages are coming through; it's just that multicast.next() never returns.
recvfrom(9, [
{
nlmsg_len=80,
nlmsg_type=RTM_NEWADDR,
nlmsg_flags=0,
nlmsg_seq=1688203532,
nlmsg_pid=8187
},
{
ifa_family=AF_INET,
ifa_prefixlen=32,
ifa_flags=IFA_F_PERMANENT,
ifa_scope=RT_SCOPE_UNIVERSE,
ifa_index=if_nametoindex("wlan0")
},
[
[
{
nla_len=8,
nla_type=IFA_ADDRESS
},
inet_addr("10.0.0.254")
],
[
{
nla_len=8,
nla_type=IFA_LOCAL
},
inet_addr("10.0.0.254")
],
[
{
nla_len=10,
nla_type=IFA_LABEL
},
"wlan0"
],
[
{
nla_len=8,
nla_type=IFA_FLAGS
},
IFA_F_PERMANENT
],
[
{
nla_len=20,
nla_type=IFA_CACHEINFO
},
{
ifa_prefered=4294967295,
ifa_valid=4294967295,
cstamp=38685,
tstamp=38685
}
]
]
], 32768, 0, NULL, NULL) = 80
Is this because I need a specific type before neli will parse it and return it? If so, is there a way to always return any message? (Or alternatively, how do I find out the correct neli type?)
I'll update this with any further findings...
Debugging Tips
RUST_LOG=trace for more debug information about what neli is doing.
RUSTFLAGS=-g so debug symbols are included and I can debug/step through neli with a debugger.
Confusing things...
nl_pid = 0 so the message is always sent - I wonder why... is this just a generic netlink thing?
Findings
My current findings are that the senders collection here is always empty and the message's pid is non-zero, so although neli receives and parses the message, it doesn't send it back to the multicast receiver in my example.
This is a combination of nl_pid != 0 and also nl_seq = <some very high number, like 1688206355>.
So, for some reason, messages received here don't have nl_pid = 0. This means that neli doesn't forward those events to the multicast_receiver, because it seems to only forward events with nl_pid = 0 to the multicast_receiver. Any other message received on the socket is simply ignored and dropped.
Is this ignoring of events intended behaviour? :question:
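To make the behaviour described above concrete, here is a minimal sketch of that kind of header-based routing. It is a model of what I observed, not neli's actual code: the `Route` enum, the `route_by_header` function, and the idea that `senders` is keyed by sequence number are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Where a received message ends up. Hypothetical names, not neli's real types.
#[derive(Debug, PartialEq)]
enum Route {
    /// Forwarded to the multicast receiver.
    Multicast,
    /// Forwarded to the receiver (identified by an id) waiting on a request.
    Unicast(u32),
    /// No matching receiver: the message is silently discarded.
    Dropped,
}

/// Assumed routing rule: nl_pid == 0 means "multicast"; otherwise look for a
/// receiver that is waiting on this sequence number (assumption: `senders`
/// maps nl_seq to a receiver id).
fn route_by_header(nl_pid: u32, nl_seq: u32, senders: &HashMap<u32, u32>) -> Route {
    if nl_pid == 0 {
        Route::Multicast
    } else if let Some(&receiver) = senders.get(&nl_seq) {
        Route::Unicast(receiver)
    } else {
        Route::Dropped
    }
}
```

With an empty `senders` map, the header from the strace output above (nlmsg_pid=8187, nlmsg_seq=1688203532) falls through to `Route::Dropped`, which matches multicast.next() never returning.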
I can confirm that... No wait, I was confused! I thought nl-monitor ipv4-ifaddr also receives events on its equivalent of a multicast receiver with nl_pid != 0 and nl_seq = <random high number>, but strace -ff nl-monitor ipv4-ifaddr shows that these messages are received with nl_pid == 0 - so something mustn't be set right with neli...
I've changed the title - I think this should be updated to a feature perhaps? (No permissions to change label...)
EDIT: nl-monitor calls something called nl_socket_disable_seq_check - this disables the seq check and sends all messages via the multicast receiver. See their doc comments for more: https://github.com/thom311/libnl/blob/cbafad9ddf24caef5230fef715d34f0539603be0/lib/socket.c#L266-L277
I created https://github.com/jbaublitz/neli/pull/219 in an attempt to fix this.
Alright, the whole thing is working (provided my fork that's in https://github.com/jbaublitz/neli/pull/219 is used).
Here's some sample code:
// setup socket for netlink route
let (socket, mut multicast) =
    NlRouter::connect(NlFamily::Route, None, Groups::empty()).unwrap();

// add multicast membership for ipv4-addr updates
socket
    .add_mcast_membership(Groups::new_groups(&[RTNLGRP_IPV4_IFADDR]))
    .unwrap();

// listen for multicast events
// NOTE: currently requires the changes here: https://github.com/jbaublitz/neli/pull/219
type Next = Option<Result<Nlmsghdr<u16, Ifaddrmsg>, RouterError<u16, Ifaddrmsg>>>;
match multicast.next_typed::<u16, Ifaddrmsg>() as Next {
    None => todo!(),
    // we got a multicast message
    Some(response) => {
        // if there are errors on the multicast channel, they'll be here in this result
        let response = response.unwrap();
        // get message payload
        let ifaddr_msg = response.get_payload().unwrap();
        // get a handle to the message's rt attributes
        let rt_attrs_handle = ifaddr_msg.rtattrs().get_attr_handle();
        // get the address attribute
        let addr_attr = rt_attrs_handle.get_attribute(Ifa::Address).unwrap();

        // convert the raw bytes from the attribute into an `Ipv4Addr` struct
        let bytes: &[u8] = addr_attr.rta_payload().as_ref();
        let bytes: &[u8; 4] = bytes.try_into().unwrap();
        let ipv4 = Ipv4Addr::from(*bytes);

        // 🎉 we did it!
        dbg!(ipv4);
    }
}
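The last step of the sample (raw attribute bytes to an address) can be pulled out into a small standard-library-only helper that returns `None` instead of panicking when the payload has the wrong length. The name `ipv4_from_payload` is made up for this sketch; it is not part of neli.

```rust
use std::convert::TryFrom;
use std::net::Ipv4Addr;

/// Convert a raw IFA_ADDRESS payload into an `Ipv4Addr`.
/// Returns `None` if the payload is not exactly 4 bytes long
/// (for example, an IPv6 address would be 16 bytes).
fn ipv4_from_payload(payload: &[u8]) -> Option<Ipv4Addr> {
    // A slice converts to a fixed-size array only when the lengths match.
    let octets = <[u8; 4]>::try_from(payload).ok()?;
    Some(Ipv4Addr::from(octets))
}
```

In the sample above, this would take the slice obtained from addr_attr.rta_payload().as_ref() in place of the two try_into/unwrap lines.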
I'm leaving this issue open as the tracking issue for ignored multicast events.
Wait - I'm so sorry for all the spam :sweat_smile: - after looking at this again I seem to have completely glossed over the fact that nl-monitor receives events with nl_pid == 0 but when I try with neli I get events with nl_pid > 0!
So, the PR I created is probably bogus - these should be multicast events... but why don't the events come through with nl_pid == 0 when I subscribe with neli??? This has got me so confused...
Again, this issue is now more or less a diary of my experience learning about netlink :sweat_smile:.
I think I was originally right, actually. The thing that confused me was reading strace's output of the recvmsg calls:
recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000010}, msg_namelen=12, msg_iov=[{iov_base=[{nlmsg_len=80, nlmsg_type=RTM_DELADDR, nlmsg_flags=0, nlmsg_seq=1688293504, nlmsg_pid=19010}, {ifa_family=AF_INET, ifa_prefixlen=32, ifa_flags=IFA_F_PERMANENT, ifa_scope=RT_SCOPE_UNIVERSE, ifa_index=if_nametoindex("wlan0")}, [[{nla_len=8, nla_type=IFA_ADDRESS}, inet_addr("10.0.0.254")], [{nla_len=8, nla_type=IFA_LOCAL}, inet_addr("10.0.0.254")], [{nla_len=10, nla_type=IFA_LABEL}, "wlan0"], [{nla_len=8, nla_type=IFA_FLAGS}, IFA_F_PERMANENT], [{nla_len=20, nla_type=IFA_CACHEINFO}, {ifa_prefered=4294967295, ifa_valid=4294967295, cstamp=142503, tstamp=142503}]]], iov_len=16384}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 80
I was confusing the first part of the recvmsg call...
{msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=0x000010}
...thinking that contained the actual netlink message header - but it doesn't! The actual header is this part:
{nlmsg_len=80, nlmsg_type=RTM_DELADDR, nlmsg_flags=0, nlmsg_seq=1688293504, nlmsg_pid=19010}
So, I believe my previous comments about multicast messages with nl_pid > 0 are correct.
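For anyone else tripping over the same thing: the netlink message header is the first 16 bytes of the receive buffer (the iov_base part), and it is entirely separate from the socket address in msg_name. A minimal sketch of decoding it by hand, assuming the standard struct nlmsghdr layout in host byte order; the `NlMsgHdr` struct and `parse_nlmsghdr` function are names invented for this example, not neli types.

```rust
use std::convert::TryInto;

/// Hand-rolled mirror of the kernel's `struct nlmsghdr` (16 bytes).
#[derive(Debug, PartialEq)]
struct NlMsgHdr {
    len: u32,   // nlmsg_len: total message length, header included
    ty: u16,    // nlmsg_type: e.g. RTM_NEWADDR = 20, RTM_DELADDR = 21
    flags: u16, // nlmsg_flags
    seq: u32,   // nlmsg_seq
    pid: u32,   // nlmsg_pid: port id stored in the message itself
}

/// Decode the header from the start of a receive buffer.
/// Returns `None` if fewer than 16 bytes are available.
fn parse_nlmsghdr(buf: &[u8]) -> Option<NlMsgHdr> {
    if buf.len() < 16 {
        return None;
    }
    // Netlink headers use the host's native byte order.
    Some(NlMsgHdr {
        len: u32::from_ne_bytes(buf[0..4].try_into().ok()?),
        ty: u16::from_ne_bytes(buf[4..6].try_into().ok()?),
        flags: u16::from_ne_bytes(buf[6..8].try_into().ok()?),
        seq: u32::from_ne_bytes(buf[8..12].try_into().ok()?),
        pid: u32::from_ne_bytes(buf[12..16].try_into().ok()?),
    })
}
```

Fed the bytes of the header shown above, this yields pid 19010 even though the msg_name part of the same recvmsg call reports nl_pid=0.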
@acheronfail Can you test #209 and let me know if that resolves the issue? Someone else suggested that I use recvfrom instead of recv to determine whether a message is coming from a netlink multicast group or not. Based on my initial testing, it seems to resolve the problem with the heuristics. Can you please confirm that it resolves your issue too?
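As I understand the recvfrom approach, the sender address that comes back with each message carries an nl_groups bitmask, which is non-zero for multicast deliveries, so no guessing from nl_pid or nl_seq is needed. The sketch below models that check with plain integers; `SenderAddr`, `group_mask`, `is_multicast` and `is_from_kernel` are hypothetical names for illustration, and this is not the code in #209.

```rust
/// The two fields of the kernel's `struct sockaddr_nl` that matter here.
/// Hypothetical stand-in for the address returned by recvfrom.
struct SenderAddr {
    nl_pid: u32,    // port id of the sender (0 = kernel)
    nl_groups: u32, // bitmask of the multicast group the message was sent to
}

/// RTNLGRP_IPV4_IFADDR from linux/rtnetlink.h.
const RTNLGRP_IPV4_IFADDR: u32 = 5;

/// Group numbers start at 1; group n corresponds to bit (n - 1).
/// Returns `None` for group 0 or for groups that do not fit in 32 bits.
fn group_mask(group: u32) -> Option<u32> {
    group.checked_sub(1).and_then(|shift| 1u32.checked_shl(shift))
}

/// A message is a multicast delivery if the sender address names a group.
fn is_multicast(addr: &SenderAddr) -> bool {
    addr.nl_groups != 0
}

/// The kernel always sends from port id 0.
fn is_from_kernel(addr: &SenderAddr) -> bool {
    addr.nl_pid == 0
}
```

This lines up with the strace output earlier in the thread: msg_name shows nl_groups=0x000010, which is exactly the mask for RTNLGRP_IPV4_IFADDR (1 << 4).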
Ah yes! Thank you so much, I was going around in circles so many times :sweat_smile:
I can confirm that works for me!
I'm trying to make something that functions similar to nl-monitor ipv4-ifaddr with neli, but I'm struggling to get anywhere with it.
This is my current attempt:
As far as I can tell, this should be reporting events any time an ipv4 address on the machine changes, but I get no output at all with this setup. I've been looking at how nl-monitor works, and comparing that to neli, and they look very similar here, so I'm not sure what's different...
Any chance you might know where I should look next? :pray: