Closed rkuhn closed 2 years ago
Here’s one of those poll invocations with TRACE logging:
2022-03-24T15:40:47.368981Z TRACE netlink_proto::connection: polling Connection
2022-03-24T15:40:47.368988Z DEBUG netlink_proto::connection: reading incoming messages
2022-03-24T15:40:47.368995Z TRACE netlink_proto::connection: poll_read_messages called
2022-03-24T15:40:47.369002Z TRACE netlink_proto::connection: polling socket
2022-03-24T15:40:47.369008Z DEBUG netlink_proto::codecs: NetlinkCodec: decoding next message
2022-03-24T15:40:47.369015Z TRACE netlink_proto::codecs: buffer is empty
2022-03-24T15:40:47.369023Z TRACE netlink_proto::connection: no datagram read from socket
2022-03-24T15:40:47.369030Z DEBUG netlink_proto::connection: forwarding unsolicited messages to the connection handle
2022-03-24T15:40:47.369038Z TRACE netlink_proto::connection: forward_unsolicited_messages called
2022-03-24T15:40:47.369044Z TRACE netlink_proto::connection: forward_unsolicited_messages done
2022-03-24T15:40:47.369052Z DEBUG netlink_proto::connection: forwaring responses to previous requests to the connection handle
2022-03-24T15:40:47.369059Z TRACE netlink_proto::connection: forward_responses called
2022-03-24T15:40:47.369066Z TRACE netlink_proto::connection: forward_responses done
2022-03-24T15:40:47.369073Z DEBUG netlink_proto::connection: handling requests
2022-03-24T15:40:47.369079Z TRACE netlink_proto::connection: poll_requests called
2022-03-24T15:40:47.369086Z DEBUG netlink_proto::connection: sending messages
2022-03-24T15:40:47.369093Z TRACE netlink_proto::connection: poll_send_messages called
2022-03-24T15:40:47.369101Z TRACE netlink_proto::connection: poll_send_messages done
2022-03-24T15:40:47.369107Z TRACE netlink_proto::connection: poll_flush called
2022-03-24T15:40:47.369115Z TRACE netlink_proto::connection: done polling Connection
Further digging:
@mxinden? @dvc94ch?
I’ve narrowed it down to the libp2p update 0.41.1 → 0.42.2.
does look like something is wrong. I guess first thing would be reproducing it using just if-watch using the example and then probably file a proper bug report here once the exact cause has been identified. The if-watch code for linux is <100 loc.
if-watch example doesn’t get into the infinite poll loop state, but it does this:
[2022-03-25T11:26:10Z DEBUG if_watch::linux] returning event Down(10.0.0.1/32)
Got event Ok(Down(10.0.0.1/32))
[2022-03-25T11:26:10Z DEBUG if_watch::linux] polling IfWatcher 0x7ffcacf17730
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] polling Connection
[2022-03-25T11:26:10Z DEBUG netlink_proto::connection] reading incoming messages
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] poll_read_messages called
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] polling socket
[2022-03-25T11:26:10Z DEBUG netlink_proto::codecs] NetlinkCodec: decoding next message
[2022-03-25T11:26:10Z TRACE netlink_proto::codecs] buffer is empty
[2022-03-25T11:26:10Z TRACE netlink_sys::smol] poll_recv_from: Err(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" })
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] no datagram read from socket
[2022-03-25T11:26:10Z DEBUG netlink_proto::connection] forwarding unsolicited messages to the connection handle
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] forward_unsolicited_messages called
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] forward_unsolicited_messages done
[2022-03-25T11:26:10Z DEBUG netlink_proto::connection] forwaring responses to previous requests to the connection handle
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] forward_responses called
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] forward_responses done
[2022-03-25T11:26:10Z DEBUG netlink_proto::connection] handling requests
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] poll_requests called
[2022-03-25T11:26:10Z DEBUG netlink_proto::connection] sending messages
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] poll_send_messages called
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] poll_send_messages done
[2022-03-25T11:26:10Z TRACE netlink_proto::connection] poll_flush called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] done polling Connection
[2022-03-25T11:26:11Z DEBUG if_watch::linux] polling IfWatcher 0x7ffcacf17730
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] polling Connection
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] reading incoming messages
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_read_messages called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] polling socket
[2022-03-25T11:26:11Z DEBUG netlink_proto::codecs] NetlinkCodec: decoding next message
[2022-03-25T11:26:11Z TRACE netlink_proto::codecs] buffer is empty
[2022-03-25T11:26:11Z TRACE netlink_sys::smol] poll_recv_from: Err(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" })
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] no datagram read from socket
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] forwarding unsolicited messages to the connection handle
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_unsolicited_messages called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_unsolicited_messages done
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] forwaring responses to previous requests to the connection handle
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_responses called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_responses done
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] handling requests
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_requests called
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] sending messages
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_send_messages called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_send_messages done
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_flush called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] done polling Connection
[2022-03-25T11:26:11Z DEBUG if_watch::linux] polling IfWatcher 0x7ffcacf17730
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] polling Connection
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] reading incoming messages
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_read_messages called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] polling socket
[2022-03-25T11:26:11Z DEBUG netlink_proto::codecs] NetlinkCodec: decoding next message
[2022-03-25T11:26:11Z TRACE netlink_proto::codecs] buffer is empty
[2022-03-25T11:26:11Z TRACE netlink_sys::smol] poll_recv_from: Err(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" })
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] no datagram read from socket
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] forwarding unsolicited messages to the connection handle
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_unsolicited_messages called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_unsolicited_messages done
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] forwaring responses to previous requests to the connection handle
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_responses called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] forward_responses done
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] handling requests
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_requests called
[2022-03-25T11:26:11Z DEBUG netlink_proto::connection] sending messages
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_send_messages called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_send_messages done
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] poll_flush called
[2022-03-25T11:26:11Z TRACE netlink_proto::connection] done polling Connection
I can’t see yet why the IfWatcher is polled thrice, even though it clearly did return Pending.
Debugger shows that futures-executor says after first Pending that the thread was notified and polls again, then after the next Pending thread::park()
immediately returns, then after the third Pending thread::park()
actually parks the thread. It seems that something is invoking the Waker.
Further investigation shows that the Waker is not invoked outside the actual netlink responses, so this is very likely a bug in some other crate, closing here. (smoking gun: as I read it, a single wakeup in futures-executor will usually poll at least twice)
Since a recent libp2p update that introduced the rtnetlink dependency (via
if-watch
1.0.0, updated from 0.2.2), there has been very high rate DEBUG logging generated (at more than 10kHz frequency):Is this indicative of a problem? Would it be unreasonable to ask such frequent logging to be relegated to TRACE level?