Closed zwgraham closed 4 years ago
one of the cores is pegged at 100%
This sounds like it somehow got stuck inside softirq... maybe something results in an endless loop. :-/
https://www.spinics.net/lists/linux-wpan/msg05331.html
Please check whether this fixes your issue. Thanks and sorry.
@zwgraham it would be great if you could confirm the fix from Alex. I have applied it to the wpan tree in the meantime, in case you prefer fetching it from a git repo (https://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan.git/).
I'm closing this as it is solved for me.
I'm testing the current state of wpan-tools prior to my team beginning some work.
Today, I was testing the 4.19 kernel with the latest wpan-tools and the mac802154_hwsim kernel module. Most everything worked with wpan-hwsim except
wpan-hwsim edge del a b
and wpan-hwsim edge lqi a b num
Please let me know if this is expected or if there's something I can do to better debug this issue. I'd love to get to the point where I'm submitting patches and adding some device support in the near future.
Setup to repeat the issue
From here you can issue a command to either delete an edge or change an LQI, and my virtual machine will freeze. My ssh connection drops, one of the cores is pegged at 100% utilization, and the hypervisor console also stops responding.
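One way to check whether a pegged core is spending its time in softirq context is to compare two samples of /proc/stat. The following is only a rough sketch, not part of wpan-tools: the function names `parse_proc_stat` and `softirq_share` are made up for illustration, and it assumes the usual per-CPU column order documented in proc(5). It also only helps if some shell on the machine stays responsive long enough to take the second sample, which may not be the case with this freeze.

```python
# Column order of the per-CPU lines in /proc/stat, as documented in proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_proc_stat(text):
    """Return {cpu_name: {field: jiffies}} for the per-CPU lines only."""
    cpus = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the aggregate "cpu" line and non-CPU lines such as "intr".
        if not parts or not parts[0].startswith("cpu") or parts[0] == "cpu":
            continue
        values = [int(v) for v in parts[1:]]
        cpus[parts[0]] = dict(zip(FIELDS, values))
    return cpus


def softirq_share(before, after):
    """Fraction of elapsed jiffies each CPU spent in softirq between samples."""
    shares = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        # guest and guest_nice are already counted inside user and nice,
        # so leave them out of the total to avoid double counting.
        total = sum(new[f] - old[f] for f in new
                    if f in old and f not in ("guest", "guest_nice"))
        if total > 0:
            shares[cpu] = (new["softirq"] - old["softirq"]) / total
    return shares
```

To use it, read /proc/stat twice about a second apart, pass each text through `parse_proc_stat`, and hand both results to `softirq_share`. A value close to 1.0 for a single CPU would be consistent with that core being stuck in softirq processing.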
What I looked at while troubleshooting
I fired up GDB to see if I could pick out a problem, and took a look at the debug output from libnl3.
GDB
This call in hwsim_cmd_edge() seems to be where the trouble occurs:
rc = nl_send_auto(nl_sock, msg);
LIBNL3 Debug
libnl3 debug output for delete edge command
The last line is sshd dropping my connection.
Relevant Kconfig options
Since I'm using vanilla 4.19, I figure I should share the relevant Kconfig options I've enabled (obviously the 6lowpan stuff is irrelevant for this particular issue, but I figured I'd include it for completeness).