lePereT opened this issue 4 years ago
Thanks for the report, I've never touched urngd
but maybe @ynezz has a clue...
So, quickly typing killall /sbin/urngd
after terminal access is gained appears to make urngd behave. Not ideal. Also, what are the following error messages about:
Failed to resize receive buffer: Operation not permitted
ip: RTNETLINK answers: Operation not permitted
...
ip: can't send flush request: Operation not permitted
ip: SIOCSIFFLAGS: Operation not permitted
Just to confirm, the problem persists with an Ubuntu 18.04 VM as the host:
Mem: 865520K used, 143284K free, 984K shrd, 34440K buff, 579772K cached
CPU: 99% usr 0% sys 0% nic 0% idle 0% io 0% irq 0% sirq
Load average: 0.39 0.11 0.04 4/154 711
PID PPID USER STAT VSZ %VSZ %CPU COMMAND
91 1 root R 776 0% 99% /sbin/urngd
444 1 root S 1208 0% 0% /sbin/netifd
1 0 root S 1176 0% 0% /sbin/procd
I can't reproduce the error. Did you try to reproduce it on other machines?
I can't reproduce that error even on Ubuntu 18.04 (but with a 5.6.7 kernel). It would help to get strace output from urngd.
If it's in this state, it should be as easy as running opkg update; opkg install strace; strace --no-abbrev --attach $(pidof urngd)
inside a container spawned with docker run --cap-add SYS_PTRACE --rm -it openwrtorg/rootfs
I'll attempt to do this in the next week or so. I'll close the issue for now to prevent noise :) Thanks for both your responses.
I would like to reopen this issue.
I am running into the same bug when OpenWrt runs in a Docker container that does not allow the RNDADDENTROPY ioctl on /dev/random.
This causes a busy loop with high CPU usage: the WRITE poll event keeps triggering but can never be satisfied (because the ioctl always fails), so the daemon spins forever.
Should I provide a possible fix? I would simply stop the polling for a certain amount of time when RNDADDENTROPY fails.
I have the same issue with an Ubuntu 18.04 VM and OpenWrt 19.07.2 in a Docker container.
@thg2k please provide a fix
@aparcar I did, but it was refused by the maintainer.
http://lists.openwrt.org/pipermail/openwrt-devel/2021-January/033587.html
It is indeed a very crude workaround, but it solves the problem without causing any regressions and it's easy to audit. A better fix would be to use uloop timers and improve the logging, but I have no interest in spending more time on this. It is still a fix and I recommend merging it.
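For reference, here is a minimal sketch of the uloop-timer idea. This is my own illustration, not the actual urngd source or the patch from the mailing list; names such as rnd_cb, retry_cb and RETRY_MSECS are made up. The point is: when RNDADDENTROPY is refused, drop the write poll and re-arm it from a timer instead of spinning.

/* Hypothetical sketch, not the real urngd code: pause the /dev/random
 * write poll with a libubox uloop timer when RNDADDENTROPY is refused,
 * instead of spinning on a POLLOUT event that can never be satisfied. */
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/random.h>
#include <libubox/uloop.h>

#define RETRY_MSECS (30 * 1000)          /* arbitrary back-off period */

static struct uloop_fd rnd_fd;           /* /dev/random, registered with rnd_cb at init (omitted) */
static struct uloop_timeout retry_timer;

static void retry_cb(struct uloop_timeout *t)
{
	/* Back-off expired: start watching /dev/random again. */
	uloop_fd_add(&rnd_fd, ULOOP_READ | ULOOP_WRITE);
}

static void rnd_cb(struct uloop_fd *fd, unsigned int events)
{
	/* Same layout as struct rand_pool_info from linux/random.h. */
	struct {
		int entropy_count;               /* credited bits */
		int buf_size;                    /* bytes in buf */
		__u32 buf[32];
	} e = { .entropy_count = 32 * 32, .buf_size = sizeof(e.buf) };

	/* ... fill e.buf from the entropy source here ... */

	if (ioctl(fd->fd, RNDADDENTROPY, &e) < 0 && errno == EPERM) {
		/* The container forbids the ioctl: stop polling for a while. */
		uloop_fd_delete(fd);
		retry_timer.cb = retry_cb;
		uloop_timeout_set(&retry_timer, RETRY_MSECS);
	}
}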
I got this problem on my MT7621 router too; maybe there is something wrong with the source code.
I ran into this same problem when using PVE to run OpenWrt in a Linux container. According to the random(4) Linux manual page, the CAP_SYS_ADMIN capability is required for almost all of the related ioctl requests.
I had included the default OpenWrt config file (same as this lxc-template), which contains lxc.cap.drop = sys_admin.
After removing that line, /sbin/urngd no longer pegs my CPU.
I think there is also a way to grant the SYS_ADMIN capability to a Docker container, but that capability is very broad, so the decision is yours.
Moreover, it seems that simply uninstalling the urngd
package could also work around the problem, but I'm not sure about the side effects.
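If it helps with diagnosis, here is a small stand-alone probe (my own sketch, not part of any OpenWrt package) that reports whether the environment permits the ioctl; per random(4) it should fail with EPERM when CAP_SYS_ADMIN has been dropped, which is exactly the condition that makes urngd spin.

/* Minimal probe: does this container allow RNDADDENTROPY on /dev/random? */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/random.h>

int main(void)
{
	/* Same layout as struct rand_pool_info; credit 0 bits so the pool
	 * is left untouched even where the call is permitted. */
	struct {
		int entropy_count;
		int buf_size;
		__u32 buf[1];
	} e = { .entropy_count = 0, .buf_size = sizeof(e.buf), .buf = { 0 } };

	int fd = open("/dev/random", O_WRONLY);
	if (fd < 0) {
		perror("open /dev/random");
		return 1;
	}

	if (ioctl(fd, RNDADDENTROPY, &e) < 0)
		printf("RNDADDENTROPY refused: %s\n", strerror(errno));
	else
		printf("RNDADDENTROPY allowed\n");

	close(fd);
	return 0;
}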
I ran into this problem today on a Linksys WRT1900ACS with an uptime of 248 days, running:
~# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='21.02.0'
DISTRIB_REVISION='r16279-5cc0535800'
DISTRIB_TARGET='mvebu/cortexa9'
DISTRIB_ARCH='arm_cortex-a9_vfpv3-d16'
DISTRIB_DESCRIPTION='OpenWrt 21.02.0 r16279-5cc0535800'
DISTRIB_TAINTS=''
Suddenly at around 1am my load jumped.
Killing urngd helped. But restarting it brought the load back up again. So, now I've killed urngd without restarting it. I will keep the system up to see if there are any impacts of having urngd stopped.
What, by the way, could be using urngd? Maybe those processes just need a restart. Perhaps dnsmasq? Anything else? Does OLSRd or babeld use urngd?
It looks like I am also seeing this on a TP-Link Archer C7 v2.
root@foobar:~# cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='19.07.2'
DISTRIB_REVISION='r10947-65030d81f3'
DISTRIB_TARGET='ar71xx/generic'
DISTRIB_ARCH='mips_24kc'
DISTRIB_DESCRIPTION='OpenWrt 19.07.2 r10947-65030d81f3'
DISTRIB_TAINTS=''
Hi all, getting a lot of instability. I'm on macOS Mojave, running Docker version 19.03.8 and docker-machine version 0.16.2.
If I just use the README command:
I get a number of error messages during launch:
When in the shell, it's sluggish, and I notice that one core of my CPU is being used at 100%. A top inside the container reveals the following:
Am I doing something wrong?