Closed protectivedad closed 2 years ago
OpenWrt 21.02 or trunk? ;)
There were some important hostapd umdns fixes.
OpenWrt 21.02-SNAPSHOT r16312. Plus, I cherry-picked the latest changes to DAWN.
Can you try OpenWrt Snapshot and not 21.02? We had a lot of changes and bug fixes for hostapd.
Oh, you're killing me :) They just released 21.02.1 and I was hoping to get away from bleeding edge :). It will take me a while, as I have my own hardware changes that I need to merge in if I move over. I'll monitor the memory usage as I work to switch over to master.
Maybe we can also migrate hostapd changes to 21.02. ;)
Git never ceases to amaze me (I'm an old-school Atari programmer). With a few commands I could collect all my commits from 21.02, cherry-pick them onto a fresh openwrt master, and run my build scripts. I'll update the firmware on the routers and watch them for memory growth.
I've been away and may be reading things wrongly, but...
it looks like DAWN_MEMORY_AUDIT is defined by default, so a kill -HUP to the process should put some details of memory that DAWN has allocated into the system log. If that info is getting longer as memory is consumed then DAWN may be at fault, but if it is staying stable then it could well be a different library that is leaking memory.
... so a kill -HUP to the process should put some details of memory that DAWN has allocated into the system log.
Just got DAWN up and running again to check this. It's not quite right, as the memory logging goes to stdout. You can make stdout appear in logread if you edit the obvious line in /etc/init.d/dawn:
procd_set_param stdout 1
Example after DAWN has been running for a week or so on my router. The NNN= values are a sequential indicator that helps show what is a new or old allocation.
root@OpenWrt:~# ps|grep dawn
3773 root 2688 S /tmp/dawn
root@OpenWrt:~# kill -HUP 3773
root@OpenWrt:~# logread | grep dawn
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: Currently recorded allocations...
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 3=X - ubus.c@589: 0
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 1125075=M - msghandler.c@150: 96
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 4=C - ubus.c@1320: 744
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 1124391=M - msghandler.c@150: 96
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 1125087=M - msghandler.c@150: 96
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 1125085=M - msghandler.c@475: 608
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 1125105=M - msghandler.c@150: 96
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 0=X - dawn_uci.c@417: 0
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: 1=X - dawn_uci.c@436: 0
Mon Dec 20 23:17:32 2021 daemon.info dawn[3773]: mem-audit: [End of list: 1736 bytes]
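To see whether the list is actually growing between two dumps, it can help to collapse it to one line per allocation site. The following is only a rough sketch: it assumes the log lines look exactly like the ones above (allocation site in the second-to-last field, size in bytes in the last), and summarise_mem_audit is a made-up helper name, not something DAWN ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of DAWN): summarise mem-audit lines by
# allocation site. Reads log text on stdin and prints one line per site:
#   <file@line> <number of allocations> <total bytes>
summarise_mem_audit() {
    awk '
        # Only entry lines such as
        #   "... mem-audit: 1125075=M - msghandler.c@150: 96"
        # match; the header and "[End of list ...]" lines are skipped.
        /mem-audit: [0-9]+=/ {
            site = $(NF - 1)       # e.g. "msghandler.c@150:"
            sub(/:$/, "", site)    # drop the trailing colon
            count[site]++
            bytes[site] += $NF     # last field is the size in bytes
        }
        END {
            for (s in count) printf "%s %d %d\n", s, count[s], bytes[s]
        }
    ' | sort
}
```

Something like `logread | grep dawn | summarise_mem_audit` after each kill -HUP gives a compact view to compare. Note that logread may still hold earlier dumps, so the counts are only meaningful if you look at one dump at a time. A site whose count keeps climbing from one dump to the next would be the place to look.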
@protectivedad - Is this resolved now? If not I'll try and repro it to fix.
Sorry for the long delay. I was never able to switch to the latest master as discussed. When I attempted it, dnsmasq stopped working and other things started breaking, to the point where my wife and kids were very upset :). I updated 21.02 on Nov 27, which captured a lot of hostapd fixes. Since Nov 27 I haven't had any issues. I have changed the stdout param and will continue to monitor. Thanks for getting back to me, and again sorry for the long delay in responding.
Seems to be fine now. The process isn't being killed and DAWN's VSZ is about the same on all my routers. This is with the latest DAWN and openwrt-21.02 from 01-23-2022. Thanks for taking the time.
I have four routers, all different brands/memory, running DAWN. Also of note, they all have ZRAM. Yesterday I updated all of them to the latest version of DAWN and reset all the configs. Since then two of the routers (one has 128 MB and the other 64 MB of RAM) have run into problems. The router runs fine for hours, but after a while it becomes sluggish and starts killing processes with OOM (dnsmasq, openvpn, DAWN, etc). The first time I had to force a reboot with reboot -f. The second time I wanted to find the problem application. Since DAWN was the last thing I updated I tried /etc/init.d/dawn restart, and voila, the problem went away.
Before the restart, ps showed:
After:
The thing I notice is that DAWN's VSZ goes from 8384 to 2756. Unfortunately, I didn't get any other info. My other routers have DAWN's VSZ at 2856, 5604, and 10788. They all run different things and have different memory sizes.
I did have a router kill DAWN with an OOM before I updated. That was the reason I updated: I wanted DAWN to restart. But I had DAWN running for months without issue. Though if it was killing DAWN I'm not sure I would have noticed.
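One-off ps snapshots like these are hard to compare, so a small logger that records the sizes over time may be more useful. This is just a sketch under a few assumptions: VSZ from ps should correspond to the VmSize line in /proc/<pid>/status, vm_sizes and the dawn-mem log tag are names made up for illustration, and only one dawn process is running.

```shell
#!/bin/sh
# Hypothetical helper: print the VmSize and VmRSS values (in kB) found in
# a /proc/<pid>/status style file given as the first argument.
vm_sizes() {
    awk '
        $1 == "VmSize:" { size = $2 }   # total virtual size
        $1 == "VmRSS:"  { rss = $2 }    # resident set size
        END { printf "VmSize=%s VmRSS=%s\n", size, rss }
    ' "$1"
}

# Example loop, left commented out so that sourcing this file has no side
# effects. It writes one sample to the system log every 10 minutes.
# while sleep 600; do
#     pid=$(pidof dawn) && logger -t dawn-mem "$(vm_sizes "/proc/$pid/status")"
# done
```

With the loop enabled, `logread | grep dawn-mem` would show whether the numbers creep up steadily or stay flat between restarts.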
What can I do to narrow down what is using up all the memory? If it is DAWN, do you think that it is inherent and DAWN will require lots of memory or is there a memory leak/problem?
Thanks.