Closed 4np closed 3 years ago
If you suspect a memory leak, you should keep a process / memory viewer on your screen, and follow the memory usage. As soon as it stops, make a picture of it, and post it here.
The local variable 'msg' referenced before assignment
is fixed in 3.1.0 Final, it was present in the non-stable builds.
It would indeed be helpful to also enable +Debug
logging in the Status and Interface settings window to maybe get a sense of what it is doing when it crashes.
A memory leaking was fixed in the develop
branch, although this should not fill up 16GB.
The difficulty is that the MacMini is headless, so I can only SSH or VNC into it and thus cannot snap a picture of it. Perhaps I can create a screenshotting app that creates screenshots at intervals to see what the memory usage does 🤔 Or perhaps just run a cronjob that will write some stats to a logfile or something.
Another thought was running SABnzbd+
inside a Docker container to sandbox it a little better so it does not pull the whole OS down.
I'll enable the Debug logging to see if it will provide some more insight 👍
I can only SSH or VNC into it and thus cannot snap a picture of it
Welll, I SSH to my headless Linux, and can start htop and keep it running to monitor memory usage:
So SAB is running, with 1% memory usage (of the 4GB), 40 MB.
Yeah true, but that does not help if the OS hangs and becomes completely unresponsive. There is no way to SSH into it then ;)
I was trying to see if running SABnzbd
in a Docker container would help, but apparently there is an open issue where Docker for Mac is showing slow network throughput (I see about 1/5th of network speed when using the Docker container). So it seems like switching to a containerized SABnzbd
is not a way forward, at least for now.
Yeah true, but that does not help if the OS hangs and becomes completely unresponsive. There is no way to SSH into it then ;)
No, then you're too late. So: keep ssh and htop open, check every now and then the memory usage of SABnzbd ... until your system freezes.
FWIW: my thought ... not a SAB problem. Probaby OS or HW problem. But your htop will show ...
I can send you a test build if develop, see how it does. But indeed, I also wonder if it can really be just sabnzbd freezing a whole system.
Can you try the release here and enable Debug logging? It both has some changes to memory management and the way restarts are handled. It works on my Mac using the regular restart or using the scheduler, but then the current 3.1.0 also worked on my Mac.. But who knows. https://github.com/sabnzbd/sabbuild/releases/tag/macos-test This release is not notarised by Apple, so you might get a security warning when starting it the first time.
@4np Any luck to test it?
Yeah it's running now. I tried a scheduled restart and that seems to work fine now.
While keeping an eye on memory consumption it seems to remain around 800MB or so, sometimes going up while downloading but going back down again when idle.
I didn't yet experience the 'hang', that generally takes a couple of days and does not manifest itself in a couple of hours or so. I'll disable the scheduled restart and see what happens.
Any more hangs?
Yeah I experienced a hang, but nothing in the log file. Unfortunately I am still not sure what is causing it. I'm going to give it another go with a couple of logs tailing in screen sessions to see if I can find out what is causing it. Let's see what that does. When I can't find anything it's probably best to close this.
It happened again, this is what I saw in my disconnected screen sessions:
SABnzbd+
was at 99.3% CPU at the time of disconnect (memory at just 6.2%):
a little bit further down it looks like it was extracting something:
8435 someuser 24 10 4187M 2472 ? 0.0 0.0 0:00.33 /Applications/SABnzbd.app/Contents/MacOS/osx/unrar/unrar e -vp -idp -o+ -p- /path/to/somefile
No clear pointer to what is happening, but it certainly looks like the OS seems to be killing applications, and my SSH sessions from above:
Nov 12 22:58:26 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.05000000-0500-0000-0000-000000000000[2882]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 22:58:26 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0D000000-0000-0000-0000-000000000000[2885]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 22:58:26 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0A000000-0100-0000-0000-000000000000[2883]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 22:58:26 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0C000000-0200-0000-0000-000000000000[2884]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 22:58:26 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0000-0000-0000-000000000000[2886]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:01:08 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0C000000-0300-0000-0000-000000000000[2891]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:01:17 MacMini sshd: someuser [priv][2676]: DEAD_PROCESS: 2678 ttys003
Nov 12 23:01:17 MacMini com.apple.xpc.launchd[1] (com.openssh.sshd.CA5D73C2-076A-4037-A04A-D405F1616C23[2676]): Service exited with abnormal code: 255
Nov 12 23:01:18 MacMini sshd: someuser [priv][2899]: USER_PROCESS: 2901 ttys003
Nov 12 23:01:24 MacMini sshd: someuser [priv][2899]: DEAD_PROCESS: 2901 ttys003
Nov 12 23:01:24 MacMini com.apple.xpc.launchd[1] (com.openssh.sshd.6AD5E42C-770B-4C0D-9663-9638E39D9529[2899]): Service exited with abnormal code: 255
Nov 12 23:01:56 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0C000000-0400-0000-0000-000000000000[2896]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:01:56 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.05000000-0600-0000-0000-000000000000[2897]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:01:56 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0D000000-0100-0000-0000-000000000000[2895]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:01:56 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0100-0000-0000-000000000000[2892]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:01:56 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0F000000-0500-0000-0000-000000000000[2894]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:01:56 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0A000000-0200-0000-0000-000000000000[2893]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:04:40 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0F000000-0600-0000-0000-000000000000[3103]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:05:58 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.02000000-0000-0000-0000-000000000000[3104]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:05:58 MacMini syslogd[101]: ASL Sender Statistics
Nov 12 23:06:49 MacMini syncdefaultsd[3106]: objc[3106]: Class SYDClient is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be used. Which one is undefined.
Nov 12 23:06:49 MacMini syncdefaultsd[3106]: objc[3106]: Class SYDJournal is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be used. Which one is undefined.
Nov 12 23:07:35 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0F000000-0700-0000-0000-000000000000[3105]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:07:35 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0A000000-0300-0000-0000-000000000000[3108]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:12:20 MacMini syncdefaultsd[3110]: objc[3110]: Class SYDClient is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be used. Which one is undefined.
Nov 12 23:12:20 MacMini syncdefaultsd[3110]: objc[3110]: Class SYDJournal is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be used. Which one is undefined.
Nov 12 23:22:13 MacMini com.apple.xpc.launchd[1] (com.apple.xpc.launchd.domain.pid.backupd.3119): Failed to bootstrap path: path = /System/Library/CoreServices/backupd.bundle/Contents/Resources/backupd, error = 2: No such file or directory
Nov 12 23:22:13 MacMini syslogd[101]: ASL Sender Statistics
Nov 12 23:23:10 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.02000000-0100-0000-0000-000000000000[3120]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:31:53 MacMini com.apple.xpc.launchd[1] (com.apple.AddressBook.abd): Service only ran for 9 seconds. Pushing respawn out by 1 seconds.
Nov 12 23:31:58 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0F000000-0000-0000-0000-000000000000[3124]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:31:58 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0200-0000-0000-000000000000[3125]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:31:58 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.05000000-0700-0000-0000-000000000000[3126]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:32:23 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.07000000-0200-0000-0000-000000000000[3131]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:32:23 MacMini syslogd[101]: ASL Sender Statistics
Nov 12 23:32:38 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.09000000-0300-0000-0000-000000000000[3130]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:32:54 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0A000000-0400-0000-0000-000000000000[3123]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:41:20 MacMini syncdefaultsd[3136]: objc[3136]: Class SYDClient is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be used. Which one is undefined.
Nov 12 23:41:20 MacMini syncdefaultsd[3136]: objc[3136]: Class SYDJournal is implemented in both /System/Library/PrivateFrameworks/SyncedDefaults.framework/Versions/A/SyncedDefaults and /System/Library/PrivateFrameworks/SyncedDefaults.framework/Support/syncdefaultsd. One of the two will be used. Which one is undefined.
Nov 12 23:45:24 MacMini syslogd[101]: ASL Sender Statistics
Nov 12 23:46:32 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.03000000-0300-0000-0000-000000000000[3139]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:46:32 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0D000000-0200-0000-0000-000000000000[3141]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:46:32 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.05000000-0000-0000-0000-000000000000[3137]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:46:32 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.01000000-0300-0000-0000-000000000000[3142]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:46:32 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0A000000-0500-0000-0000-000000000000[3138]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:46:32 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0F000000-0100-0000-0000-000000000000[3140]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:47:06 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.10000000-0000-0000-0000-000000000000[3145]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:47:17 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.07000000-0300-0000-0000-000000000000[3146]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:47:17 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.0C000000-0500-0000-0000-000000000000[3143]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:47:17 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.09000000-0400-0000-0000-000000000000[3144]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:57:54 MacMini syslogd[101]: ASL Sender Statistics
Nov 12 23:58:45 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.09000000-0500-0000-0000-000000000000[3150]): Service exited due to SIGKILL | sent by mds[127]
Nov 12 23:59:46 MacMini com.apple.xpc.launchd[1] (com.apple.mdworker.shared.09000000-0600-0000-0000-000000000000[3151]): Service exited due to SIGKILL | sent by mds[127]
client_loop: send disconnect: Broken pipe
I am not sure if the time of SSH disconnect is also the same time that the MacMini crashed again. Perhaps the high CPU usage just caused the SSH connections to timeout. I don't see anything weird in sabnzbd logfiles.
Grepping the log file I see these:
sabnzbd.log.5:2020-11-14 09:56:30,810::INFO::[__init__:1060] Restarting crashed scheduler
sabnzbd.log.5:2020-11-14 09:58:43,042::INFO::[__init__:1060] Restarting crashed scheduler
sabnzbd.log.5:2020-11-14 10:00:55,336::INFO::[__init__:1060] Restarting crashed scheduler
sabnzbd.log.5:2020-11-14 10:03:07,577::INFO::[__init__:1060] Restarting crashed scheduler
sabnzbd.log.5:2020-11-14 10:05:19,833::INFO::[__init__:1060] Restarting crashed scheduler
sabnzbd.log.5:2020-11-14 10:07:32,141::INFO::[__init__:1060] Restarting crashed scheduler
sabnzbd.log.5:2020-11-14 10:09:44,391::INFO::[__init__:1060] Restarting crashed scheduler
sabnzbd.log.5:2020-11-14 10:11:56,621::INFO::[__init__:1060] Restarting crashed scheduler
Looking at the bundled macOS binaries:
unrar
version is 5.91
> homebrew's unrar
is at 6.0.2
7-zip
is version 16.02
> homebrew's p7zip
is the same.par2-sl64
is version 1.0
> homebrew doesn't ship par2-sl64
, but par2
is at par2-0.8.1
.Hi Jeroen,
While SAB is downloading & unpacking/unrarring & certainly reparing (par2), the CPU can indeed go to high values. The 99% is not too alarming, as that is the CPU load per core, and you have 2 cores. So SAB is using one core.
As you show memory stays low, your title "Memory leak?" is incorrect, IMHO. Change it to "Mac unresponsive".
More worrying is the killing of MDS (“metadata server”?) and XPC ("Access a low-level (libSystem) interprocess communication mechanism that is based on serialized property lists." ... ?). You should solve that. Not a SAB related ... unless MDS is indexing all the changing files of SABnzbd. Maybe you can exclude the SAB Incomplete dir from MDS?
People are reporting problems with MDS since a certain MacOS update. Google that.
https://apple.stackexchange.com/a/144494 https://osxdaily.com/2010/08/05/mds-mac/ https://discussions.apple.com/thread/251433748 https://discussions.apple.com/thread/251403759 https://support.apple.com/en-us/HT201716 (prevent certain directories from Spotlight)
Yeah it is starting to look like this issue is not related to SABnzbd
at all, so I'll close this issue. CPU shooting to 100% should indeed be fine as it's that indeed a single core, although it's still a bit of a mystery why I am seeing these crashes. I was suspecting SABnzbd to be part of the cause because I only started seeing those crashes since I adopted the 3.1.0 beta releases. Perhaps a macOS update around the same time is the real reason this is happening, or the hardware (the MacMini is from 2012) is starting to fail (could be after 8 years).
Thanks for the suggestions though. I did exclude some folders from Spotlight indexing to see if that helps.
OK. Could you still change the title to "Mac unresponsive" to avoid false Google hits? Thanks.
I have experienced a couple of more hangs, and I have narrowed the issue down to the Direct Unpack
setting.
If I enable it, I experience the hangs and the Mac Mini becomes complete unresponsive and needs a force reboot. If I disable it, all is fine.
~This
is a screenshot of all the settings that will trigger the hang on my Mac Mini.~
Do you maybe have the log before such a hang?
Due to the issues I am experiencing I moved sabnzbd
off the Mac Mini and am now using the Synology package (3.1.0-45
) directly on my NAS. I am noticing similar behavior there. Where the NAS has run fine for years, since adding sabnzbd
I see the NAS become less responsive over time and total memory consumption increasing. As sabnzbd
is the only thing that changed on this nas, I still think you have a memory leak somewhere.
While the CPU and Network are idling, memory consumption is super high.
Back to normal again.
As Direct Unpack was enabled again, I am going to disable it and see if memory consumption goes back to normal.
I see the NAS become less responsive over time and total memory consumption increasing. As
sabnzbd
is the only thing that changed on this nas, I still think you have a memory leak somewhere.
Why do you again say "memory leak"? A process using memory is not a memory leak. Did you read https://www.linuxatemyram.com/ ?
Which Synology do you have? How much RAM In SABnzbd's upper right corner, click on the wrench symbol ("Status and interface options"), then click on first tab Status, and there click on the Refresh Arrow ... what do you get at CPU and Pystone?
Memory management is done differently on different OS. One view that memory not used is wasted. While others view it as waste if used. Memory leak can be a factor in how you perceive it, but if your not hitting swap or OOM then it's probably not one. Also stuff gets reported funny of it's called by parent, so python calling subprocess. Do a coredump and see what is in the stack?
Which Synology do you have? How much RAM
DS1815+ with Intel(R) Atom(TM) CPU C2538 @ 2.40GHz with the base amount of physical memory (e.g. 2GB). Pystone: 29979 Article cache limit: 512M Post processing nice params: -n 19
A process using memory is not a memory leak.
That is true, but if memory usage gradually increases over time without freeing memory, it does start to sound like a leak. Especially if the process is sitting idle doing nothing at all.
Like mentioned earlier I noticed having Direct Unpack
enabled pull down the Mac Mini, and I am now seeing the same with the Synology. I'm no python dev, but I see multithreading logic in directunpacker.py. From what I read local variables might not get released properly causing a memory buildup. There also appears to be an issue in Python (3.7.8 and 3.8?) that might cause a reference leak. The macOS binary appears to bundle Python 3.8
(./Contents/Resources/lib/python3.8
) and the Synology package appears to use Python 3.7.7-15
. I'll leave Direct Unpack
off for now to see what happens.
There is indeed a memory leak fixed in 3.2.0, however I don't think it should be enough to cause the NAS to become unresponsive. Unless you have finished a large number of jobs in between.
The SynoCommunity Python is always behind, and now with DSM 7 it's a bit of a mess there, so much work. So should unfortunately not expect a Python update there anytime soon.
There is indeed a memory leak fixed in 3.2.0,
Interesting, is it also related to the direct unpacker? Or someplace else?
The SynoCommunity Python is always behind, and now with DSM 7 it's a bit of a mess there, so much work.
Yeah it looks like Synology is pretty busy with DSM 7.
FYI: I upgraded the Synology from the default 2GB memory stick to 16GB. I haven't experienced any hangs since then. I wonder if #1736 was involved, I did seem to run into the hangs with many large nzbs. I got the impression that the OS was grinding to a halt because of low memory.
This issue is probably a bit harder to debug. As mentioned in #1664, ever since moving to the 3.1.0 betas and stable I am experiencing occasional situations where the MacMini (Late 2012) that is running
SABnzbd+
becomes completely unresponsive.I assume there is a memory leak somewhere, but as the issue is not reproducible with some clear steps, it is hard to pinpoint what exactly is causing it. I can only recover by performing a hard reset on the MacMini as it does not accept incoming connections at all anymore.
Looking through the logs, I do see a couple of Python exceptions that keep occurring throughout the logfile although these do not seem to be related to the 'hang'.
This one is older, so I don't think this is related to
3.1.0
final, but I am adding it anyways: