caronc / apprise-api

A lightweight REST framework that wraps the Apprise Notification Library
https://hub.docker.com/r/caronc/apprise
MIT License

Apprise eats tons of resources for what it does? #117

Closed YouveGotMeowxy closed 5 months ago

YouveGotMeowxy commented 1 year ago

:question: Question

Seems like 400 MB is a lot for the type of app Apprise is(?), and when I keep bumping up its memory allowance 100 MB at a time, it just eats all of that up as well. I'm not even sending messages; it's basically just idle, and it looks like this (for example):

[screenshot]

In this sample screenshot, it's eating 53% CPU and almost all of its 400 MB memory allowance.

Also, there are a lot of warnings, terminations, timeouts, etc. in my log; is this how it's normally supposed to look? (This is just a small section of the log, but it's pretty much all like this.)

04/24/2023 12:30:57 PM
[2023-04-24 12:30:57 -0500] [9221] [INFO] Booting worker with pid: 9221
04/24/2023 12:30:57 PM
[2023-04-24 12:30:57 -0500] [7] [WARNING] Worker with pid 9195 was terminated due to signal 9
04/24/2023 12:30:57 PM
[2023-04-24 12:30:57 -0500] [7] [WARNING] Worker with pid 9198 was terminated due to signal 9
04/24/2023 12:30:57 PM
[2023-04-24 12:30:57 -0500] [7] [WARNING] Worker with pid 9199 was terminated due to signal 9
04/24/2023 12:30:57 PM
[2023-04-24 12:30:57 -0500] [7] [WARNING] Worker with pid 9200 was terminated due to signal 9
04/24/2023 12:30:57 PM
[2023-04-24 12:30:57 -0500] [9222] [INFO] Booting worker with pid: 9222
04/24/2023 12:30:58 PM
[2023-04-24 12:30:57 -0500] [9223] [INFO] Booting worker with pid: 9223
04/24/2023 12:30:58 PM
[2023-04-24 12:30:58 -0500] [9224] [INFO] Booting worker with pid: 9224
04/24/2023 12:30:58 PM
[2023-04-24 12:30:58 -0500] [9225] [INFO] Booting worker with pid: 9225
04/24/2023 12:30:58 PM
[2023-04-24 12:30:58 -0500] [9226] [INFO] Booting worker with pid: 9226
04/24/2023 12:31:00 PM
[2023-04-24 12:31:00 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9202)
04/24/2023 12:31:00 PM
[2023-04-24 12:31:00 -0500] [9202] [INFO] Worker exiting (pid: 9202)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:02 -0500] [7] [WARNING] Worker with pid 9202 was terminated due to signal 9
04/24/2023 12:31:19 PM
[2023-04-24 12:31:02 -0500] [9227] [INFO] Booting worker with pid: 9227
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [7] [WARNING] Worker with pid 9204 was terminated due to signal 9
04/24/2023 12:31:19 PM
[2023-04-24 12:31:13 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9203)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9205)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9206)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9207)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9208)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [9207] [INFO] Worker exiting (pid: 9207)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [9203] [INFO] Worker exiting (pid: 9203)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [9208] [INFO] Worker exiting (pid: 9208)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [9206] [INFO] Worker exiting (pid: 9206)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [9205] [INFO] Worker exiting (pid: 9205)
04/24/2023 12:31:19 PM
[2023-04-24 12:31:16 -0500] [9228] [INFO] Booting worker with pid: 9228
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [7] [WARNING] Worker with pid 9205 was terminated due to signal 9
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [7] [WARNING] Worker with pid 9206 was terminated due to signal 9
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [7] [WARNING] Worker with pid 9207 was terminated due to signal 9
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [7] [WARNING] Worker with pid 9208 was terminated due to signal 9
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [7] [WARNING] Worker with pid 9203 was terminated due to signal 9
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [9229] [INFO] Booting worker with pid: 9229
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [9230] [INFO] Booting worker with pid: 9230
04/24/2023 12:31:19 PM
[2023-04-24 12:31:18 -0500] [9231] [INFO] Booting worker with pid: 9231
04/24/2023 12:31:19 PM
[2023-04-24 12:31:19 -0500] [9232] [INFO] Booting worker with pid: 9232
04/24/2023 12:31:19 PM
[2023-04-24 12:31:19 -0500] [9233] [INFO] Booting worker with pid: 9233
04/24/2023 12:31:20 PM
[2023-04-24 12:31:20 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9209)
04/24/2023 12:31:20 PM
[2023-04-24 12:31:20 -0500] [9209] [INFO] Worker exiting (pid: 9209)
04/24/2023 12:31:22 PM
[2023-04-24 12:31:22 -0500] [7] [WARNING] Worker with pid 9209 was terminated due to signal 9
04/24/2023 12:31:22 PM
[2023-04-24 12:31:22 -0500] [9234] [INFO] Booting worker with pid: 9234
04/24/2023 12:31:44 PM
[2023-04-24 12:31:42 -0500] [7] [WARNING] Worker with pid 9220 was terminated due to signal 9
04/24/2023 12:31:44 PM
[2023-04-24 12:31:36 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9210)
04/24/2023 12:31:44 PM
[2023-04-24 12:31:42 -0500] [7] [CRITICAL] WORKER TIMEOUT (pid:9211)
04/24/2023 12:31:44 PM
[2023-04-24 12:31:42 -0500] [9211] [INFO] Worker exiting (pid: 9211)
04/24/2023 12:31:44 PM
[2023-04-24 12:31:42 -0500] [9210] [INFO] Worker exiting (pid: 9210)
04/24/2023 12:31:44 PM
[2023-04-24 12:31:43 -0500] [9235] [INFO] Booting worker with pid: 9235
04/24/2023 12:31:44 PM
[2023-04-24 12:31:43 -0500] [7] [WARNING] Worker with pid 9210 was terminated due to signal 9
04/24/2023 12:31:44 PM
[2023-04-24 12:31:43 -0500] [7] [WARNING] Worker with pid 9211 was terminated due to signal 9
04/24/2023 12:31:44 PM
[2023-04-24 12:31:43 -0500] [9236] [INFO] Booting worker with pid: 9236
04/24/2023 12:31:44 PM
[2023-04-24 12:31:43 -0500] [9237] [INFO] Booting worker with pid: 9237
caronc commented 1 year ago

That's interesting, I've not seen that before. It's just a Django website at the end of the day. There isn't even a database backend (very lightweight).

YouveGotMeowxy commented 1 year ago

> That's interesting, I've not seen that before. It's just a Django website at the end of the day. There isn't even a database backend (very lightweight).

Just to see what would happen, I opened up the floodgates and allowed Apprise to use up to 2 GB of RAM, and it's "idling" at around 750 MB:

[screenshot]

caronc commented 12 months ago

I added some improvements (I think) based on googling around, as the error you're getting isn't uncommon at all.

Most results point to the issue being the resources of the system that is hosting the service.

The new changes add some jitter control to help clean up worker memory a bit better, and restart the workers in a staggered way so they don't cause any issues. I got this from a bug reported against another program here.

I also added the APPRISE_WORKER_TIMEOUT environment variable for you to play with if you want. I increased the default value I had (from 90s to 300s), which should give workers plenty of time to action their requests before timing out (especially if they've been tasked to notify a lot of endpoints).

I'd also let you know that you can control the number of workers that load by leveraging APPRISE_WORKER_COUNT. The default is dynamically calculated based on the server it's run on, as suggested by the Gunicorn website: (number of CPUs × 2) + 1. But feel free to override this if you wish and fix it to a lower value such as 4 or 3.

The final change made with this commit is to how the workers operate. Previously they used sync (the default mode), but gevent has been documented to be much faster and more lightweight, so it's introduced here as well.
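
For readers, the settings described above map onto standard Gunicorn options roughly like this; this is a hedged sketch only, with a placeholder WSGI module path and illustrative max-requests/jitter values, not the exact command line inside the image:

# Hedged sketch only; your_project.wsgi:application is a placeholder and the
# --max-requests values are illustrative "jitter control" numbers.
gunicorn your_project.wsgi:application \
  --worker-class gevent \
  --workers "${APPRISE_WORKER_COUNT:-$(( $(nproc) * 2 + 1 ))}" \
  --timeout "${APPRISE_WORKER_TIMEOUT:-300}" \
  --max-requests 1000 \
  --max-requests-jitter 50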

I'll be curious to hear your thoughts, or whether you did any troubleshooting of your own that was successful.

YouveGotMeowxy commented 12 months ago

Thanks, happy to see that you're still on it! :)

TBH I haven't really messed with it much since then; I have so many other things going on that I was just hoping leaving the issue open would be enough on my end, lol.

I will see how things go and report back after coming to some sort of conclusion!

caronc commented 12 months ago

Sounds good; I haven't made a new version, so you'll need to use the :edge release to pull the latest changes.

YouveGotMeowxy commented 12 months ago

ok, grabbing it now!

Haven't updated yet, but just looked at what it's doing at the moment; look at the memory, lol:

[screenshot]

Almost a gig!

YouveGotMeowxy commented 12 months ago

ok, I just loaded up:

www-data@apprise:/opt/apprise$ apprise --version
Apprise v1.4.5
Copyright (C) 2023 Chris Caron <lead2gold@gmail.com>
This code is licensed under the BSD License.

and it's still using a lot of memory:

[screenshot]

:/

One small request: would it also be possible to add the Apprise version number at the beginning of the log? In cases like this, for example, it would be easier to just copy it for pasting rather than having to SSH in and run apprise --version each time. :)

YouveGotMeowxy commented 12 months ago

I lowered the worker count to 3 as a test, as my Apprise doesn't get used too much, so I'll see how it goes. Memory is a bit more reasonable now:

[screenshot]
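
For context, pinning the worker count comes down to the APPRISE_WORKER_COUNT variable mentioned above; a minimal sketch assuming the official image (the container name, tag, and port mapping are illustrative and may need adjusting):

# Hedged sketch: pin the documented APPRISE_WORKER_COUNT variable at launch.
docker run -d --name apprise \
  -p 8000:8000 \
  -e APPRISE_WORKER_COUNT=3 \
  caronc/apprise:edge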

One question: am I overshooting my memory expectations? To me it seems like an idle thread should take "virtually" no memory if it's doing nothing at all, perhaps only a MB or two per process (i.e. an individual worker). So, for example, if I have 3 workers, maybe only 6 MB for them all?

caronc commented 11 months ago

I'm really at a loss here; as you said, it's just a Django website with an nginx server in front of it. There is no other configuration. Hopefully someone will see this thread and possibly offer insight.

I know you can start your Docker containers up with constrained resources. You could try doing this in your environment.
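
A minimal sketch of what that could look like with plain docker run; the limits shown are arbitrary examples, not recommendations:

# Hedged sketch: hard-cap the container's memory and CPU at run time.
docker run -d --name apprise \
  --memory 256m \
  --cpus 1 \
  caronc/apprise:latest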

rasmusson commented 7 months ago

Seeing the same thing here, about 700 MB of memory usage.

rasmusson commented 7 months ago

> Seeing the same thing here, about 700 MB of memory usage.

Managed to get it down by setting the workers to 3, as above.

caronc commented 7 months ago

Glad to hear it. Perhaps the outcome of this is to not automatically generate the worker count based on the number of CPU cores (as documented by Gunicorn), but to fix it to a lower default value. It's set to a production setting now.

Apprise grants you the ability to override this, which you're all leveraging here. So perhaps leaving it the way it is would be ideal, and instead just better documenting this memory issue you're having (and others too). The app consumes a small amount of memory, but your systems have a high core count (my guess), which makes it seem unreasonable? Is it fair to say that?

Thoughts?

martadinata666 commented 7 months ago

IMHO, this is about the expectation of usage vs. resources. Let's assume I make 1,000 requests per day. Do I need 6x the CPU resources and memory (my PC only has 6 cores)? Even one worker is more than enough. That's my two cents.

rasmusson commented 7 months ago

> Glad to hear it. Perhaps the outcome of this is to not automatically generate the worker count based on the number of CPU cores (as documented by Gunicorn), but to fix it to a lower default value. It's set to a production setting now.
>
> Apprise grants you the ability to override this, which you're all leveraging here. So perhaps leaving it the way it is would be ideal, and instead just better documenting this memory issue you're having (and others too). The app consumes a small amount of memory, but your systems have a high core count (my guess), which makes it seem unreasonable? Is it fair to say that?
>
> Thoughts?

Yeah, seems reasonable

caronc commented 5 months ago

I took @rasmusson's agreement and @martadinata666's opinion and updated the README.md documentation accordingly.

I think this should cover off this ticket, unless there is any disagreement?

Thoughts?

caronc commented 5 months ago

Closing off the issue! Thanks for your input, everyone!