lfdominguez opened 1 year ago
"magically" the CPU is not using the 100% anymore.....
I also have this problem, but on kubernetes.
I also have this problem, on Docker (Compose), with the default yml file provided in the documentation.
What helped me: killing the worker processes; since then everything is fine.
@soerendohmen that's kind of beside the point, as with docker/kubernetes your containers will eventually get restarted. All that automation goes in the bin, as I would have to kill the CPU-hogging process on every authentik version update, on every container restart/recreate, and on every reboot of my host system.
Yes, of course, this is no solution. I didn't want to do this, but there were 3 workers consuming 100% CPU; the alternative would have been shutting down the service. So I tried it, and it worked. Even the "change password" mails, which I had tried to send before, came through at that moment. So something hung, but I wasn't able to figure out what.
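For anyone needing the same stopgap, here is a sketch of that workaround, assuming the documented docker-compose setup with a service named worker (the pkill pattern matches the worker command line shown further down in this thread):

# restart just the worker service
docker compose restart worker

# or, on a plain host, kill the stuck celery worker processes directly
pkill -f 'celery -A authentik.root.celery worker'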
Same here
I am also experiencing this issue, and even when I stop this process, Authentik keeps working.
I'm reasonably sure there's a logic error somewhere that causes the blueprint tasks to recursively re-trigger, which causes the high CPU usage. However, high CPU usage is expected on the first startup, but only for a couple of minutes while all the initial setup is done.
I am experiencing it too, on 2023.6
Same behavior here, on an upgraded 2023.6.1 instance and on a locally tested fresh beta version. It doesn't go away after startup, and trace logging sadly doesn't show anything extraordinary.
Same here. Not constantly at 100% anymore since upgrading 2023.6.0 > 2023.6.1. One process, "/lifecycle/ak worker", hovers around 85% CPU. The rest of the host is at least workable again, but that still seems very high for a single-user test setup...
Notifications are being flooded with:
System task exception
Task notification_transport encountered an error: Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/celery/app/trace.py", line 451, in trace_task
R = retval = fun(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/celery/app/trace.py", line 734, in __protected_call__
return self.run(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/celery/app/autoretry.py", line 54, in run
ret = task.retry(exc=exc, **retry_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/celery/app/task.py", line 717, in retry
raise_with_context(exc)
File "/usr/local/lib/python3.11/site-packages/celery/app/autoretry.py", line 34, in run
return task._orig_run(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/authentik/events/tasks.py", line 129, in notification_transport
raise exc
File "/authentik/events/tasks.py", line 125, in notification_transport
transport.send(notification)
File "/authentik/events/models.py", line 331, in send
return self.send_email(notification)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/authentik/events/models.py", line 472, in send_email
raise NotificationTransportError(exc) from exc
authentik.events.models.NotificationTransportError: [Errno 99] Cannot assign requested address
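For what it's worth, [Errno 99] Cannot assign requested address usually means the worker tried to open a TCP connection to an address it cannot reach, e.g. an unset or localhost SMTP host inside the container. A quick hypothetical check from inside the worker container, assuming the documented AUTHENTIK_EMAIL__HOST / AUTHENTIK_EMAIL__PORT settings (the fallback values below are only placeholders):

# prints 'SMTP reachable' only if the configured mail server accepts a connection
docker compose exec worker python -c "import os, smtplib; smtplib.SMTP(os.environ.get('AUTHENTIK_EMAIL__HOST', 'localhost'), int(os.environ.get('AUTHENTIK_EMAIL__PORT', '25')), timeout=5); print('SMTP reachable')"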
Recreating the redis container made a difference (see the snippet after the logs below). Now I have two processes constantly between 60% and 85%, both running: /usr/local/bin/python /usr/local/bin/gunicorn -c ./lifecycle/gunicorn.conf.py authentik.root.asgi:application
The system is still being built and is never under heavy load. At peak times it processes maybe 5 logins per minute.
The log of the worker container shows an endless stream of:
INF event=Task finished logger=authentik.root.celery pid=171830 state=SUCCESS task_id=cd3af25a-cae2-4caf-9080-56602967ac04 task_name=scim_signal_direct timestamp=2023-07-18T21:26:17.970770
INF event=Task started logger=authentik.root.celery pid=171831 task_id=2b35a65d-1c90-463e-84ff-6b2fa10c5b64 task_name=scim_signal_direct timestamp=2023-07-18T21:26:34.099977
INF event=Task finished logger=authentik.root.celery pid=171831 state=SUCCESS task_id=2b35a65d-1c90-463e-84ff-6b2fa10c5b64 task_name=scim_signal_direct timestamp=2023-07-18T21:26:34.119047
INF event=Task started logger=authentik.root.celery pid=171832 task_id=7a633ade-1496-44f6-964f-309dceb04577 task_name=blueprints_discovery timestamp=2023-07-18T21:26:35.894243
INF event=Task started logger=authentik.root.celery pid=171833 task_id=4daa90db-1caf-453e-b315-9fd071b43c00 task_name=clear_failed_blueprints timestamp=2023-07-18T21:26:35.977922
INF event=Task finished logger=authentik.root.celery pid=171833 state=SUCCESS task_id=4daa90db-1caf-453e-b315-9fd071b43c00 task_name=clear_failed_blueprints timestamp=2023-07-18T21:26:36.017502
INF event=Task finished logger=authentik.root.celery pid=171832 state=SUCCESS task_id=7a633ade-1496-44f6-964f-309dceb04577 task_name=blueprints_discovery timestamp=2023-07-18T21:26:36.163187
There are also 140+k notifications pending.
😒
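For reference, the redis recreation mentioned above can be done without touching the other services; a sketch assuming the documented compose file, where the service is simply named redis:

# recreate only the redis container in place
docker compose up -d --force-recreate redis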
No new ones are being added in the meantime. It looks like they are all the same as mentioned in the previous comment. I can remove them one by one, but the 'Clear All' button kicks off some other process (a DB SELECT) that seems to linger endlessly, eating the other two cores assigned to the host. I've had it running for over a day but it never finished, and the number of notifications remains the same. Could I possibly scrap those in the database or something?
For me, 'python -m manage worker' is pinned at 99% CPU. I just installed authentik yesterday and am the only user thus far. No other application is using nearly as much CPU as this.
Turning on debug logging prints nothing after the initial startup, when no activity should be occurring
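For the record, authentik's verbosity is controlled by the AUTHENTIK_LOG_LEVEL variable; a minimal sketch for raising it to trace, assuming your compose file passes AUTHENTIK_* variables through to both the server and worker services (e.g. from .env):

# in .env, then recreate the containers
AUTHENTIK_LOG_LEVEL=trace

docker compose up -d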
A couple of days later, those processes seem to have cooled down. Notifications stopped adding up around 280k. I haven't been able to clear them: the host has about 7GB of available RAM, but runs out before the SELECT statement on the database finishes... Will retry later with the other containers shut down. Will keep an eye on performance, but at the moment, clear sailing.
Same, mine also cooled off after a few hours
FWIW this just happened to me and all of the alerts appear to be stored in the authentik_events_notification table. I ended up issuing a TRUNCATE TABLE authentik_events_notification; just to nuke every notification and start fresh, but you could also issue an UPDATE and flip the seen flag from f to t.
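For anyone sitting on a six-figure backlog, the same cleanup can be run from the host; a sketch assuming the documented compose setup (service named postgresql, user and database both defaulting to authentik):

# destructive: drop every stored notification, as described above
docker compose exec postgresql psql -U authentik -d authentik -c "TRUNCATE TABLE authentik_events_notification;"

# gentler alternative: mark everything as seen instead
docker compose exec postgresql psql -U authentik -d authentik -c "UPDATE authentik_events_notification SET seen = true;"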
(FYI I went to truncate authentik_events_notification but it's empty so that can't be the problem for everyone).
Did anyone ever get to the bottom of this?
I've tried deploying Authentik several times but keep getting blocked by how poorly it performs, with the worker container pegging whatever CPU we throw at it at 100%.
This problem seems to be widespread, with both open and closed tickets.
I've also started a discussion on Discord, as this isn't getting much traction on GitHub: https://discord.com/channels/809154715984199690/1166120303341084693/1166120303341084693
I also experienced this. I turned off the worker container until the devs fix this issue.
I have a pretty basic setup for my homelab. I had already set up my apps and users, and my workflows seem to work fine without the worker running.
As per https://github.com/goauthentik/authentik/issues/7025#issuecomment-1828894097 this is still a major issue.
Any tips/tricks on how to deal with this on Unraid, running authentik via templates?
Same CPU issue... New deployment 12/18/2023, :latest of all images. It was suggested by @BeryJu that the worker will settle down after a few minutes. Not in my case. I also see a huge increase in disk activity. This makes SSH sessions, as well as ALL other running services, respond so slowly that it effectively takes them all down. I will try deploying without the worker, as the documented and suggested deployment, which includes the worker, is untenable. Where the CPU and disk I/O drop off on the graph below is where I shut the stack down.
Switched off the worker here for a couple of weeks now; waiting for a workable update.
Update: updated to release 2023.10.5. Same issue.
DKing, in the Discord discussion from https://github.com/goauthentik/authentik/issues/5746#issuecomment-1775995668, posted a link to a PR that fixed my high CPU usage: https://github.com/goauthentik/authentik/pull/7762
Reporting back (Unraid, solved): In hindsight I did 3 things, not sure which one solved it: 1) in the Unraid template I added "--ulimit nofile=10240:10240" to the Extra Parameters field as a flag (advanced view); 2) redeployed (removing containers and images) both worker and authentik; 3) added AUTHENTIK_REDIS__DB=1 as a variable in the Unraid template for both worker and authentik. Now everything seems normal.
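For non-Unraid setups, those template changes translate roughly to the following docker run flags; a sketch only, with the image tag as a placeholder and the usual AUTHENTIK_* database/redis variables omitted:

# raise the open-file limit and point the worker at a dedicated redis DB
docker run -d --name authentik-worker \
  --ulimit nofile=10240:10240 \
  -e AUTHENTIK_REDIS__DB=1 \
  ghcr.io/goauthentik/server:2023.10.5 worker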
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I reckon this should not be closed, as it is not solved...
For anyone that still has this issue, please check Jens' latest comment on #7762. We have made some improvements in 2024.2.2 that should prevent this from happening.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I too have this issue. The configuration is exactly the one from the docker compose documentation. 100% celeryd CPU usage. Vanilla and up-to-date Arch Linux install using docker-compose. AUTHENTIK_TAG:-2024.4.2
same issue for me
I have the same issue with the docker image 2024.6.0
Same issue here. AUTHENTIK_TAG=2024.6.1
Seems like there is some weird unexpected behaviour. I was able to make the worker start and to reach the initial setup endpoint after a couple of restarts and waiting about 5-10 min. The log messages did not change, but I was eventually able to start the setup process. I did not change any configs.
Describe the bug: Right after starting up my docker-compose setup, based on the given docker-compose.yml file, the worker container causes high CPU load.
To Reproduce — steps to reproduce the behavior:
1. docker-compose up
2. docker-compose top shows the worker process at high CPU (flags annotated below):
/usr/local/bin/python /usr/local/bin/celery -A authentik.root.celery worker -Ofair --max-tasks-per-child=1 --autoscale 3,1 -E -B -s /tmp/celerybeat-schedule -Q authentik,authentik_scheduled,authentik_events
3. docker-compose stop worker
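The flags on that worker command (step 2) are standard Celery options; annotated for reference:

# -Ofair                       distribute tasks fairly across pool processes
# --max-tasks-per-child=1      replace each pool process after a single task,
#                              so every task pays process-startup cost
# --autoscale 3,1              scale the pool between 1 and 3 processes
# -E                           emit task events for monitoring
# -B                           run the beat scheduler inside the worker
# -s /tmp/celerybeat-schedule  where beat persists its schedule
# -Q authentik,authentik_scheduled,authentik_events   queues consumed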
Expected behavior: I would expect the system tasks not to fire every second or continuously, and not to consume so much CPU.
Screenshots: ![image](https://github.com/goauthentik/authentik/assets/5604131/ba0ecbeb-a465-4e89-a568-f57c5c9e503b)
Logs: docker compose top, docker compose logs worker
Version and Deployment (please complete the following information):
Additional context: I tested some other "fixes" from other (already closed) issues, like downgrading to 2023.2.1, applying a patch of a dev version, etc. Nothing works.