goauthentik / authentik

The authentication glue you need.
https://goauthentik.io

Redis Cluster or other mechanisms available as ref: #3979 #5531

Open gcarrarom opened 1 year ago

gcarrarom commented 1 year ago

Is your feature request related to a problem? Please describe.

It frustrates me that Redis Cluster is not an option and that Authentik goes offline, even if only briefly, when I patch my hosts.

Describe the solution you'd like

Allow the Redis client used by authentik to handle "MOVED" responses.

Describe alternatives you've considered

Use other tools such as rmq, as #3979 suggests, and make it available in your helm chart.

BeryJu commented 1 year ago

#5395 will add support for all kinds of Redis connections.

gcarrarom commented 1 year ago

That's great! I will watch it closely, thanks!

PKizzle commented 1 year ago

Sadly, Redis Cluster is not supported by several of authentik's dependencies. I have started to fix several bugs, but it now looks like major parts of those dependencies need to be rewritten to work correctly. That would also make this project harder to maintain, as any dependency update may break my workarounds. The best option would be to open PRs against the respective projects and wait for them to be merged. However, seeing what happened to https://github.com/celery/kombu/pull/1021, my hopes are not very high. I will try to figure out a way to make it "work", though there may be a lot of small bugs remaining, as I am far from a Redis Cluster expert.

gcarrarom commented 1 year ago

That's unfortunate. Redis is the only piece of an Authentik setup that is not highly available today. Even though it's a small portion, my authentik instance fails every so often when a node upgrade in my cluster hits the node running the Redis master instance. What is your recommendation for a highly available instance of Authentik?

PKizzle commented 1 year ago

This is currently not really possible. While you could run multiple Authentik instances all connected to, e.g., the same LDAP server, I don't think that will work with the same Redis servers (but I have not tested that personally). For official support of this configuration you will have to wait for Redis HA support in the kombu, celery and Django Redis Channels projects.

PKizzle commented 1 year ago

@gcarrarom You might want to check out the current state of the Redis Cluster support. All unit tests pass, and it works in my dev environment in my own testing. However, I have not yet tested how well the system responds to "MOVED". Maybe this is something you could give me feedback on? Have a look at the images available on Docker Hub if you are interested.
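For context on the "MOVED" case mentioned above: per the Redis Cluster specification, each key maps to one of 16384 hash slots (CRC16 of the key, or of its hash tag, modulo 16384), and a node that does not own the slot replies with an error of the form "MOVED <slot> <host>:<port>". The Python sketch below illustrates both pieces. `key_slot`, `parse_moved` and `Moved` are hypothetical names chosen for illustration, not authentik or redis-py APIs, and the sketch leaves out ASK redirects and slot-map refreshes that a real client would also need.

```python
import binascii
from typing import NamedTuple, Optional

NUM_SLOTS = 16384  # fixed number of hash slots in Redis Cluster


class Moved(NamedTuple):
    slot: int
    host: str
    port: int


def key_slot(key: bytes) -> int:
    """Return the cluster hash slot for a key (CRC16/XMODEM mod 16384)."""
    # Hash-tag rule: if the key contains "{...}" with at least one byte
    # between the first "{" and the next "}", only that part is hashed.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


def parse_moved(error: str) -> Optional[Moved]:
    """Parse 'MOVED <slot> <host>:<port>'; return None for anything else."""
    parts = error.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    if not parts[1].isdigit() or not port.isdigit():
        return None
    return Moved(int(parts[1]), host, int(port))
```

A client that receives such a reply is expected to retry the command against the node named in it and update its slot map; testing that path typically means forcing a slot to move (for example during a failover) while requests are in flight.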

gcarrarom commented 1 year ago

Hi @PKizzle, sorry, somehow your message got lost among all the notifications. I'll test it out today and report back! Thanks a bunch for the work on that!

EDIT: I've tried the deployment using the Redis cluster and posted the error message on your draft PR so you can check it: #7118

PKizzle commented 1 year ago

@gcarrarom The error might be caused by a faulty configuration. Could you provide some more details about your setup?

gcarrarom commented 12 months ago

Hi PKizzle, sure. I'm using k3s and deploying using the official helm chart. I'm using everything default and an external postgres database. I'm also using 3 workers and 3 replicas for the authentik pods.

PKizzle commented 12 months ago

May I ask how you configured the helm chart to use Redis cluster? Did you set the environment variable AUTHENTIK_REDIS__URL to something like redis+cluster://${USERNAME}:${PASSWORD}@${HOST_1}:${PORT_1},${HOST_2}:${PORT_2},${HOST_X}:${PORT_X}/0 ?

gcarrarom commented 12 months ago

Great question! I was just using the default setup

redis:
  architecture: replication

Which then seems to set it up like this:

kgsec authentik -o yaml | yq '.data.AUTHENTIK_REDIS__HOST | @base64d'
authentik-redis-master

On the changed version I had just changed the hostname, but hadn't added the redis+cluster protocol prefix. I'll try to do so by the end of the week with the newest version from your PR.

PKizzle commented 12 months ago

You will need to use AUTHENTIK_REDIS__URL. The old configuration (e.g. AUTHENTIK_REDIS__HOST) is only valid for a single Redis instance and will not be supported for any other Redis setup, as it lacks flexibility. If the URL is set, it replaces the other variables.
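To make the URL shape concrete, here is a minimal Python sketch that assembles a value in the form quoted earlier in this thread (redis+cluster://USER:PASSWORD@HOST_1:PORT_1,HOST_2:PORT_2/0). `build_cluster_url` is a hypothetical helper, not part of authentik. Percent-encoding the credentials is an assumption: it is the usual convention for URLs, but the thread does not say whether the parser in the PR decodes them.

```python
from urllib.parse import quote


def build_cluster_url(nodes, username="", password="", db=0):
    """Assemble a redis+cluster URL from (host, port) pairs.

    Hypothetical helper for illustration only. Credentials are
    percent-encoded so that characters such as '@' or ':' in a
    password cannot be mistaken for URL delimiters (assumption: the
    consumer decodes them again).
    """
    hosts = ",".join(f"{host}:{port}" for host, port in nodes)
    auth = ""
    if username or password:
        auth = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    return f"redis+cluster://{auth}{hosts}/{db}"


if __name__ == "__main__":
    url = build_cluster_url(
        [("redis-0.example", 6379), ("redis-1.example", 6379)],
        username="authentik",
        password="p@ss",
    )
    print(url)
```

The resulting string would then be supplied as AUTHENTIK_REDIS__URL; the host names above are placeholders.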

vodanet commented 2 months ago

I have set up a Redis Cluster in my Kubernetes environment. The cluster appears to be healthy based on the following output:

Redis Cluster Info:

cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:10
cluster_my_epoch:1
cluster_stats_messages_ping_sent:11781
cluster_stats_messages_pong_sent:11488
cluster_stats_messages_publish_sent:10
cluster_stats_messages_auth-ack_sent:1
cluster_stats_messages_update_sent:2
cluster_stats_messages_sent:23282
cluster_stats_messages_ping_received:11488
cluster_stats_messages_pong_received:11780
cluster_stats_messages_fail_received:2
cluster_stats_messages_publish_received:6
cluster_stats_messages_auth-req_received:1
cluster_stats_messages_received:23277
total_cluster_links_buffer_limit_exceeded:0

Authentik Helm Chart Configuration:

authentik:
  env:
    AUTHENTIK_REDIS__URL: "redis+cluster://password@redis-cluster-0.redis-cluster.svc.cluster.local:6379,redis-cluster-1.redis-cluster.svc.cluster.local:6379,redis-cluster-2.redis-cluster.svc.cluster.local:6379/0"
  secret_key: "secret_key-value"  
  error_reporting:
    enabled: true

  postgresql:
    host: "authentik-postgresql.default.svc.cluster.local"
    user: "authentik"
    password: "postgre_password"
    database: "authentik"
    port: 5432

server:
  replicas: 3

postgresql:
  enabled: false

worker:
  replicas: 2

Issue:

The Authentik server pods are returning the following error:

{"error":"dial unix /dev/shm/authentik-core.sock: connect: no such file or directory","event":"failed to proxy to backend","level":"warning","logger":"authentik.router","timestamp":"2024-09-24T04:32:43Z"}
{"error":"websocket: bad handshake","event":"failed to connect websocket","level":"warning","logger":"authentik.outpost.ak-api-controller","timestamp":"2024-09-24T04:32:43Z"}

Question:

Is the current Redis configuration in my Authentik Helm chart incorrect, or is there another underlying issue that might be causing the problem? If this configuration is incorrect, how should it be adjusted to properly integrate with the Redis Cluster?

I just want to make authentik HA in my cluster. :)
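One detail worth checking in the URL above, offered as a hedged observation rather than a diagnosis: it is written as password@host, whereas the format quoted earlier in the thread is USERNAME:PASSWORD@host. Under standard URL parsing rules, a single token before the "@" is read as the username, not the password. The Python sketch below shows how the standard library interprets both forms. Whether the parser in the PR behaves the same way is an assumption, and the thread does not establish the cause of the errors shown. `split_credentials` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def split_credentials(url: str):
    """Return (username, password) as Python's URL parser reads them."""
    parts = urlsplit(url)
    return parts.username, parts.password


if __name__ == "__main__":
    # Host names are placeholders.
    print(split_credentials("redis+cluster://password@h1:6379,h2:6379/0"))
    print(split_credentials("redis+cluster://:password@h1:6379,h2:6379/0"))
```

In the first form the token is reported as the username and no password is found; in the second form, with a leading colon, it is reported as the password.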