nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0

Allow multiple users for cluster authorization #3490

Closed cspalding closed 1 year ago

cspalding commented 1 year ago

Feature Request

Use Case:

Slack conversation for reference: https://natsio.slack.com/archives/C069GSYFP/p1663871439100069.

I'm setting up a NATS cluster and I need to be able to rotate credentials from time to time. I'm running NATS on Kubernetes via Helm. Currently, NATS only allows a single user for cluster authorization. Because of this constraint, I am forced to accept downtime whenever I rotate my cluster credentials. While I can rotate reasonably quickly to minimize downtime, the fact that there's downtime at all means that I have to make a particularly detailed plan for cred rotation.

If it were possible to use multiple users for cluster authorization, I wouldn't need any downtime to rotate credentials! That would be really nice and it would reduce my stress substantially!

Proposed Change:

Allow the cluster authorization config to define multiple users instead of a single user.

Who Benefits From The Change(s)?

Anyone who:

  1. runs NATS in clustered mode, and
  2. needs a mechanism to rotate their credentials, and
  3. doesn't want to trigger a reconnect between NATS servers while they rotate creds

Alternative Approaches

I'm not familiar with alternative approaches to solve this problem. It's possible that we could use something like a super cluster for this use case? If we had a super cluster made up of 2 NATS clusters we might be able to treat the second cluster as a 'fail-over'/cred-rotation cluster. If that might work I'm open to input but I haven't thought it through all the way.

caleblloyd commented 1 year ago

I think this 2-phase update works. The example changes the password from pwd to pwd2.

NATS Config:

Phase 1:

cluster {
  port: 6222
  name: nats
  authorization {
    user: foo
    password: pwd2
  }
  routes: [
    nats://foo:pwd@nats-0.nats.default.svc.cluster.local:6222,
    nats://foo:pwd@nats-1.nats.default.svc.cluster.local:6222,
    nats://foo:pwd@nats-2.nats.default.svc.cluster.local:6222,
    nats://foo:pwd2@nats-0.nats.default.svc.cluster.local:6222,
    nats://foo:pwd2@nats-1.nats.default.svc.cluster.local:6222,
    nats://foo:pwd2@nats-2.nats.default.svc.cluster.local:6222
  ]
}

Send config reload signal to each server

Phase 2:

cluster {
  port: 6222
  name: nats
  authorization {
    user: foo
    password: pwd2
  }
  routes: [
    nats://foo:pwd2@nats-0.nats.default.svc.cluster.local:6222,
    nats://foo:pwd2@nats-1.nats.default.svc.cluster.local:6222,
    nats://foo:pwd2@nats-2.nats.default.svc.cluster.local:6222
  ]
}

Send config reload signal to each server
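
A minimal sketch of why the two phases have to run in this order. It uses a deliberately simplified model that is an assumption rather than a description of NATS internals: each server accepts exactly one cluster password, dials its peers with the passwords embedded in its routes, and a route between two servers forms if either side can authenticate to the other. The names `OLD`, `PHASE1`, `PHASE2`, `can_route`, and `fully_connected` are hypothetical.

```python
from itertools import combinations

# Simplified model (an assumption, not NATS internals): "accepts" is the
# password in cluster.authorization, "dials_with" is the set of passwords
# embedded in that server's route URLs.
OLD = {"accepts": "pwd", "dials_with": {"pwd"}}              # before rotation
PHASE1 = {"accepts": "pwd2", "dials_with": {"pwd", "pwd2"}}  # after phase 1 reload
PHASE2 = {"accepts": "pwd2", "dials_with": {"pwd2"}}         # after phase 2 reload


def can_route(a, b):
    """A route forms if either server can authenticate to the other."""
    return b["accepts"] in a["dials_with"] or a["accepts"] in b["dials_with"]


def fully_connected(servers):
    """True if every pair of servers in the list can form a route."""
    return all(can_route(a, b) for a, b in combinations(servers, 2))


if __name__ == "__main__":
    # Mid phase 1: a mix of old and phase-1 servers stays connected.
    print(fully_connected([OLD, OLD, PHASE1]))
    # Mid phase 2: a mix of phase-1 and phase-2 servers stays connected.
    print(fully_connected([PHASE1, PHASE2, PHASE2]))
    # Skipping phase 1: an old server cannot reach phase-2 servers.
    print(fully_connected([OLD, PHASE2, PHASE2]))
```

Under this model, any mix of old and phase 1 servers, or of phase 1 and phase 2 servers, stays fully connected, while an old server and a phase 2 server cannot reach each other. That is why phase 1 should be rolled out to every server before phase 2 starts.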

Helm Config

Phase 1:

cluster:
  enabled: true
  replicas: 3
  noAdvertise: false

  # Explicitly set routes for clustering.
  # When JetStream is enabled, the serverName must be unique in the cluster.
  extraRoutes:
  - nats://foo:pwd@nats-0.nats.default.svc.cluster.local:6222
  - nats://foo:pwd@nats-1.nats.default.svc.cluster.local:6222
  - nats://foo:pwd@nats-2.nats.default.svc.cluster.local:6222

  authorization:
    user: foo
    password: pwd2

Perform Helm upgrade and wait for it to complete

Phase 2:

cluster:
  enabled: true
  replicas: 3
  noAdvertise: false

  # Explicitly set routes for clustering.
  # When JetStream is enabled, the serverName must be unique in the cluster.
  extraRoutes: []

  authorization:
    user: foo
    password: pwd2

Perform Helm upgrade and wait for it to complete
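
The route lists in both the NATS and Helm examples can be generated instead of edited by hand. The sketch below is one way to do that; `route_urls` and `HOSTS` are hypothetical names, and it assumes the user and passwords contain no characters that would need escaping inside a URL.

```python
# Hostnames taken from the example configs above.
HOSTS = [
    "nats-0.nats.default.svc.cluster.local",
    "nats-1.nats.default.svc.cluster.local",
    "nats-2.nats.default.svc.cluster.local",
]


def route_urls(hosts, user, passwords, port=6222):
    """Build one route URL per (password, host) pair, grouped by password.

    Assumes user and passwords are URL-safe; nothing is escaped here.
    """
    return [
        f"nats://{user}:{password}@{host}:{port}"
        for password in passwords
        for host in hosts
    ]


if __name__ == "__main__":
    # Phase 1 routes for the plain NATS config: old and new password.
    for url in route_urls(HOSTS, "foo", ["pwd", "pwd2"]):
        print(url)
    # Phase 1 extraRoutes for the Helm values: old password only.
    for url in route_urls(HOSTS, "foo", ["pwd"]):
        print("- " + url)
```

Passing `["pwd", "pwd2"]` reproduces the six phase 1 routes of the NATS config, and passing `["pwd"]` reproduces the three phase 1 `extraRoutes` entries of the Helm values.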

caleblloyd commented 1 year ago

I think we could make this a little easier with the Helm chart. I opened this issue to track it:

https://github.com/nats-io/k8s/issues/574

cspalding commented 1 year ago

Awesome, thanks a ton Caleb! I appreciate the help and the example.