StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0

Feature Proposal - New Auth Backend - st2-auth-backend-fallback #4216

Open nmaludy opened 6 years ago

nmaludy commented 6 years ago
SUMMARY

I'm proposing implementing a new auth backend called st2-auth-backend-fallback that will allow users to configure fallback auth mechanisms in case primary authentication fails. This capability allows users to still authenticate with StackStorm in case of an outage of the primary authentication mechanism (think LDAP or Keystone).

ISSUE TYPE

USE CASE

The primary use case I'm trying to solve is this: when StackStorm uses a centralized authentication backend such as LDAP or Keystone and that primary service is down, you can no longer authenticate with StackStorm. In this case, it would be nice to be able to "fall back" to another auth backend, such as flatfile with a local htpasswd file. This way we can still auth with something like st2admin and continue our work, potentially remediating the LDAP server outage.

IMPLEMENTATION THOUGHTS

The st2-auth-backend-fallback would build on existing auth-backends in its implementation. The existing auth-backends would be installed as usual. The "fallback" auth-backend would be configured to create instances of these backends and then invoke them in a defined order. If auth succeeds in any one of the backends, then overall auth succeeds. If auth fails on all backends, then overall auth fails.
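To make the ordering concrete, here is a minimal sketch of the chaining logic described above. It is not the actual st2 backend API: the class names (StaticBackend, UnreachableBackend, FallbackBackend) are hypothetical stand-ins, and the only thing assumed about a real backend is that it exposes an authenticate(username, password) method returning a boolean. Treating an exception from one backend as "try the next one" is also an assumption of this sketch.

```python
class StaticBackend:
    """Hypothetical stand-in for an installed backend such as flat-file."""

    def __init__(self, users):
        self._users = dict(users)

    def authenticate(self, username, password):
        # True only when the user exists and the password matches.
        return self._users.get(username) == password


class UnreachableBackend:
    """Hypothetical stand-in for a backend whose server is down."""

    def authenticate(self, username, password):
        raise ConnectionError("primary auth service unreachable")


class FallbackBackend:
    """Invokes the wrapped backends in order; the first success wins."""

    def __init__(self, backends):
        self._backends = list(backends)

    def authenticate(self, username, password):
        for backend in self._backends:
            try:
                if backend.authenticate(username, password):
                    return True
            except Exception:
                # Assumption: an outage counts as a failed attempt,
                # so the next backend in the list gets a chance.
                continue
        # Every backend rejected the credentials or was unavailable.
        return False


chain = FallbackBackend([
    UnreachableBackend(),
    StaticBackend({"st2admin": "secret"}),
])
result = chain.authenticate("st2admin", "secret")
```

Note that this sketch falls through on any failure, including a plain "wrong password" from the first backend. Falling through only when a backend is unreachable would be a stricter variant.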

auth-backend kwargs as YAML (set in the config below, but this allows you to see the structure):

backend_kwargs:
  - backend: st2-auth-backend-flat-file
    kwargs:
      file_path: "/path/to/.htpasswd"
  - backend: st2-auth-backend-ldap
    kwargs:
      bind_dn: cn=user,dc=example,dc=com
      bind_pw: bind_password
      group:
        base_dn: ou=groups,dc=example,dc=com
        scope: subtree
        search_filter: (&(cn=st2access)(memberUid={username}))
      ldap_uri: ldap://ldap.example.com
      use_tls: true
      user:
        base_dn: ou=users,dc=example,dc=com
        scope: onelevel
        search_filter: (uid={username})

The same settings in st2.conf:

[auth]
mode = standalone
backend = fallback
backend_kwargs = [{"backend": "st2-auth-backend-flat-file", "kwargs": {"file_path": "/path/to/.htpasswd"}}, {"backend": "st2-auth-backend-ldap", "kwargs": {"use_tls": true, "group": {"scope": "subtree", "search_filter": "(&(cn=st2access)(memberUid={username}))", "base_dn": "ou=groups,dc=example,dc=com"}, "user": {"scope": "onelevel", "search_filter": "(uid={username})", "base_dn": "ou=users,dc=example,dc=com"}, "bind_pw": "bind_password", "ldap_uri": "ldap://ldap.example.com", "bind_dn": "cn=user,dc=example,dc=com"}}]
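Since the backend_kwargs value has to be a single line of JSON, it is easy to get wrong by hand. A minimal sketch of generating it from the nested structure, using only the Python standard library, is below. The helper name to_conf_value is hypothetical, and the structure simply mirrors the example above.

```python
import json

# Same structure as the YAML example, written as Python data.
backend_kwargs = [
    {
        "backend": "st2-auth-backend-flat-file",
        "kwargs": {"file_path": "/path/to/.htpasswd"},
    },
    {
        "backend": "st2-auth-backend-ldap",
        "kwargs": {
            "bind_dn": "cn=user,dc=example,dc=com",
            "bind_pw": "bind_password",
            "group": {
                "base_dn": "ou=groups,dc=example,dc=com",
                "scope": "subtree",
                "search_filter": "(&(cn=st2access)(memberUid={username}))",
            },
            "ldap_uri": "ldap://ldap.example.com",
            "use_tls": True,
            "user": {
                "base_dn": "ou=users,dc=example,dc=com",
                "scope": "onelevel",
                "search_filter": "(uid={username})",
            },
        },
    },
]


def to_conf_value(entries):
    """Serialize the backend list to one line of JSON (hypothetical helper)."""
    return json.dumps(entries)


conf_line = "backend_kwargs = " + to_conf_value(backend_kwargs)
print(conf_line)
```

Parsing the value back with json.loads is a quick way to check that the line in the config file still matches the intended structure.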

Kami commented 6 years ago

Thanks for proposing this idea.

From a high-level perspective it seems reasonable, but after some more thought I'm a bit skeptical about it.

My main concern is security (i.e. when not implemented and configured correctly, it would decrease the aggregate security).

If you have two backends, where the second, fallback one is potentially less secure, a bad actor could, with that knowledge, force authentication against the second, less secure backend (e.g. by DDoSing the primary backend or similar).

I know security is all about layers (there should also be rate limiting in place, the second fallback should also use secure credentials, etc.), but I'm still worried that this second backend would give users a false sense of security and that in reality it would decrease the overall security.

You could also argue that this increases the attack surface (and in general anything that increases the attack surface is usually not a good thing) and also makes it less secure because the operator now needs to keep credentials in two backends up to date, etc.

I would argue that we should only support a single auth backend and it's up to the user to set up HA for that backend.

If we do decide to implement this at some point, we need to keep this in mind and be very explicit about the potential negative security implications in the documentation.

There were many scenarios in the past where bad actors abused a similar architecture in which there were multiple ways to authenticate with a service (TODO: need to dig out some references).

nmaludy commented 6 years ago

@Kami I understand where you're coming from, however I'm thinking through some other systems that we work with. Every one of them (except StackStorm) has "local accounts" that can be used as fallbacks in case LDAP is broken. These include:

This "fallback" feature might also allow users to take advantage of multiple centralized auth providers like "LDAP first, Keystone second, etc".

cognifloyd commented 3 years ago

I'm standing up a new stackstorm-ha (k8s) cluster. It is supposed to have LDAP-based auth, and I can't add st2admin and similar service accounts to LDAP. For a cluster that's already up and running, I can create API keys for the relevant service accounts with something like st2 apikey create -u st2admin, and then those service accounts only ever auth via API key.

But a fresh st2 install does not have any API keys yet, so I can't tell the jobs created by Helm to use such an API key. So, somehow I would have to create the cluster with flatfile configured in one version of st2.conf, and then update it (i.e. redeploy everything, since the conf file is on every pod) to use the new st2.conf with LDAP configured. Essentially, I need the cluster to be up before I can bring the cluster up. Chicken/egg.

If I had fallback auth, then the chicken/egg ordering issue goes away.

arm4b commented 3 years ago

@cognifloyd You can prepare the API keys beforehand and import them in the actual prod deployment (see https://docs.stackstorm.com/authentication.html#api-key-migration ). This is supported by the Helm chart itself too.

cognifloyd commented 3 years ago

Yes, but the apikey load job needs to authenticate with st2 to load the API keys, so it's a bit of a catch-22.

nzlosh commented 3 years ago

Isn't the auth enable/disable a simple config switch with a reload? Why not just disable auth during the API key deployment, then enable it once the key is deployed?

Alternative option: export the Mongo collection beforehand and import it as part of the deployment, which would then depend on Mongo security.

arm4b commented 3 years ago

Yeah, all these are workarounds. The fallback auth backend for "local" st2 operation makes more sense to me.