[DISCUSSION]: Potential Authentication Architecture

rwinstanley1 commented 4 years ago

:scroll: Description

This issue is to discuss the current design for the authentication solution and potential changes that could be made to streamline the process.

Current Design: [image: safe-haven-user-login-architecture-Current Design]

Potential Future Design: [image: safe-haven-user-login-architecture-Potential Architecture]

:page_facing_up: Tasks

Include specific tasks (if any) in the order in which they need to be done.

- [ ] Discuss the positives and negatives of each option, as well as any other options or combinations
- [ ] Decide on a 'final' design to take forward

JimMadge commented 4 years ago

Summary of Guacamole Design

A single Guacamole VM (Linux) can replace the RDG, NPS and RDS (Windows) servers
Guacamole can support a variety of authentication back ends; here we propose using an external LDAP (KeyCloak)
KeyCloak supports AD federation, which means users could be imported from an existing AD
MFA could be handled by Guacamole (Option 1) or KeyCloak (Option 2)
The Guacamole portal allows admin users to log in and change Guacamole settings; an app firewall can be used to restrict access to trusted IP addresses
- We could also use fail2ban to prevent (or at least severely hamper) brute-force attempts
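
As a rough illustration of the "trusted IP addresses" idea, here is a minimal sketch of an allowlist check. The networks below are documentation ranges chosen purely as placeholders, and `is_trusted` is a hypothetical helper; in practice this check would live in the app firewall rather than in our own code.

```python
import ipaddress

# Hypothetical allowlist. These are reserved documentation ranges,
# standing in for whatever networks the admins decide to trust.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_trusted(client_ip: str) -> bool:
    """Return True if client_ip parses and falls inside a trusted network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is rejected outright.
        return False
    return any(address in network for network in TRUSTED_NETWORKS)
```

Anything outside the listed networks, or anything that does not parse as an address, is refused.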

JimMadge commented 4 years ago

After some research, it looks like this might not be as straightforward as our proposals above.

KeyCloak does not expose an LDAP server. Instead, KeyCloak can federate LDAP servers to import users and map LDAP attributes to the KeyCloak user definition model.
To connect to a machine via Guacamole, you must authenticate with Guacamole and have the corresponding connection defined.
- Authentication can happen through OpenID Connect or SAML, both of which KeyCloak provides. However, neither of these methods can provide connection data.
- You can authenticate with Guacamole using LDAP. This does provide connection data.
- Connection data may also be specified in a database.

If we want to store the connection information and users in a single place, LDAP seems to be the best (only?) choice. In this case, we could also use KeyCloak for Guacamole authentication through OpenID Connect, which might have some benefits (MFA). However, we would also need to link Guacamole directly to the LDAP server to fetch the connection data.
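
To make the split concrete, here is a minimal sketch of the two lookups: identity comes from OpenID Connect claims, while connections come from the directory. Everything here is a stand-in: `DIRECTORY` is a plain dict playing the role of the LDAP server, the `Connection` fields are illustrative, and the claims are assumed to have been verified already. `preferred_username` is a standard OpenID Connect claim, but whether we would key on it is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One remote desktop connection a user is allowed to open."""
    name: str
    protocol: str
    hostname: str


# Stand-in for the LDAP directory: username -> connections (made-up data).
DIRECTORY = {
    "alice": [Connection("research-desktop", "rdp", "10.0.0.5")],
}


def username_from_claims(claims: dict) -> str:
    """Identity step: read the username from already-verified OIDC claims."""
    return claims["preferred_username"]


def connections_for(username: str, directory: dict) -> list:
    """Connection step: a separate lookup, since OIDC carries no connection data."""
    return list(directory.get(username, []))
```

The point is only that the two steps hit different systems: a user who authenticates successfully but has no directory entry gets an empty connection list.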

jemrobinson commented 3 years ago

Here's another option

Authelia as the outside level of authentication requiring a username/password + TOTP
Reads the users from an LDAP server (e.g. OpenLDAP - perhaps this Docker image?) which the VMs can also use for user authentication
Have Authelia authenticate with Guacamole using the HttpAuth method (where it sends the name of the authenticated user) and let Guacamole deal with the user <-> connection mappings using a local database

This has the advantage that the only internet-facing piece of infrastructure is Authelia, which is explicitly designed with security in mind.
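
A minimal sketch of how the header-based hand-off could work, assuming the proxy in front of Guacamole forwards the authenticated username in a `Remote-User` header (the header name, the helper functions and the `LOCAL_MAPPINGS` table are all assumptions for illustration, not Guacamole or Authelia code). The important detail is that any identity header supplied by the client must be dropped before the proxy adds its own.

```python
from typing import Dict, List, Mapping, Optional

USER_HEADER = "Remote-User"  # assumed header name set by the auth proxy

# Stand-in for Guacamole's local database of user <-> connection mappings.
LOCAL_MAPPINGS: Dict[str, List[str]] = {
    "alice": ["research-desktop"],
}


def strip_client_identity(headers: Mapping[str, str]) -> Dict[str, str]:
    """Drop any identity header sent by the client, so it cannot be spoofed."""
    return {k: v for k, v in headers.items() if k.lower() != USER_HEADER.lower()}


def authenticated_user(headers: Mapping[str, str]) -> Optional[str]:
    """Return the username asserted by the proxy, or None if it is missing."""
    for key, value in headers.items():
        if key.lower() == USER_HEADER.lower() and value.strip():
            return value.strip()
    return None


def allowed_connections(headers: Mapping[str, str]) -> List[str]:
    """Look up connections for the proxied user; unknown users get none."""
    user = authenticated_user(headers)
    if user is None:
        return []
    return list(LOCAL_MAPPINGS.get(user, []))
```

This only works safely if the backend is unreachable except through the proxy, which matches the "only internet-facing piece" property described above.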

martintoreilly commented 3 years ago

I'm still quite nervous about us being responsible for the correct configuration, defence and up-to-date patching of the identity and access management (IAM) service in a VM or container. I recognise we are doing that at the moment with the Domain Controller and NPS VMs, and there is an argument that the bundled desired state configuration for the DCs is much more opaque than we would like. However, my instinct here would be to rely on the IAM service provided by each underlying cloud platform but interface with it using a common set of standard protocols (e.g. OAuth 2, LDAP, RADIUS). I know that, at least for Azure AD, this means more manual configuration than we would like, and I'd also like us to be able to avoid this if possible, or at least mitigate it with e.g. ongoing configuration validation. Thoughts?

JimMadge commented 3 years ago

I like the idea that we would support a standard protocol for authentication/user management (LDAP seems like the best way to have a single authority which can be integrated with an RDP portal, webapps, VMs). This way it can be up to the admins which LDAP implementation they choose to use (including the option to use an existing LDAP).

Including scripts to deploy an OpenLDAP instance feels like a good way to reduce the number of decisions and manual steps anyone deploying a safe haven will have to make.

@martintoreilly What is it about the IAM aspect that makes you nervous about taking over responsibility?

jemrobinson commented 3 years ago

Another point that we should be thinking about is: "how easy is this to port to AWS?". We're getting a lot of requests from organisations who want to take this Safe Haven and deploy it using another provider (usually AWS). Using something that is not tied to Azure would make this easier.

We should also note that the current way that we use Azure Active Directory is extremely confusing for first-time deployers. People are not used to using a standalone AAD for authentication that is different from the AAD that is used for deploying infrastructure. Also, deleting an AAD is still quite painful, which makes it difficult to securely delete users' personal data (e.g. phone number) at the end of a project.

jemrobinson commented 3 years ago

Another thought: if we use OpenLDAP then the LDAP server doesn't need to have any internet access at all, whereas currently our domain controller has several roles (NTP server, DNS server, domain join server, Azure AD connect server) which means that it does need internet access. It's possible that OpenLDAP might actually be more secure!

@martintoreilly / @JimMadge : What do you think about using Authelia as the primary internet-facing entry point for this system? It can interface to an LDAP server (which could be OpenLDAP or Azure AD) for username/password authentication and has inbuilt MFA. It's also specifically designed to be a secure, web-based authentication provider, which is not true for e.g. Remote Desktop or Guacamole.

It looks like we could do the following:

we create users with username, and known email address
they click on the reset password link to set their own password (using verification through their known email address)
they then set their own second factor which could be TOTP/Yubikey/Duo (an easy way for us to check off Yubikey support for Tier-3, which we'd wanted anyway).
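
For reference, the TOTP option in the last step is small enough to sketch with the standard library. This is an illustration of the algorithm from RFC 6238 (HMAC-SHA1, 30 second step), not Authelia's implementation; the `verify` helper and its one-step tolerance window are assumptions.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 time-based one-time password using HMAC-SHA1."""
    counter = unix_time // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks
    # a 4-byte window, and the top bit of that window is masked off.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, unix_time: int, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps either side."""
    step = 30
    return any(
        hmac.compare_digest(
            totp(secret, unix_time + drift * step).encode(), submitted.encode()
        )
        for drift in range(-window, window + 1)
    )
```

With the RFC 6238 test secret `b"12345678901234567890"` and time 59, `totp(..., digits=8)` gives `94287082`, matching the published SHA-1 test vector.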

JimMadge commented 3 years ago

I think simplifying the structure and making the safe haven more modular (users' choice of LDAP implementation, stopping one service holding multiple roles) are both big positives. That said, I think the larger task of porting to AWS would be translating all of the scripts to Terraform/Ansible, then translating Azure Terraform to AWS Terraform.

I'll have a closer look at Authelia because, although I feel like Guacamole is a better solution for remote desktop, as we have talked about before Guacamole/LDAP doesn't solve the problem of how to create users without sending credentials to them.

Regarding not exposing LDAP to the internet, I think I agree this should be more secure. It sounds like Martin thinks there might be other problems with taking on the responsibility to manage it 'ourselves' but my intuition is that we should be most concerned about anything which can be reached publicly.

JimMadge commented 3 years ago

Possible architecture using:

Guacamole for RDP
Authelia for authentication/MFA/password reset
Any LDAP implementation as a user database

[image: safe-haven-user-login-architecture (1)]

Note:

Authelia's authentication flow involving the reverse proxy is explained in the documentation.

rwinstanley1 commented 3 years ago

@JimMadge Is this issue still needed? Or have we moved the discussion past this now?

JimMadge commented 3 years ago

> @JimMadge Is this issue still needed? Or have we moved the discussion past this now?

I think this is worth keeping open for when we come to implementing (or not) a new authentication system.

JimMadge commented 2 years ago

Closing as stale. Could revisit in a discussion for v4.

alan-turing-institute / data-safe-haven