Closed kradalby closed 6 days ago
Followup on @micolous comment on https://github.com/juanfont/headscale/pull/2020#issuecomment-2393061100
@micolous I highly appreciate that you have spent time helping me research and get this right, I need some time before I can sit down properly, but I will try to start researching and think about solutions, I've answered some of the parts where I have some ideas already.
I've been poking at this in another branch, based on this PR: https://github.com/micolous/headscale/tree/oidc-takeover-suggestions:
email_verified
field by default, reject accounts that have already migrated with a different sub
, and attempt to match on email (though the match on email won't work per notes below): https://github.com/juanfont/headscale/commit/27856fbf43678bab889f9ec930820a49f0c94815gorm
returned an error. I haven't tried to figure out if there's a way to exploit this. These should be updated similar to https://github.com/micolous/headscale/commit/ee7f917722cb8da3d9d05e7f00730ce8d454c712Consider that a "work in progress", feel free to incorporate things into this PR. 😄
I've now realised a few more problems which make some of my suggestions (including in that branch) infeasible:
User.Email
, because it is unset for all users created before #2020.oidc.strip_email_domain
setting, it would invalidate all your existing user accounts, so we can't use that to migrate accounts.I think the auto-migration will have to be:
strip_email_domain
setting, with the same logic as before #2020.map_legacy_users: false
(the default), panic if the strip_email_domain
option is set explicitly (we don't want to use that if not mapping!)allowed_domains
contain exactly 1 entry if strip_email_domain: true
(the best we can do to lock this down)email_verified
OIDC claim to be true
(it's not much...)map_legacy_users: true
and there's no account with the sub
, attempt to get user by username in the old way (taking into account strip_email_domain
and ignoring accounts where sub
is already recorded).preferred_username
There's still ways this could fail or be exploited, but it's at least something.
I'm replying a bit before I have managed to grasp everything as I am not well versed in these topics but I am trying to learn fast, a couple of notes and some comments:
I've been poking at this in another branch, based on this PR: micolous/headscale@oidc-takeover-suggestions
This pr and the proposed migration, with reintroducing strip_email_domain
sounds like the most sensible way right now.
I dont want neither the migration code, option nor strip email around for long, so I propose that when we get this sorted, we release 0.24.0 with this code, then everyone who uses OIDC should use this release to migrate.
Then main will remove all this code, and 0.25.0 will contain no migration code and represent the state we want to be the new "base" for OIDC, that we then can build upon.
if an account does not have a Headscale password set
Headscale does not have any passwords, I am not sure if I am missing something from https://github.com/juanfont/headscale/pull/2170#discussion_r1791043244
I wouldn't modify the preferred_username in any way – it's intended to be used as-is.
As for policy preferences, Kanidm defaults to using an SPN that looks like an email address (eg: user@kanidm.example.com), but is probably not an actual email address. This can be changed on an application configuration level with kanidm system oauth2 prefer-short-username $client_id, which makes Kanidm return the Person's name field as a preferred_username (eg: user) 1.
That is find I think, an important goal with this work tho (eliminating strip_email_domain
) is to ensure all users "DisplayName" which is used in headscale nodes list
for owners, in headscale users
and in the Policy file must include an @
. This is for future ACL/Policy work.
I dont have any particular opinion if it should be:
Email makes the most sense to me.
Another thought on the level of effort to put into this migration,
It is currently "quite broken" or at least insecure as we have established since it is vulnerable to these kind of email/username takeover attacks.
Based on this, I think we can make the argument that by using the current setup, you are equally at risk compared to a "inperfect" migration strategy. I feel like we can make the argument for using the Username based migration given that admins already trust what is fed into the username currently, and that should be sufficient to trust it to do the migration.
With the addition that strip_email_domain
has to be reintroduced so it can be matched exactly like we currently do.
It sounds to me like the ideal scenario is to get this fixed, and maybe the best is to keep the simple migration, then tell all your users to login again and disable the migration path after it has been completed.
That is find I think, an important goal with this work tho (eliminating
strip_email_domain
) is to ensure all users "DisplayName" which is used inheadscale nodes list
for owners, inheadscale users
and in the Policy file must include an@
. This is for future ACL/Policy work.
The claim stability and uniqueness constraints apply here too. If an OIDC account is renamed or changes email address in the IdP, Headscale needs to be able to track that and continue to apply the same policies.
There's nothing stopping Headscale recording or consuming attributes like email address and display name in appropriate contexts (like user lists). However, it should not rely on those attributes to be stable, or use them to apply access controls.
According to Tailscale's ACL policy docs and grant docs, users can be referenced by:
username@example.com
username@github
username@passkey
I couldn't find anywhere in the docs that mentions if you can change your email address on an account, or the name of a passkey.
However, you can rename your GitHub account. I haven't tested whether someone taking over a renamed account could be used as an account take-over vector, or if that would just create a new Tailscale account (keyed by sub
) and then apply the wrong ACLs/grants to anything using the old name (much like many other references in GitHub).
Tailscale's Entra SCIM sync and Okta SCIM provisioning docs suggest that Tailscale maps SSO users by email address (rather than sub
).
Other parts of Tailscale's docs state that SCIM groups can be referenced by name or ID. They also state that if you rename a group referenced by name, then those references in ACLs or grants will break. This suggests that they're not internally converted to IDs (though this may not be possible, depending on how this is all provisioned).
All of this seems to suggest that Tailscale does not handle user (or group) renames over OIDC or SCIM correctly, though I don't have a Tailscale Enterprise account to verify this.
Headscale should support referencing:
uint64
), eg: 1234@headscale
iss
+ sub
, eg: idm.example.com/97fb4c49-c3b2-4e13-a8b9-cbe4d6525dc5@oidc
That way it wouldn't matter if an OIDC user's account was renamed, changed email address, or even had an email address set – the iss
+ sub
would remain the same.
Also, Headscale would need to:
@headscale
or @oidc
) only with that provider, rather than checking it as an username or email addressIt could also convert any references to an email address in ACLs/policies to a local user ID or iss
+sub
, and then emit policy which only contains one of those. It may be better to use the external identifier where available, because then it may be easier to use with configuration management tools that'll be working with IdP identifiers.
Based on this, I think we can make the argument that by using the current setup, you are equally at risk compared to a "inperfect" migration strategy. I feel like we can make the argument for using the Username based migration given that admins already trust what is fed into the username currently, and that should be sufficient to trust it to do the migration.
Yeah, I don't like it either... but I don't see any other way to let you automatically migrate a pre-#2020 auth database to sub
without re-introducing the same security risks that were there before.
It sounds to me like the ideal scenario is to get this fixed, and maybe the best is to keep the simple migration, then tell all your users to login again and disable the migration path after it has been completed.
It's important to clearly communicate what's happened, give administrators options for remediation, and they can make a choice of what to do next based on their appetite for risk (whether they attempt to auto-migrate, or they want to audit and patch the user database to fix the problems).
If you have a single-domain configuration (or multi-domain with strip_email_domain: false
), and your IdP does not allow user-triggered email address changes, and no users have changed email address, then the bug probably doesn't affect you.
We can't know all of these things for sure, so the "do nothing" option should fail "safe" – that is, no migration at all, only using sub
. That way, the bug is definitely fixed, and an administrator must take explicit action to either potentially re-introduce a vulnerable configuration, or to manually migrate users using known-safe data.
Or maybe there could be an oidc.configuration_version: 2
attribute that you need to set to be able to use OIDC in the new version, and it crashes on start-up (with a reference to the docs about this issue) if that's not set and OIDC support is enabled.
There should be some way for an admin to know which users are pending migration, and how many. That way they know when they can safely turn it off, see how remediation is tracking and who they need to follow up.
ok, I this is how I will approach this based on the information out of hand. I will implement this will a tradeoff between usability, putting constraints on the OIDC operator and documentation:
iss
+ sub
.In the Policy, email will be the identifier for a user. For non OIDC implementations, it will be username@
To get away from the current insecure implementation, we will implement the migration that works as implemented in this PR:
strip_email_domain
so we can correctly match the usernames mirroring the current behaviourI deem this acceptable as a migration because it is no more insecure than the current implementation, and the trade off of getting off this implementation is more important than making it perfect.
It will be possible to opt-out of the migration, but it will be default true
for one version (0.24.0),
then it will be default false
for 0.25.0, and removed completely in 0.26.0 (including the strip email functionality).
New columns can be added to headscale users list
with Email, which will indicated if a user has been migrated so an Admin can track it.
We will document a recommendation of turning off the migration when all users are migrated.
Considering the high level of configurability of OIDC, providers etc, it is not an option for us to implement the vast level of configurability. It is not an option to sacrifice the usability of the policy, and headscale by forcing users to use the full identifier in Policy. We are therefore happy with requiring certain behaviour from the OIDC. If an OIDC cannot be configured to provide an email, then those OIDCs will not be supported.
I think this tradeoff will cover the concerns raised:
We are balancing the user impact and user experience and allow administrators to manage their own risk appetite. It balance the timeline by choosing a simpler migration that is attainable now and will start getting people over to the correct solution.
Does this sound like it should address the concerns @micolous?
Oops, I accidentally pressed "Comment" before I was finished. I will try again soon. 😄
Comments inline, but I've re-ordered some parts to make it a little easier to follow.
OIDC implementation
- The identifier used in the database to identify a user will be
iss
+sub
.- pref username, display name, picture etc will be updated on all logins
Agreed.
- An admin command for manually updating the Email will be implemented
This is a good idea to support non-OIDC logins.
- We will make email required from OIDC
I couldn't find anywhere that Headscale sends emails, so it should not require an email address, except for migrations.
Non-OIDC users are already identified by username@
.
- Email will not be updated beyond the first login, this will be documented
The whole point of having an identity provider is that it is a single source of truth, and that all attributes are updated automatically in all applications, eliminating the administrative burden of maintaining identities across many applications for many users.
- Email will have to be unique, this will be documented
This is akin to requiring email
be a stable and unique claim – exactly the issue we're trying to avoid.
The fundamental issue is that a person's email address can always change. OIDC at least gives a way for an identity provider to signal to an application that it is the same user (sub
) regardless, and it's up to an application to handle that correctly.
An email address could change for many reasons:
(user + service)
gets its own email address (eg: Apple does this). This could even change over time, such as to responding to a specific service's user list being leaked publicly.Preventing email addreesses from being updated (or making it difficult) doesn't make the issue go away – if anything, it makes things worse.
Considering the high level of configurability of OIDC, providers etc, it is not an option for us to implement the vast level of configurability. It is not an option to sacrifice the usability of the policy, and headscale by forcing users to use the full identifier in Policy. We are therefore happy with requiring certain behaviour from the OIDC.
I acknowledge that using a sub
isn't the friendliest thing to use in policy – but it's also the only thing that's actually stable, which means it's also the only thing that's actually secure.
I also acknowledge that this stems from a Tailscale design flaw, but there is some latitude for Headscale to correct this issue, as it already does for the local case (eg: username@
).
If non-OIDC users can be renamed, it'll have to be handled there too anyway.
Headscale could support unstable identifiers (like email
or preferred_username
) and translate them into stable ones based on its user database, throwing an error if none match or there is more than one match.
This would make an email
or preferred_username
change automatically propagate to any existing policy without the user having to be aware of it, and uses a familiar paradigm in a fairly-safe way.
Local filesystems on UNIX-like systems already do something similar – if you chown user file
, file
's owner will be set to user
's UID at the time of execution, not their username. If user
is later renamed, the UID stays the same, and the file's effective ownership will be retained. When you run ls -l file
, it translates the UID back to user
's new username. Only things which reference user
by username (of which there are many) break.
My understanding is that Headscale doesn't even support these policies today. While it's important to acknowledge these potential later issues, this can be a later problem. 😄
Migration
To get away from the current insecure implementation, we will implement the migration that works as implemented in this PR:
- Re-add
strip_email_domain
so we can correctly match the usernames mirroring the current behaviour
Sounds good.
- If the user has already been migrated, create a new user.
Assuming that on first login for migrating users, the user's full email address is recorded in Headscale's user database (ie: fixing strip_email_domain: true
); the way we know an account is already migrated is if:
An account has already been migrated with the same full email address in Headscale's user database.
strip_email_domain: true
, and an account with the same "user" part has already been migrated (eg: user@example.com
and user@example.net
)
If both cases are handled as "create new account":
If a user already exists with the same full email address: then it suggests two IdP accounts could have the same email address now, or an email may be reassigned to a different user at some point in time.
A mis-matched domain with strip_email_domain: true
suggests that two users could have been sharing the same Headscale account pre-migration.
This could be flagged for the administrator to take action.
Automatically locking both accounts could be used as a denial of service vector, if Headscale is configured to use a public identity provider which erroneously allows public logins.
That all said, creating a new user is still a reasonable answer. It just contradicts the prior stated constraints.
I deem this acceptable as a migration because it is no more insecure than the current implementation, and the trade off of getting off this implementation is more important than making it perfect.
I agree with you it's important to move forward in some way on this, and that there is no perfect solution.
It's important that an administrator can see what the automated migration process has done, any exceptional circumstances it has detected, and manually reassign things if needed.
It will be possible to opt-out of the migration, but it will be default
true
for one version (0.24.0), then it will be defaultfalse
for 0.25.0, and removed completely in 0.26.0 (including the strip email functionality).We will document a recommendation of turning off the migration when all users are migrated.
What happens if an administrator skips v0.24.0 or v0.25.0, and there are other critical bug/security fixes in the interim?
As much as I dislike having the option around any longer than needed, I think migration should be decoupled from point releases, and be "opt in":
The risk with "opt in" migration is that it'll block OIDC logins for existing users until the administrator changes a flag.
But this means that an administrator is definitely aware a migration is taking place, and they are empowered to choose whether they want to migrate using an automatic-but-imperfect process (probably majority) or manually with verified data sources (probably a minority).
It also means that new installations don't require email addresses at all, and won't have this issue from day 1.
The risk with "opt out" migration is that it's possible for the migration to take place without the administrator even being aware of it.
Then "suddenly" some point release later, any user who did not log in while the flag was default-enabled will be locked out.
"Opt in" is the only option which empowers administrators to make a choice, because "opt out" is not consent.
New columns can be added to
headscale users list
with Email, which will indicated if a user has been migrated so an Admin can track it.
Sounds good.
So in summary, I'd suggest something similar to what you had, with a few tweaks:
iss
+ sub
.
email_verified != true
, then Headscale disregards any email
claim.email
claim (and scope):
iss + sub
strip_email_address: true
, the "user" part must also be unique until the account is migrated.email
scope.email
claim is present, it will always update Headscale's user database when changed.Other claims should be update Headscale's database when changed, and are not treated as unique:
preferred_username
name
, the user's preferred name for display in UI elementspicture
, the user's preferred avatar for display in UI elementssub + iss
):
email
, preferred_username
):
Same as you had it, but only "opt in" migration (for reasons mentioned in my previous comment).
This implementation and migration strategy will:
[^1]: when auto-migration is disabled
@micolous I've pushed the new migration, probably need some tidying but please take a look.
https://github.com/juanfont/headscale/pull/2170#issuecomment-2404643862, but only "opt in" migration (for reasons mentioned in my previous comment).
There will have to be one version with it opt out so people who dont read changelogs and then have a bunch of users created run down the issue tracker. A compromise could be a check at start up which sees if all users have oidc identity and automatically turns it off.
I will look at the Policy resolving after i get back from a coffee, it needs to go in at the same time I believe.
@micolous Can you take a look now?
I think I have implemented the migration and info as discussed in this PR.
Then I have split "resolve" users from ACL into https://github.com/juanfont/headscale/pull/2205 and I would appreciate if you can take a look that it covers the concerns.
Thinking about things some more, enabling migration should have no effect on the security of new Headscale installations where every user has an iss
and sub
recorded. I like the idea of making you choose – but I can also see the support angle for this.
I still need to have a look at the code, but I've attempted to reword the change log (and ended up spending quite a long time on it). What's there ("domain is always part of the username for OIDC") is not always true, and some bits are misleading.
I think writing the change log first will make sure we agree on the goal. 😄
I suspect #2205 will block some of what goes into this changelog – but I'll focus on OIDC for now.
Headscale v0.23.0 and earlier identified OIDC users by the "username" part of their email address (when strip_email_domain: true
, the default) or whole email address (when strip_email_domain: false
).
Depending on how Headscale and your Identity Provider (IdP) were configured, only using the email
claim could allow a malicious user with an IdP account to take over another Headscale user's account, even when strip_email_domain: false
.
This would also cause a user to lose access to their Headscale account if they changed their email address.
Headscale v0.24.0 now identifies OIDC users by the iss
and sub
claims. These are guaranteed by the OIDC specification to be stable and unique, even if a user changes email address. A well-designed IdP will typically set sub
to an opaque identifier like a UUID or numeric ID, which has no relation to the user's name or email address.
This issue only affects Headscale installations which authenticate with OIDC.
Headscale v0.24.0 and later will also automatically update profile fields with OIDC data on login. This means that users can change those details in your IdP, and have it populate to Headscale automatically the next time they log in. However, this may affect the way you reference users in policies.
Headscale v0.23.0 and earlier never recorded the iss
and sub
fields, so all legacy (existing) OIDC accounts from need to be migrated to be properly secured.
Headscale v0.24.0 has an automatic migration feature, which is enabled by default (map_legacy_users: true
). This will be disabled by default in a future version of Headscale – any unmigrated users will get new accounts.
Headscale v0.24.0 will ignore any email
claim if the IdP does not provide an email_verified
claim set to true
. What "verified" actually means is contextually dependent – Headscale uses it as a signal that the contents of the email
claim is reasonably trustworthy.
Headscale v0.23.0 and earlier never checked the email_verified
claim. This means even if an IdP explicitly indicated to Headscale that its email
claim was untrustworthy, Headscale would have still accepted it.
When automatic migration is enabled (map_legacy_users: true
), Headscale will first match an OIDC account to a Headscale account by iss
and sub
, and then fall back to matching OIDC users similarly to how Headscale v0.23.0 did:
strip_email_domain: true
(the default): the Headscale username matches the "username" part of their email address.strip_email_domain: false
: the Headscale username matches the whole email address.On migration, Headscale will change the account's username to their preferred_username
. This could break any ACLs or policies which are configured to match by username.
Like with Headscale v0.23.0 and earlier, this migration only works for users who haven't changed their email address since their last Headscale login.
A successful automated migration should otherwise be transparent to users.
Once a Headscale account has been migrated, it will be unavailable to be matched by the legacy process. An OIDC login with a matching username but non-matching iss
and sub
will instead get a new Headscale account.
Because of the way OIDC works, Headscale's automated migration process can only work when a user tries to log in after the update. Mass updates would require Headscale implement a protocol like SCIM, which is extremely complicated and not available in all identity providers.
Administrators could also attempt to migrate users manually by editing the database, using their own mapping rules with known-good data sources.
Legacy account migration should have no effect on new installations where all users have a recorded sub
and iss
.
When automatic migration is disabled (map_legacy_users: false
), Headscale will only try to match an OIDC account to a Headscale account by iss
and sub
.
If there is no match, it will get a new Headscale account – even if there was a legacy account which could have matched and migrated.
We recommend new Headscale users explicitly disable automatic migration – but it should otherwise have no effect if every account has a recorded iss
and sub
.
When automatic migration is disabled, the strip_email_domain
setting will have no effect.
Headscale now uses the standard OIDC claims to populate and update user information every time they log in:
Headscale profile field | OIDC claim | Notes / examples |
---|---|---|
email address | email |
Only used when "email_verified": true |
display name | name |
eg: Sam Smith |
username | preferred_username |
Varies depending on IdP and configuration, eg: ssmith , ssmith@idp.example.com , \\example.com\ssmith |
profile picture | picture |
URL to a profile picture or avatar |
These should show up nicely in the Tailscale client.
This will also affect the way you reference users in policies.
Thinking about things some more, enabling migration should have no effect on the security of new Headscale installations where every user has an
iss
andsub
recorded. I like the idea of making you choose – but I can also see the support angle for this.
That makes sense.
I still need to have a look at the code, but I've attempted to reword the change log (and ended up spending quite a long time on it). What's there ("domain is always part of the username for OIDC") is not always true, and some bits are misleading.
I think writing the change log first will make sure we agree on the goal. 😄
I've included your well-written changes, thank you, I think it covers it and what I have done, we can review and comment on that as part of this PR.
I suspect #2205 will block some of what goes into this changelog – but I'll focus on OIDC for now.
Lets merge this first, then we adopt it to #2205 as part of the process there, while we are improving that in parallel.
Thank you, and let me know how the code review goes.
OIDC users by iss + sub, eg: idm.example.com/97fb4c49-c3b2-4e13-a8b9-cbe4d6525dc5@oidc
Do you plan to implement support for using multiple OIDC issuers (which practically means multiple OIDC providers) simultaneously? If not, then iss
is unnecessary and just makes the user identifier impractically long (e.g. for using in ACLs).
Please let the administrator decide what they want to use as the user identifier. Now I mapped the username (centrally managed and so unique within the organisation) to sub
(in Keycloak), so I can use normal user identification in the ACLs (just to find out that it still uses email, but I assume that it’s not the final design). It would be even better to make the claim used as the user identifier configurable (for admins who don’t control their IdP). It’s the admin’s responsibility to choose an appropriate identifier (which depends on the organisation’s policy) and make sure that it’s unique and immutable.
BTW, why have an email address in the first place? And why email verification? Is it just for the migration process from the previous OIDC implementation? Headscale just needs some user identifier. Maybe the user (full) name (in a free form) for a nicer UI. Collecting and storing personal data is regulated by law (at least in the EU), so it might even be problematic for some users.
In my opinion multiple OIDC provider will be a must in the short time, considering how zero-secrets workload identification is evolving, for example with GitHub Actions OIDC provider.
Please let the administrator decide what they want to use as the user identifier. Now I mapped the username (centrally managed and so unique within the organisation) to
sub
(in Keycloak), so I can use normal user identification in the ACLs (just to find out that it still uses email, but I assume that it’s not the final design).
@jirutka Your installation now makes it impossible for username and email changes to take effect – exactly the issue we're trying to fix.
Name changes happen, and failure to handle that properly is an ethical problem.
As an anecdote: I know of a very large technology company where username changes were possible, but caused you to be locked out of all systems for 1 - 2 weeks while the change propagated to all systems, often through manual intervention. People would plan to change teams (which would result in ACL reassignments anyway) or take vacation when they wanted to change username.
This was because many of the company's systems primarily worked with usernames, email addresses or UPNs, rather than a sub
-like identifier.
It would be even better to make the claim used as the user identifier configurable (for admins who don’t control their IdP). It’s the admin’s responsibility to choose an appropriate identifier (which depends on the organisation’s policy) and make sure that it’s unique and immutable.
With respect: administrators consistently shoot themselves in the foot on this one. Mapping username to sub
is a great example of that.
The OIDC specification is very clear about what is and isn't a stable identifier. It also states what a sub
should look like:
REQUIRED. Subject Identifier. A locally unique and never reassigned identifier within the Issuer for the End-User, which is intended to be consumed by the Client, e.g.,
24400320
orAItOawmwtWwcT0k51BayewNvutrJUqsvl6qs7A4
. It MUST NOT exceed 255 ASCII [RFC20] characters in length. The sub value is a case-sensitive string.
If your sub
looks like jirutka
or micolous
, then you're doing it wrong.
Instead, lets focus on the goal: to be able to assign policies based on a familiar identifier.
I already proposed a mechanism by which an administrator can continue to reference a user by a familiar-yet unstable identifier, and have it automatically converted into a policy which uses a stable identifier.
That will require some more work to make it happen, which won't land in this PR.
Headscale's previous use of the email
claim mimicked Tailscale's design. Unfortunately, Tailscale's design does not appear to consider this issue.
It will take many steps to fully unravel and address this design problem in a safe and secure way. 😄
In my opinion multiple OIDC provider will be a must in the short time, considering how zero-secrets workload identification is evolving, for example with GitHub Actions OIDC provider.
While multiple OIDC providers probably won't be implemented in the short term, I agree that it's important to consider this in the present design, so that we don't need another migration process to unlock that functionality. 😄
With User.Name no longer being a unique constraint (which is good!), there are a number of other issues this opens up:
I believe all of these have been addressed now. They will still resolve from username to an ID, but fail if there is more than one user.
I have filed https://github.com/juanfont/headscale/issues/2246 as a followup to add ID flags to all CLI commands as this PR is already big enough.
This commit hardens the migration part of the OIDC from the old username based approach to the new sub based approach and makes it possible for the operator to opt out entirely.
Fixes #1990