canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Auth: OIDC login flow fails in an LXD cluster if different members handle different stages of the flow #13644

Open mas-who opened 1 week ago

mas-who commented 1 week ago

Issue description

The OIDC login will fail in an LXD cluster if different members handle different stages of the flow. This is likely to occur if a load balancer is deployed in front of the cluster.

When a user initiates the login flow, the request reaches /oidc/login, and a state parameter is set in the redirect to the IdP's auth endpoint. After the user authenticates successfully, the IdP redirects the request to the /oidc/callback endpoint, where the state parameter must be matched (data can be extracted from the state parameter). When load balancing is active, the cluster member that receives the /oidc/login request is likely to differ from the member that receives the /oidc/callback request. The state then fails to match, which causes the login flow to fail.
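
The mismatch described above can be illustrated with a minimal Go sketch. It assumes, purely for illustration, that each member protects the login state with a key that only it holds. The `member` type and its `signState` and `verifyState` methods are hypothetical and are not LXD's actual code.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// member is a hypothetical stand-in for one cluster member. The assumption
// is that each member signs login state with a key only it knows.
type member struct {
	name string
	key  []byte
}

// newMember creates a member with a fresh random 32-byte key.
func newMember(name string) (*member, error) {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return &member{name: name, key: key}, nil
}

// signState returns "<state>.<mac>", both parts base64url encoded.
func (m *member) signState(state string) string {
	mac := hmac.New(sha256.New, m.key)
	mac.Write([]byte(state))
	enc := base64.RawURLEncoding
	return enc.EncodeToString([]byte(state)) + "." + enc.EncodeToString(mac.Sum(nil))
}

// verifyState recomputes the MAC with this member's own key and
// rejects the value if it does not match.
func (m *member) verifyState(signed string) (string, error) {
	statePart, macPart, ok := strings.Cut(signed, ".")
	if !ok {
		return "", errors.New("malformed state")
	}
	enc := base64.RawURLEncoding
	state, err := enc.DecodeString(statePart)
	if err != nil {
		return "", err
	}
	got, err := enc.DecodeString(macPart)
	if err != nil {
		return "", err
	}
	mac := hmac.New(sha256.New, m.key)
	mac.Write(state)
	if !hmac.Equal(got, mac.Sum(nil)) {
		return "", errors.New("state is not valid for this member")
	}
	return string(state), nil
}

func main() {
	node1, _ := newMember("lxd-node-1")
	node2, _ := newMember("lxd-node-2")

	// node1 handles /oidc/login and issues the state.
	signed := node1.signState("login-attempt-1")

	// The callback succeeds on node1 but fails on node2.
	_, err1 := node1.verifyState(signed)
	_, err2 := node2.verifyState(signed)
	fmt.Println("callback on node1:", err1)
	fmt.Println("callback on node2:", err2)
}
```

Under that assumption, any callback that lands on a member other than the one that issued the state is rejected, regardless of whether the user authenticated correctly.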

Some potential ideas on how to resolve this issue:

  1. Store the auth state in dqlite so that we can always read out this data when processing a request to the /oidc/callback endpoint.
  2. Embed the member address in the state parameter so that we can extract this information in the /oidc/callback endpoint. If the extracted member address is not the same as the current member address, then we can forward the request to the correct member based on the extracted info.
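
The second idea could look roughly like the following Go sketch. The `loginState` type, the `encodeState`, `decodeState`, and `forwardTarget` helpers, and the example addresses are all hypothetical names chosen for illustration; they are not taken from the LXD code base.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// loginState is a hypothetical payload carried in the OIDC state parameter.
type loginState struct {
	Member string `json:"member"` // address of the member that started the flow
	Nonce  string `json:"nonce"`  // per-login random value
}

// encodeState serializes the payload into a URL-safe string.
func encodeState(s loginState) (string, error) {
	raw, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// decodeState reverses encodeState.
func decodeState(param string) (loginState, error) {
	var s loginState
	raw, err := base64.RawURLEncoding.DecodeString(param)
	if err != nil {
		return s, err
	}
	err = json.Unmarshal(raw, &s)
	return s, err
}

// forwardTarget returns the address the callback should be forwarded to,
// or "" when the local member started the flow and can handle it itself.
func forwardTarget(param, localAddr string) (string, error) {
	s, err := decodeState(param)
	if err != nil {
		return "", err
	}
	if s.Member == localAddr {
		return "", nil
	}
	return s.Member, nil
}

func main() {
	// Example addresses are made up for this sketch.
	param, err := encodeState(loginState{Member: "10.0.0.1:8443", Nonce: "example-nonce"})
	if err != nil {
		panic(err)
	}
	target, err := forwardTarget(param, "10.0.0.2:8443")
	fmt.Println("forward to:", target, "error:", err)
}
```

Because the state parameter travels through the browser, a real implementation would presumably need to authenticate it and check the extracted address against the known cluster members before forwarding anything; the sketch leaves that out.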

Steps to reproduce

Test with LXD cluster:

  1. Start the OIDC flow with a proxy in front of the cluster, pointing to lxd-node-1.
  2. While on the Okta login page, change the proxy to point to lxd-node-2 of the same cluster.
  3. Continue the login process.
  4. On hitting /oidc/callback on lxd-node-2, I get the error message: "failed to get state: securecookie: the value is not valid"

tomponline commented 1 week ago

@markylaing when you get a moment I'd appreciate your thoughts on how we could approach this one.

Could we encode (& encrypt) the necessary info in the cookie/return URL of the flow such that it's available to the other cluster member?
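
As a rough sketch of what encoding and encrypting the info could mean, the Go example below seals a payload with AES-GCM from the standard library. It assumes a key shared by all cluster members, which is an assumption of this sketch and not something the thread specifies; `sealState` and `openState` are hypothetical helper names.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16, 24, or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealState encrypts and authenticates plaintext, returning a URL-safe token
// laid out as nonce || ciphertext.
func sealState(key, plaintext []byte) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Passing nonce as dst makes Seal append the ciphertext after it.
	out := gcm.Seal(nonce, nonce, plaintext, nil)
	return base64.RawURLEncoding.EncodeToString(out), nil
}

// openState decrypts a token produced by sealState with the same key.
func openState(key []byte, token string) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	raw, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return nil, err
	}
	if len(raw) < gcm.NonceSize() {
		return nil, errors.New("token too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	// Assumed cluster-wide key; how members would share it is out of scope here.
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	token, err := sealState(key, []byte("login state for the callback"))
	if err != nil {
		panic(err)
	}
	plain, err := openState(key, token)
	fmt.Println(string(plain), err)
}
```

With a shared key, any member that receives the callback could open the token, while a member holding a different key would get an authentication error.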

markylaing commented 1 week ago

> @markylaing when you get a moment I'd appreciate your thoughts on how we could approach this one.
>
> Could we encode (& encrypt) the necessary info in the cookie/return URL of the flow such that it's available to the other cluster member?

I think @mas-who's suggestion of encoding it in the state parameter sounds reasonable. That said, we could also recommend a load-balancer configuration in our documentation, e.g. enabling sticky sessions.

tomponline commented 1 week ago

> I think @mas-who's suggestion of encoding it in the state parameter sounds reasonable.

Sounds good, thanks!