OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

SAML SSO on Liberty doesn't tolerate affinity break #28211

Open e30532 opened 6 months ago

e30532 commented 6 months ago

Note: I was requested to open this issue after the discussion with a WebSphere architect.

In the following scenario, SAML SSO works as expected if the ACS request (/ibm/saml20/defaultSP/acs) and the redirected service request are handled by the same Liberty instance. That is not guaranteed, however, because session affinity sometimes breaks. If the two requests are handled by different Liberty servers, the cache entry for the SAML cookie does not exist on the server that receives the redirected request, which prevents further SAML SSO processing.

  1. An unauthenticated client sends an initial request to a service URL.
  2. The SAML TAI redirects the request to the IdP.
  3. The user logs in at the IdP.
  4. After the login, the request is redirected to the ACS endpoint. ACS creates a cache entry for the SAML cookie and redirects the request to the service URL.
  5. The redirected request checks whether the cache entry exists. If it does not, SAML SSO is not initiated. The ACS request (/ibm/saml20/defaultSP/acs) and the redirected service request are supposed to be handled by the same Liberty instance, but that is not guaranteed because session affinity sometimes breaks.
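
The failure in step 5 can be modeled with a minimal sketch, assuming each Liberty instance keeps its own in-memory cache keyed by the ACS cookie value. The names `LibertyNode`, `handle_acs`, and `handle_service_request` are hypothetical and are not Liberty APIs; the sketch only mirrors the kind of lookup that the traces report as `isAcsCookieInCache`.

```python
import secrets


class LibertyNode:
    """Hypothetical stand-in for one Liberty instance with a node-local cache."""

    def __init__(self, name):
        self.name = name
        self.acs_cache = {}  # cookie value -> SAML token, visible to this node only

    def handle_acs(self, saml_token):
        """Step 4: cache the token locally and return the cookie value for the redirect."""
        cookie_value = secrets.token_urlsafe(16)
        self.acs_cache[cookie_value] = saml_token
        return cookie_value

    def handle_service_request(self, cookie_value):
        """Step 5: SSO can continue only if this node's cache holds the cookie value."""
        return cookie_value in self.acs_cache


node_a = LibertyNode("A")
node_b = LibertyNode("B")

cookie = node_a.handle_acs("<saml-token>")

# Affinity preserved: the redirect lands on the node that handled ACS.
same_node = node_a.handle_service_request(cookie)
# Affinity broken: the redirect lands on a different node.
other_node = node_b.handle_service_request(cookie)

print(same_node, other_node)  # True False
```

The second lookup fails only because the entry was written to a different node's cache, which is the situation described above.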

Diagnostic information:

Note: In the case of tWAS, an LTPA token is generated for the ACS request, as described at the link below. LTPA-based SSO can therefore be initiated for the redirected service request. https://www.ibm.com/docs/en/was-nd/9.0.5?topic=sign-saml-single-scenarios-features-limitations

The diagnostic trace is available in the internal thread below. https://ibm-cloud.slack.com/archives/C324NP6H5/p1713428806051459?thread_ts=1713172370.822399&cid=C324NP6H5

./Liberty_NG/keycloak/logs/trace.log

[4/18/24 0:45:29:232 PDT] 00000075 SAMLResponseT >  handledWithCookie Entry  
[4/18/24 0:45:29:232 PDT] 00000075 RequestUtil   >  getAcsCookieValueFromRequest Entry  
[4/18/24 0:45:29:235 PDT] 00000075 RequestUtil   >  isAcsCookieInCache Entry  
[4/18/24 0:45:29:235 PDT] 00000075 RequestUtil   <  isAcsCookieInCache Exit  
                                 false
[4/18/24 0:45:29:235 PDT] 00000075 RequestUtil   <  isUnprocessedAcsCookiePresent Exit  
                                 false
[4/18/24 0:45:29:235 PDT] 00000075 SAMLResponseT <  handledWithCookie Exit  
                                 false
[4/18/24 0:45:29:235 PDT] 00000075 SAMLResponseT <  isTargetInterceptor Exit  
                                 false
[4/18/24 0:45:29:235 PDT] 00000075 TAIAuthentica 3   TAI authenticator before SSO do not intercept this request
[4/18/24 0:45:29:235 PDT] 00000075 TAIAuthentica <  authenticate Exit  
                                 AuthenticationResult status=CONTINUE

./Liberty_OK/keycloak/logs/trace.log

[4/18/24 1:11:44:573 PDT] 00000084 HttpRequestMe 1   setRequestURL input [/ibm/saml20/defaultSP/acs]
[4/18/24 1:11:50:478 PDT] 00000084 Cache         >  put Entry  
                                 6VjdavIzxV0IiCRXTDVOM8KGgKbnYrzA=SamlRequest [ Saml20Token:Saml20Token
[4/18/24 1:11:50:478 PDT] 00000084 Cache         <  put Exit 
[4/18/24 1:11:50:482 PDT] 00000084 HttpResponseM 1   Marshalling first line: HTTP/1.1 302 Found

[4/18/24 1:11:50:508 PDT] 00000082 HttpRequestMe 1   setRequestURL input [/SimpleSecureWeb/SimpleServlet]

[4/18/24 1:11:50:531 PDT] 00000082 SAMLResponseT >  handledWithCookie Entry 
[4/18/24 1:11:50:531 PDT] 00000082 RequestUtil   >  getAcsCookieValueFromRequest Entry  
[4/18/24 1:11:50:534 PDT] 00000082 RequestUtil   >  isAcsCookieInCache Entry  
[4/18/24 1:11:50:534 PDT] 00000082 Cache         >  get Entry  
                                 <sensitive java.lang.String@554602ee>
[4/18/24 1:11:50:534 PDT] 00000082 Cache         <  get Exit  
                                 SamlRequest [ Saml20Token:Saml20Token
[4/18/24 1:11:50:535 PDT] 00000082 RequestUtil   <  isAcsCookieInCache Exit  
                                 true
[4/18/24 1:11:50:535 PDT] 00000082 HttpBaseMessa 3   Found WASSamlACS_n1502129058 in cache
[4/18/24 1:11:50:535 PDT] 00000082 SAMLResponseT <  handledWithCookie Exit  
                                 true
[4/18/24 1:11:50:535 PDT] 00000082 SAMLResponseT <  isTargetInterceptor Exit  
                                 true
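
To compare the two traces quickly, the exit value of a given method can be pulled out with a small script. This is a rough sketch that assumes the layout visible in the excerpts above: a `<` marker followed by the method name and `Exit`, with the return value on the next indented line. The function name `exit_values` is hypothetical, and the sample combines lines copied from both trace files.

```python
import re

# Matches lines such as:
# [4/18/24 0:45:29:235 PDT] 00000075 RequestUtil   <  isAcsCookieInCache Exit
EXIT_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<thread>[0-9a-fA-F]{8})\s+"
    r"(?P<component>\S+)\s+<\s+(?P<method>\S+)\s+Exit\s*$"
)


def exit_values(trace_text, method):
    """Return (timestamp, thread, value) for each Exit record of `method`."""
    lines = trace_text.splitlines()
    results = []
    for index, line in enumerate(lines):
        match = EXIT_LINE.match(line)
        if not match or match.group("method") != method:
            continue
        # The return value, when traced, sits on the following indented line.
        value = None
        if index + 1 < len(lines):
            following = lines[index + 1]
            if following.strip() and not following.lstrip().startswith("["):
                value = following.strip()
        results.append((match.group("timestamp"), match.group("thread"), value))
    return results


sample = """\
[4/18/24 0:45:29:235 PDT] 00000075 RequestUtil   >  isAcsCookieInCache Entry
[4/18/24 0:45:29:235 PDT] 00000075 RequestUtil   <  isAcsCookieInCache Exit
                                 false
[4/18/24 1:11:50:535 PDT] 00000082 RequestUtil   <  isAcsCookieInCache Exit
                                 true
"""

for record in exit_values(sample, "isAcsCookieInCache"):
    print(record)
```

Run against the full trace.log files, the same call would show `false` for the failing server and `true` for the working one, matching the excerpts.
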

ayoho commented 6 months ago

Thanks, @e30532. We've been having a discussion on Slack about this and will have a look when we can.

ayoho commented 4 months ago

Hi, @e30532. After discussing with @arunavemulapalli, we believe this is not a release bug and is functioning as expected. There is no affinity requirement between steps 1-4 listed in your post; that requirement was tracked and removed under https://github.com/OpenLiberty/open-liberty/issues/11796. However, we do still have an affinity requirement between the ACS endpoint and the redirect back to the service URL (steps 4 and 5 in your post). The SAML token received from the IdP can be large, so we save it in a local cache before creating the authenticated subject, and therefore we have an affinity dependency there, just as you point out. We would track any work to change that behavior as a new feature request, and it would have to go through internal prioritization.
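
For illustration only, the dependency described here comes down to where the token is stored between steps 4 and 5. The sketch below is not Liberty code and not a committed design. `AcsHandler` is a hypothetical name, and a plain dict stands in for whatever store would be used. It contrasts a per-instance cache with a store that both instances can reach.

```python
class AcsHandler:
    """Hypothetical handler parameterized by where SAML tokens are kept."""

    def __init__(self, store):
        self.store = store  # any dict-like mapping: cookie value -> SAML token

    def save(self, cookie_value, saml_token):
        self.store[cookie_value] = saml_token

    def can_resume_sso(self, cookie_value):
        return cookie_value in self.store


# Behavior as described above: each instance owns its own cache.
local_a, local_b = AcsHandler({}), AcsHandler({})
local_a.save("cookie-1", "<saml-token>")
local_result = local_b.can_resume_sso("cookie-1")

# Hypothetical alternative: both instances point at one shared store.
shared = {}
shared_a, shared_b = AcsHandler(shared), AcsHandler(shared)
shared_a.save("cookie-1", "<saml-token>")
shared_result = shared_b.can_resume_sso("cookie-1")

print(local_result, shared_result)  # False True
```

A real shared store would add its own costs (network hops, serialization of a large token, expiry handling), which may be part of why such a change would be treated as a feature request rather than a fix.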