zmartzone / lua-resty-openidc

OpenID Connect Relying Party and OAuth 2.0 Resource Server implementation in Lua for NGINX / OpenResty
Apache License 2.0

Intermittent Failure to Reach OIDC Endpoints #448

Open barrelmaker97 opened 2 years ago

barrelmaker97 commented 2 years ago
Environment
Expected behaviour

When refreshing tokens, the OIDC library should be able to reach endpoints like the discovery endpoint or the token endpoint.

Actual behaviour

The OIDC library fails to reach the OIDC provider endpoints, with the error message "closed".

Minimized example

Example failure with discovery endpoint:

2022/08/27 00:45:47 [error] 1166#0: *399821 [lua] openidc.lua:577: openidc_discover(): accessing discovery url (https://axs.cluster-a.csd61.zone5/auth/realms/AXS/.well-known/openid-configuration) failed: closed, client: 10.128.0.2, server: kong, request: "PUT /media/v1/mediasets/M2-MTAuMTI5LjAuODgsMTAsNzUzMQ/junctions/3002/connections/1002/requestedstatus HTTP/1.1", host: "axs.cluster-a.csd61.zone5"
2022/08/27 00:45:47 [error] 1166#0: *399821 [lua] openidc.lua:1503: authenticate(): lost access token:accessing discovery url (https://axs.cluster-a.csd61.zone5/auth/realms/AXS/.well-known/openid-configuration) failed: closed, client: 10.128.0.2, server: kong, request: "PUT /media/v1/mediasets/M2-MTAuMTI5LjAuODgsMTAsNzUzMQ/junctions/3002/connections/1002/requestedstatus HTTP/1.1", host: "axs.cluster-a.csd61.zone5"
2022/08/27 00:45:47 [debug] 1166#0: *399821 [lua] openidc.lua:1511: authenticate(): session.present=true, session.data.id_token=true, session.data.authenticated=true, opts.force_reauthorize=nil, opts.renew_access_token_on_expiry=nil, try_to_renew=true, token_expired=true

Example failure with token endpoint:

lost access token:accessing token endpoint (https://axs.cluster-a.csd62.zone5/auth/realms/AXS/protocol/openid-connect/token) failed: closed

This usually occurs after a client has been logged in for a few days. During a token refresh, we see an error like the ones above. This leads to further errors in the OIDC flow, and the client is disconnected and redirected to the login page. Oddly, the logs from the API Gateway and Red Hat Single Sign-On show that the requests to the OIDC endpoints are being processed and answered with 200s. The "failed: closed" error seems to occur almost instantly after the request is made.
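To line these failures up against the API Gateway and Single Sign-On logs, it may help to pull the timestamp, worker PID, connection number, and endpoint URL out of each error line. Below is a minimal sketch, assuming the error log format shown in the examples above; the names `parse_failure` and `count_failures` are hypothetical helpers and not part of lua-resty-openidc or Kong.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
#   2022/08/27 00:45:47 [error] 1166#0: *399821 [lua] openidc.lua:577: ...
#   accessing discovery url (https://...) failed: closed, client: ...
FAILURE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] (?P<pid>\d+)#\d+: \*(?P<conn>\d+) "
    r".*?accessing (?P<endpoint>discovery url|token endpoint) "
    r"\((?P<url>[^)]+)\) failed: (?P<error>[^,\n]+)"
)


def parse_failure(line):
    """Return a dict describing an endpoint failure, or None if the line is not one."""
    m = FAILURE_RE.match(line)
    if m is None:
        return None
    return {
        "time": datetime.strptime(m["ts"], "%Y/%m/%d %H:%M:%S"),
        "pid": int(m["pid"]),    # NGINX worker process ID
        "conn": int(m["conn"]),  # NGINX connection number (the *NNN field)
        "endpoint": m["endpoint"],
        "url": m["url"],
        "error": m["error"].strip(),
    }


def count_failures(lines):
    """Count failures per (worker PID, endpoint, error).

    The same failure is logged twice (once by openidc_discover() and once by
    authenticate()), so records are de-duplicated on time, connection, and
    endpoint before counting.
    """
    seen = set()
    counts = Counter()
    for line in lines:
        rec = parse_failure(line)
        if rec is None:
            continue
        key = (rec["time"], rec["conn"], rec["endpoint"])
        if key in seen:
            continue
        seen.add(key)
        counts[(rec["pid"], rec["endpoint"], rec["error"])] += 1
    return counts
```

Passing an open error log file to `count_failures` gives a per-worker tally, which might show whether the "closed" errors cluster on one worker process or one endpoint.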

Configuration and NGINX server log files

We are using the Kong API Gateway, with requests load-balanced across 3 instances. Because the NGINX config is generated from the Kong configuration, I have pasted the relevant parts of that configuration below. Please let me know if you need more configuration information.

OIDC Config:

  recovery_page_path: 'https://axs.cluster-a.csd62.zone5'
  realm: AXS
  access_token_expires_leeway: 30
  discovery: >-
    https://axs.cluster-a.csd62.zone5/auth/realms/AXS/.well-known/openid-configuration
  introspection_endpoint_auth_method: client_secret_post
  scope: openid pca
  introspection_endpoint: >-
    https://axs.cluster-a.csd62.zone5/auth/realms/AXS/protocol/openid-connect/token/introspect
  logout_path: /logout
  redirect_after_logout_uri: >-
    https://axs.cluster-a.csd62.zone5/auth/realms/AXS/protocol/openid-connect/logout?redirect_uri=https://axs.cluster-a.csd62.zone5
  revoke_tokens_on_logout: true

Some NGINX directives

  nginx_worker_processes: "2"
  nginx_proxy_client_header_buffer_size: "16k"
  nginx_proxy_proxy_buffer_size: 8k
  nginx_proxy_proxy_busy_buffers_size: 8k
  nginx_proxy_set: "$session_secret <secret here>" 
  nginx_http_lua_code_cache: "on"
  nginx_http_lua_shared_dict: "discovery 1m"
  nginx_http_ssl_protocols: "TLSv1.2 TLSv1.3"

barrelmaker97 commented 2 years ago

Also, if there is a way to increase the amount of information we get about the failure to reach the endpoint, please let me know. We are already running with debug logging enabled.