wso2 / product-is

Welcome to the WSO2 Identity Server source code! For info on working with the WSO2 Identity Server repository and contributing code, click the link below.
http://wso2.github.io/
Apache License 2.0

Scaling and Performance recommendations for WSO2 Identity Server #15460

Closed ashensw closed 10 months ago

ashensw commented 1 year ago

Is your suggestion related to an experience? Please describe.

Scaling and performance recommendations for WSO2 Identity Server must be properly evaluated and documented against different requirements. We need to revisit the existing way of presenting performance numbers for IS and make the changes required to make them more informative for the following scenarios.

  1. Catering for peak traffic (high concurrency)
  2. Sustainable traffic
    • Benchmark to decide on horizontal scaling
    • Concurrency - number of users who can log in at a given time
    • Response time - time taken to perform a login
    • How many nodes are required to handle x concurrency within y seconds of added latency from the IS side? (TPS shouldn't be the main indicator in the output of the documents)
    • Load test script should reflect real-world scenarios (should account for user input delay)
    • Improve the performance stats representation to cater to the above scenarios.
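
The node-count question above can be answered as a lookup over published benchmark rows. The following is a minimal sketch, not part of the WSO2 tooling: the function name `nodes_required`, the row layout, and every number in `SAMPLE_BENCHMARKS` are illustrative placeholders rather than measured results. It also assumes that a deployment which meets the latency budget at a tested concurrency also meets it at any lower concurrency.

```python
from typing import Optional

# Placeholder rows for illustration only; these are NOT measured results.
# Each row: (nodes, concurrency tested, added latency in seconds).
SAMPLE_BENCHMARKS = [
    (2, 500, 0.4),
    (2, 1000, 1.2),
    (3, 1000, 0.7),
    (4, 1000, 0.5),
    (4, 2000, 1.1),
]


def nodes_required(benchmarks, concurrency: int, max_latency_s: float) -> Optional[int]:
    """Return the smallest node count that has a benchmark row covering the
    requested concurrency within the latency budget, or None if no row does."""
    candidates = [
        nodes
        for nodes, tested_concurrency, latency_s in benchmarks
        if tested_concurrency >= concurrency and latency_s <= max_latency_s
    ]
    return min(candidates) if candidates else None
```

Returning `None` rather than extrapolating keeps the answer tied to what was actually benchmarked; a request outside the tested range would need a new test run.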

Sachin-Mamoru commented 1 year ago

Competitive Analysis

Our initial step was conducting a competitive analysis [1], through which we identified the various aspects that require attention for scaling and recommending performance enhancements for the identity server.

We recognized that using more graphical representations would improve clarity, and that metrics such as CPU utilization could help identify bottlenecks. To do this, we can use the Apache JMeter dashboard [2] together with the Merge Results JMeter plugin [3].
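As a rough illustration of the aggregation behind a "Response Times Over Time" plot, the sketch below averages response times per time bucket from a CSV-format JMeter results file. It is not the JMeter dashboard's own implementation. It assumes the file has a header row with at least the `timeStamp` (epoch milliseconds) and `elapsed` (milliseconds) columns; the function name and the sample rows are made up for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def response_times_over_time(jtl_csv, bucket_seconds=60):
    """Average the `elapsed` column (ms) per time bucket.

    `jtl_csv` is any iterable of CSV lines (for example an open file).
    Returns a sorted list of (bucket_start_epoch_seconds, mean_elapsed_ms).
    """
    buckets = defaultdict(list)
    for row in csv.DictReader(jtl_csv):
        second = int(row["timeStamp"]) // 1000
        bucket_start = second - second % bucket_seconds
        buckets[bucket_start].append(int(row["elapsed"]))
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Made-up sample rows, only to show the expected input shape.
sample = io.StringIO(
    "timeStamp,elapsed,label,success\n"
    "1700000000000,120,login,true\n"
    "1700000030000,180,login,true\n"
    "1700000060000,300,login,true\n"
)
for bucket_start, avg_ms in response_times_over_time(sample):
    print(bucket_start, avg_ms)
```

The resulting series can be plotted on the same time axis as CPU utilization to see whether response time spikes coincide with saturation.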

Response Times Over Time - SAML2 SSO Redirect Binding [2 Node 2 Core Deployment]

image image

CPU Utilization Over Time - SAML2 SSO Redirect Binding [2 Node 2 Core Deployment]

The following information can be extracted using the AWS CLI [1][2].

IS Instance 01 - CPU utilization (%)

image

IS Instance 02 - CPU utilization (%)

image

DB Instance - CPU utilization (%)

image
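The CPU utilization series above could also be pulled programmatically. The sketch below uses boto3 (the AWS SDK for Python) in place of the AWS CLI mentioned in the thread, calling the CloudWatch `get_metric_statistics` operation. The helper names, the instance identifiers, and the time window are placeholders, and the fetch step needs boto3 plus valid AWS credentials, so only the query builder is exercised here.

```python
from datetime import datetime, timedelta, timezone


def cpu_query(namespace, dimension_name, dimension_value, start, end, period_s=300):
    """Build keyword arguments for CloudWatch GetMetricStatistics.

    period_s defaults to 300 because basic EC2 monitoring reports
    datapoints at 5-minute granularity.
    """
    return {
        "Namespace": namespace,
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": dimension_name, "Value": dimension_value}],
        "StartTime": start,
        "EndTime": end,
        "Period": period_s,
        "Statistics": ["Average"],
    }


def fetch_cpu(query):
    """Run the query and return (timestamp, average CPU %) pairs in time order.

    Requires boto3 and AWS credentials; not exercised in this sketch.
    """
    import boto3  # imported lazily so cpu_query works without boto3 installed

    response = boto3.client("cloudwatch").get_metric_statistics(**query)
    return sorted((p["Timestamp"], p["Average"]) for p in response["Datapoints"])


# Hypothetical one-hour test window and placeholder resource identifiers.
end = datetime(2023, 1, 1, 1, 0, tzinfo=timezone.utc)
start = end - timedelta(hours=1)
is_node_query = cpu_query("AWS/EC2", "InstanceId", "i-EXAMPLE", start, end)
db_query = cpu_query("AWS/RDS", "DBInstanceIdentifier", "db-EXAMPLE", start, end)
```

Keeping the query construction separate from the network call makes it easy to reuse one time window for the IS instances and the database instance.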

[1] Competitive Analysis
[2] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/US_SingleMetricPerInstance.html
[3] https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/metrics_dimensions.html

Sachin-Mamoru commented 1 year ago

We now consider only the critical path when publishing performance benchmarks for WSO2 Identity Server. Based on feedback from the SA team, we will remove throughput data from the benchmarks, as it is not essential for capacity planning. As an improvement, we have introduced a random delay in the relevant test cases to simulate the pause that usually occurs on the login page while the end user enters login credentials. With this delay, we expect the measured response times to better reflect real-world usage. We have also included 3-node and 4-node deployments and produced the corresponding performance metrics, and we have added burst traffic to the performance test plan to show how the system handles sudden increases in traffic.
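The two load-shaping ideas described here, a random user-input delay and a traffic burst, can be sketched independently of any load testing tool. The snippet below is an illustration only: the delay bounds, user counts, and function names are assumptions and are not taken from the actual test plan.

```python
import random


def think_time_s(rng, min_s=2.0, max_s=6.0):
    """Random pause standing in for a user typing credentials.

    The bounds are illustrative assumptions, not values from the test plan.
    """
    return rng.uniform(min_s, max_s)


def burst_profile(duration_s, base_users, burst_users, burst_start_s, burst_length_s):
    """Target concurrent users for each second of the test: a steady base
    load with one rectangular burst on top."""
    return [
        burst_users if burst_start_s <= t < burst_start_s + burst_length_s else base_users
        for t in range(duration_s)
    ]


rng = random.Random(42)  # seeded so runs are repeatable
delays = [think_time_s(rng) for _ in range(3)]
profile = burst_profile(duration_s=10, base_users=100, burst_users=500,
                        burst_start_s=4, burst_length_s=3)
```

Because the delay is spent on the client side, it should be excluded from the reported response time; it only changes how requests are spaced, which is what makes the measured latency closer to what a real login experiences.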

For more information please refer to the mail thread - Presenting our performance results in an optimal way to do capacity planning for our customers

Sachin-Mamoru commented 1 year ago

As per the requested changes, the updated performance results were published for the following flows.

[images: performance result charts for the published flows]

Next steps:

Onboard the following scenarios, in order of priority:

  1. OIDC password grant including user attributes and groups in the id_token
  2. OIDC password grant including user attributes and groups in the id_token and roles in the access token
  3. JWT bearer grant including retrieving user attributes - this could give insights into signature verification performance
  4. OIDC authorization code grant including user attributes without consent
  5. OIDC authorization code grant including user attributes and consent

Sachin-Mamoru commented 11 months ago

We have published performance results for a selected set of scenarios of IS 6.1.0 with the enhanced representation of the performance metrics. We have addressed all the requested suggestions and improved the representation.

The selected performance test flows are as follows.

  1. Client Credentials Grant Type
  2. OIDC Auth Code Grant Redirect With Consent
  3. OIDC Auth Code Grant Redirect Without Consent (Please note: results are added only for this scenario, as a sample)
  4. OIDC Auth Code Grant Redirect Without Consent Retrieving User Attributes
  5. OIDC Auth Code Grant Redirect Without Consent Retrieving User Attributes and Groups
  6. OIDC Auth Code Grant Redirect Without Consent Retrieving User Attributes, Groups and Roles
  7. OIDC Password Grant Type
  8. OIDC Password Grant Type Retrieving User Attributes
  9. OIDC Password Grant Type Retrieving User Attributes and Groups
  10. OIDC Password Grant Type Retrieving User Attributes, Groups and Roles
  11. SAML2 SSO Redirect Binding

Additionally, we have identified that in the OIDC Auth Code Grant Redirect With Consent flow, when user attributes are requested in the access token or the id token, the AWS RDS database reaches 100% CPU utilization even at a concurrency of 500. We are currently tracking this in the issue [2] and will analyse it further.

The summary-graph.md file [1] provides comparison performance plots for the tested flows.

[1] https://github.com/wso2/performance-is/blob/performance-graphs/benchmarks/6.1.0/performance_visualization_v2/summary-graph.md

Sachin-Mamoru commented 10 months ago

Finalized Performance Test Flows - 7.0.0 Release

  1. Client Credentials Grant Type
  2. OIDC Auth Code Grant Redirect With Consent - With only random scopes (No user attributes or groups or roles are requested.)
  3. OIDC Auth Code Grant Redirect Without Consent - With only random scopes (No user attributes or groups or roles are requested.)
  4. OIDC Auth Code Grant Redirect Without Consent Retrieving User Attributes, Groups and Roles
  5. Burst Traffic with OIDC Auth Code Grant Redirect Without Consent Retrieving User Attributes
  6. OIDC Password Grant Type - With only random scopes (No user attributes or groups or roles are requested.)
  7. SAML2 SSO Redirect Binding
  8. Token Exchange Grant Type
  9. API-based authentication flow [1] (Added to the roadmap)

Finalized summary - https://github.com/wso2/performance-is/blob/performance-graphs/benchmarks/6.1.0/performance_visualization_v2/summary-graph.md

[1] https://github.com/wso2/product-is/issues/17060

Mail thread - Presenting our performance results in an optimal way to do capacity planning for our customers