wso2 / product-is

Welcome to the WSO2 Identity Server source code! For info on working with the WSO2 Identity Server repository and contributing code, click the link below.
http://wso2.github.io/
Apache License 2.0
748 stars 729 forks source link

Extensive database call for SCIM2 Users filter operation #20548

Open deshankoswatte opened 5 months ago

deshankoswatte commented 5 months ago

Describe the issue:

It has been noted that a large number of queries are executed when a SCIM2 multi-attribute filter with identity claims that contain large results is retrieved from the userstore, even though the count value is set to a minimum. Note that this issue should also be reproducible even if the filter does not contain identity claims.

How to reproduce:

A sample request for this would be as follows. Please note that the primary database is a JDBC userstore, and the config consider_max_limit_for_total_results has been disabled.

curl --location 'https://localhost:9443/scim2/Users?count=1&startIndex=0&attributes=userName%2Cemails%2Curn%3Aietf%3Aparams%3Ascim%3Aschemas%3Aextension%3Aenterprise%3A2.0%3AUser%3AaccountLocked&domain=PRIMARY&filter=urn%3Aietf%3Aparams%3Ascim%3Aschemas%3Aextension%3Aenterprise%3A2.0%3AUser%3AaccountLocked%20eq%20%22true%22%20and%20urn%3Aietf%3Aparams%3Ascim%3Aschemas%3Aextension%3Aenterprise%3A2.0%3AUser%3AaccountDisabled%20eq%20%22true%22' \
--header 'Authorization: Basic YWRtaW46YWRtaW4='

Given that the above query returns only one result and the total result count for the filter is 3, I’m continuing the explanation. When the above request is initiated, we can observe that the following queries are executed based on the total result count of the provided query. The first occurrence of this happens when the limit is set to 1 (count). The other two occurrences happens because we get the total results with the maximum username list length.

2024-06-13 10:26:53,640|f89ccf1b-1f44-47c0-958b-38573ec99bc6|https-jsse-nio-9443-exec-4|1|jdbc|1718254613639|executeQuery|SELECT UM_USER_ID FROM UM_USER WHERE LOWER(UM_USER_NAME)=LOWER(?) AND UM_TENANT_ID=?|jdbc:h2:./repository/database/WSO2SHARED_DB

2024-06-13 10:26:53,640|f89ccf1b-1f44-47c0-958b-38573ec99bc6|https-jsse-nio-9443-exec-4|0|jdbc|1718254613640|executeQuery|SELECT UM_DOMAIN_NAME FROM UM_DOMAIN WHERE UM_DOMAIN_ID = (SELECT UM_DOMAIN_ID FROM UM_UUID_DOMAIN_MAPPER WHERE UM_USER_ID = ? AND UM_TENANT_ID = ?)|jdbc:h2:./repository/database/WSO2SHARED_DB

2024-06-13 10:26:53,640|f89ccf1b-1f44-47c0-958b-38573ec99bc6|https-jsse-nio-9443-exec-4|0|jdbc|1718254613640|executeQuery|SELECT UM_USER_NAME FROM UM_USER WHERE UM_USER_ID=? AND UM_TENANT_ID=?|jdbc:h2:./repository/database/WSO2SHARED_DB

In addition to the above, we also execute the following query twice: once when the users are extracted with the count and once with the total result count, similar to the above.

2024-06-13 10:26:53,576|f89ccf1b-1f44-47c0-958b-38573ec99bc6|https-jsse-nio-9443-exec-4|0|jdbc|1718254613576|executeQuery|SELECT UM_USER.UM_USER_ID, UM_ATTR_NAME, UM_ATTR_VALUE FROM UM_USER_ATTRIBUTE, UM_USER WHERE UM_USER.UM_ID = UM_USER_ATTRIBUTE.UM_USER_ID AND UM_USER.UM_USER_ID IN ('625e3fc4-bf5e-4a2a-8493-7e93013fb2c3') AND UM_PROFILE_ID=? AND UM_USER_ATTRIBUTE.UM_TENANT_ID=? AND UM_USER.UM_TENANT_ID=?|jdbc:h2:./repository/database/WSO2SHARED_DB

Expected behavior:

Ideally, we should not query all the user details for the totalResults. We should be able to extract the correct totalResults in a more efficient manner.

Environment information:


sanjulamadurapperuma commented 4 months ago

Hi all,

Currently this issue has been fixed for scenarios which utilize only multiple identity claims in the SCIM2 filter but other flows are yet to be fixed.

Thanks, Sanjula