Captain-P-Goldfish / SCIM-SDK

a scim implementation as described in RFC7643 and RFC7644
https://github.com/Captain-P-Goldfish/SCIM/wiki
BSD 3-Clause "New" or "Revised" License
122 stars 38 forks source link

Filtration/Pagination does not work as expected #641

Closed jonastik closed 6 months ago

jonastik commented 6 months ago

Since we expect many records we use standard database for storage. So I have disabled auto-filtration feature as noted here.

I disabled it like this on ServiceProvider:

ServiceProvider.builder()
                .filterConfig(FilterConfig.builder().supported(false).build())

and on ResourceType:

ResourceType groupResourceType = resourceEndpoint.registerEndpoint(groupEndpointDefinition);
groupResourceType.setFeatures(ResourceTypeFeatures.builder().autoFiltering(false).autoSorting(false).build());

so I expected that pagination will work by calling following GET endpoint ../Groups?startIndex=1&count=2 but I was surprised that library modified count to 1.

I have debugged it and found out that you calculate effectiveCount as following in RequestUtils:

public static int getEffectiveCount(ServiceProvider serviceProvider, Integer count) {
        if (count == null) {
            return serviceProvider.getFilterConfig().getMaxResults();
        } else {
            return count < 0 ? 0 : Math.min(count, serviceProvider.getFilterConfig().getMaxResults());
        }
    }

so 1 was used from FilterConfig (default value).

I would expect that if I disable auto filtration feature then its related configuration is not used for any calculation. I was forced to set maxResult propery on filterConfig to make it work.

ServiceProvider.builder()
                .filterConfig(FilterConfig.builder().supported(false).maxResults(MAX_LIMIT).build())

Could you please note it on this page more clearly or fix it the code ?

Captain-P-Goldfish commented 6 months ago

This is the expected behaviour.

the value from serviceProvider.getFilterConfig().getMaxResults() is not a default value. It is the maximum number of entries that may be retrieved with a single request. The SCIM-SDK is passing the minimum value of request and config value to the listResources method to make sure that you do not need to handle this yourself. If you configure a maximum value you should not need to check in each request if the value was exceeded by the client-request. Therefore, the SCIM-SDK will do this for you.

I personally reduce the max-value in most cases to 250 per request, which I consider a reasonable paging size. The client can set the paging size to anything below this value for UI displaying purposes but the client is never able to ask for more entries.

jonastik commented 6 months ago

Since I did not explicitly set max results in my case the value was actually defaulted to 1. See DEFAULT_MAX_RESULTS constant in FilterConfig.

I was confused by fact that I disabled some feature of the system, but its configuration has still effect on behaviour. When I disable filtering in the system I would expect that no filtering will be done by library. That the reason I was a bit surprised. But I understand that server always wants to have some kind of protection against clients.

Problem is that we have to know maximum count of all supported clients ahead in order to properly set this value from the beginning. So we don't send back only half of subset. Okta Idp for example uses count 100 according to their docs. But I did not find any info for pagination for Microsoft Entra ID. It seems that they don't use it.

Captain-P-Goldfish commented 6 months ago

Yes, I actually expected that max-results is always set. Maybe I should add a warning if it defaults to 1? Might prevent such problems in the future.

If the ServiceProvider sets Filter.supported=false the list-endpoint cannot be accessed anymore. The autoFiltering and autoSorting features are standalone and can be activated and deactivated explicitly as shown in this example:

{
  "schemas": [
    "urn:ietf:params:scim:schemas:core:2.0:ResourceType"
  ],
  "schema": "urn:ietf:params:scim:schemas:core:2.0:ServiceProviderConfig",
  "id": "ServiceProviderConfig",
  "name": "ServiceProviderConfig",
  "description": "the service providers configuration",
  "endpoint": "/ServiceProviderConfig",
  "urn:gold:params:scim:schemas:extension:url:2.0:ResourceTypeFeatures": {
    "singletonEndpoint": true,
    "autoFiltering": false,
    "autoSorting": false
  },
  "meta": {
    "resourceType": "ResourceType",
    "created": "2019-10-18T14:51:11+02:00",
    "lastModified": "2019-10-18T14:51:11+02:00",
    "location": "/ResourceTypes/ServiceProviderConfig"
  }
}

Problem is that we have to know maximum count of all supported clients ahead in order to properly set this value from the beginning

I am not quite sure what you mean. You can modify the maxResults value during runtime and it will take effect immediately. And SCIM is designed so that the client is able to get the maxResults value from the /ServiceProviderConfig endpoint to be able to work with paging. If the client is always using a fixed value that is not a problem for as long as the client sends the count-value and the maxResults value is higher than the fixed count-value of the client. Each client is always able to select a count value that is smaller than your configured maxResults value.

jonastik commented 6 months ago

I get it now. Thank you. I did not realised that clients get maxResults value from/ServiceProviderConfig endpoint.