jelhub / scimgateway

Using SCIM protocol as a gateway for user provisioning to other endpoints
MIT License
176 stars 57 forks source link

Issues with default? behavior for generating "id" value on created resources #97

Closed ZollnerdMSFT closed 1 year ago

ZollnerdMSFT commented 1 year ago

https://github.com/jelhub/scimgateway/blob/e85ef9041d53546d4c1699de0372b4c7788bacc0/README.md?plain=1#L946C1-L946C84

Not sure what the default behavior is as I haven't tried to locate it in the code - but while the value for userName (also referred to as userId here) is maybe allowable in the spec, I'd caution against using it (or encouraging it via making it the default choice, if it is?) versus using a machine-generated globally unique value (GUID, for instance..).

Using userName's initial value to populate id's value could be interpreted to be against this section of the description of the id attribute in RFC 7643 3.1:

This identifier MUST be unique across the SCIM service provider's entire set of resources. It MUST be a stable, non-reassignable identifier that does not change when the same resource is returned in subsequent requests. The value of the "id" attribute is always issued by the service provider and MUST NOT be specified by the client.

Little bit of room for interpretation there, though.

The bigger problem is that if userName is mutable, then you can end up in a trippy situation:

Versus using a GUID generator to populate the initial value, for instance..

jelhub commented 1 year ago

Hi,

I agree with everything mentioned, but unfortunately this is not the real world.

There are a lot of custom endpoints that do not have any globally unique value (GUID), they use "UserID" (SCIM userName) as unique id.

In these use cases, SCIM userName = SCIM id

Anyhow, your plugin decides which SCIM id to be returned, and there are no restrictions at scimgateway.

Regards, Jarle

ZollnerdMSFT commented 1 year ago

I'm not a developer, so understanding code is challenging but not impossible for me.. given that, I haven't checked what the default is in the code. I was told by someone who has implemented scimgateway that this is the default behavior, so if it isn't, please forgive me.

The real world argument doesn't hold much weight in practice, unfortunately. If you think of a standard/specification as a mutual contract between implementing parties, then regardless of whatever challenges may exist in implementing the standard with the data source/application behind the SCIM API, the requirements of the specification need to be met. The existence of many instances of noncompliance with the specification requirements also only serves to weaken the standard.

This specific issue being slightly open to interpretation from the perspective of the language in the standard, the bigger problem really is the interoperability problem with SCIM clients. At small scale the ramifications of this are pretty small, especially if the limitations introduced by using the same value twice are known. From the perspective of someone using scimgateway to allow an application to receive SCIM from a specific source - a specific SCIM client for instance - then there are processes and configuration changes that can be adopted in order to protect against problems like what I described in the original post.

If any change were to be made, it could be to introduce mitigating controls (i.e.: userName = immutable mutability rather than readWrite) that applied only if the userName value also mapped to id. Documentation on the importance of not mapping that if possible, or just disallowing that configuration due to the likely noncompliance with the SCIM spec, could also be options.

For anyone wishing to use this for a wider audience - say for a multi-tenant SaaS app that may have a dozen different SCIM clients calling it from dozens or hundreds of customers/tenants.. this behavior is problematic, sadly. If caught early, the configuration can be changed without harm, but if it isn't caught until there is already significant investment on the given SCIM implementation, then the config change could break existing clients even if it would solve the issue with others.

This is your project, so understandably final decision on whether to take any action is up to you. I'm heavily involved with SCIM and while dealing with an implementer of your package recently someone on my team caught the userName = id problem during that interaction. Prompted by that, I wanted to at least share the feedback, as I am always encouraged to see resources being created to help with (spec-compliant) SCIM adoption.

Cheers,

Danny

jelhub commented 1 year ago

userName=id is not default behavior. Plugin decides behavior.

For some of the example plugins that are included we have:

plugin-loki: userName=id

plugin-azure-ad (config file decides mapping): https://github.com/jelhub/scimgateway/blob/master/config/plugin-azure-ad.json#L113

plugin-ldap (default configured for AD and config file decides mapping): https://github.com/jelhub/scimgateway/blob/master/config/plugin-ldap.json#L137

ZollnerdMSFT commented 1 year ago

Ah - got it. Thanks for humoring my rambling then.