IdentityPython / SATOSA

Proxy translating between different authentication protocols (SAML2, OpenID Connect and OAuth2)
https://idpy.org
Apache License 2.0
197 stars 121 forks source link

Microservices: Unexpected behaviour of Hasher & HashProcessor #437

Open github-canni opened 1 year ago

github-canni commented 1 year ago

Code Version

Satosa 8.2.0 (docker), running an saml2 frontend and oidc backend, proxy from OIDC to SAML

Expected Behavior

Current Behavior

(1) Hasher and HashProcessor, e.g.

(2) Trying to hash an attribute, which was scoped and generated by the processor "primary identifier", return unexpected results:

Possible Solutions

(1) Align Hasher / Hashprocessor outputs with preserved backwards compatibility, e.g. by providing a new parameter to align hash functions (rewrite/align/split internal hashing processes)

(2) There might be a fix for the "primary identifier" processor (which might be the root cause of that issue) - otherwise a fix for Hasher and HashProcessor could be provided

Workarounds

Of course own / modified / custom modules could be written. But usually I try to find a way with default functions.

(1) Use either or, not both. The issue is that Attribute Processor > HashProcessor cannot update/modify subject_id. In the microservice "hasher" the subject_id can be hashed as well among with the attributes. The HashProcessor seem only to be able to hash attributes from the internal map (I might be wrong here and would love to get feedback on that matter).

(2) Do not hash the target attribut produced from "primary identifier" with Hasher microservice but with AtributeProcessor > HashProcessor. There you could use "ScopeProcessor" appending an arbitrary string before hashing with the HashProcessor. This is a workaround as HashProcessor would fail otherwise.

Steps to Reproduce

(1)

  1. Hash an input string with hasher and hashprocessor, using the same input string, salt and algorithm, compare outputs

(2)

  1. Use microservice "primary identifier" with add_scope "issuer_entityid", e.g. producing '2b27510c-6297-311b-7004-ab7113390135https://host/oidc/auth', e.g. let the micrososervice place the generated identifier in attribute 'uid'
  2. Then use Hasher or HashProcessor to hash the 'uid' to see results mentioned above
c00kiemon5ter commented 1 year ago

Hi, I think there is an assumption that those two micro-services should work the same way, but that is not true.

The HashProcessor seem only to be able to hash attributes from the internal map (I might be wrong here and would love to get feedback on that matter).

The Attribute Processor is about processing attributes. The subject_id element corresponds to the SAML Subject/NameID element value, which is not an attribute.

Use either or, not both.

It depends on what you are trying to achieve. Each has its place and its own behaviour. If you want to combine them, you can also do that. It all depends on your use case. It is certain that they don't behave in the same way.

github-canni commented 1 year ago

Thanks for having a look into this.

I'll dig a bit deeper:

This is also related to federational use cases with best practices for identifier migration and support in mind. Also I think this does not explain the mentioned errors related to processing the primary_identifier attribute, described in (2); (2.1); (2.2), this should not happen in the first place I guess.

Example use case (not exactly the same as the one I'm trying to achieve, but for relevance/illustration purposes):

But:

default limitations (without modifications / extensions / custom processors):

At least the second part (2) should be considered a bug I guess (if I did not do anything else wrong)

c00kiemon5ter commented 1 year ago

Identifier migration means that the value that used to be released to some attribute will now be released as the value of one or more additional attributes. Then, after some time, the old attribute may stop being released.

The easiest way to do this is to copy the value of one attribute to the other attributes that should have the same value.

The Hasher and the HashProcessor are different things. They work in a different way. You should not expect them to do the same thing. If they were identical we would not need both.


Having said this, (2) does sound problematic but it is a separate issye. By (2) I am referring to:

(2) Trying to hash an attribute, which was scoped and generated by the processor "primary identifier", return unexpected results:

It looks to me that the primary_identifier plugin has a bug here: https://github.com/IdentityPython/SATOSA/blob/754dcc2/src/satosa/micro_services/primary_identifier.py#L259

instead of setting the value directly, it should set it as the only element of a list.

-            data.attributes[primary_identifier] = primary_identifier_val
+            data.attributes[primary_identifier] = [primary_identifier_val]

Then, both issues mentioned with the Hasher and the HashProcessor should go away. Can you make this change locally and confirm?

c00kiemon5ter commented 10 months ago

Any update on this @github-canni ?

github-canni commented 9 months ago

Thanks for getting back, and sorry for my late reply. I had to rebuild my prior setup as I worked around the issues experienced.

Regarding (2): Hashing a value from the "primary identifier" microservice

Then, both issues mentioned with the Hasher and the HashProcessor should go away. Can you make this change locally and confirm?

I tried it, and the fix seems to work.


Regarding (1): Comparing hash results from Hasher / HashProcessor

Identifier migration

Identifier migration means that the value that used to be released to some attribute will now be released as the value of one or more additional attributes. Then, after some time, the old attribute may stop being released. The easiest way to do this is to copy the value of one attribute to the other attributes that should have the same value.

In terms of "identifier" migration at SPs: As you stated correctly in a reply before, the SAML Subject/NameID element is not an attribute, so we cannot copy from/to this value with default microservices (exception: primary identifier can replace the internal "subject_id", which is not an attribute, and Hasher can hash it).

This consideration for identifier migrations was the origin of the experienced hashing issues:

I wanted satosa to deliver the same hash value: a) as the unique part of the NameID (= the internal subject_id) and b) as a value for a successor attribute (e.g. the attribute pairwise-id)

This is considered best practice to help SPs relying on the older NameID migrate to another identifier attribute seamlessly.

But: Hashing the subject_id with hasher and then trying to get the same hash value delivered as an attribute by reproducing the hashing process with HashProcessor using the same salt will fail as they produce different hash values.

Expectations I think most people will expect that hashing functions with the same algorithm and salt within (what they might consider the "same" application) "satosa" will produce the same result. I would suggest adding some documentation that both work differently and do not produce the same hash result, even under the same conditions (algorithm and salt)

Solutions (for documentation purposes): Because of your fix, the hashing might now be done completely in Hasher (also for the attribute produced by primary identifier, so that the same values become available). This was not possible before, due to the documented issue. Other solutions were mentioned in the initial comment.

Thanks for your work!