Hashed claims discovery

jricher commented 8 years ago

In the current protocol, particularly with regard to pushed claims, the AS is going to have some set of policies, each defined by the claims required to fulfill it. In order to help the client know what to do with a need_info response, the AS is allowed to tell the client which claims are required, in the form of issuers and claim names. The client doesn't get to find out what claim values are required to fulfill any policies, as this would leak private information about not only Alice but also anyone Alice has created a policy for. However, since the client doesn't know what claim values are required, a well-intentioned client could erroneously present a bunch of claims from the RqP that don't have any chance of fitting the policy but would leak information about the RqP.

A potential alternative would be for the AS to hash the claim values and return the hashes with the claim names. This would allow the client to perform the same hash against any values that it knows about and figure out if it's worth submitting these claims for this policy or not, preventing unnecessary leaks. The hash would ideally be calculated using a randomized salt, such as the current ticket value for the transaction. (See #205 and #239 for information about proper rotation of tickets at each step.)
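As a rough illustration of the AS side of this proposal, here is a minimal Python sketch. It assumes HMAC-SHA256 keyed with the current ticket as the salted hash; the proposal does not name an algorithm. The function names, the "value_hash" field, the issuer URL, and the ticket string are all hypothetical and are not part of any UMA specification.

```python
import hashlib
import hmac


def hash_claim_value(ticket: str, claim_value: str) -> str:
    """Hash a claim value, using the current ticket as the salt.

    HMAC-SHA256 is an assumption made for this sketch.
    """
    return hmac.new(
        ticket.encode("utf-8"),
        claim_value.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def build_required_claim(ticket: str, issuer: str, name: str, required_value: str) -> dict:
    """Build a hint the AS could return alongside need_info.

    The hint carries the issuer and claim name, plus a hash of the
    required value instead of the value itself ("value_hash" is a
    hypothetical field name).
    """
    return {
        "issuer": issuer,
        "name": name,
        "value_hash": hash_claim_value(ticket, required_value),
    }


# Hypothetical ticket and issuer, for illustration only.
ticket = "ticket-for-this-step"
hint = build_required_claim(ticket, "https://idp.example.com", "email", "bob@gmail.com")
```

Because the salt is the ticket, the same required value hashes differently each time the ticket is rotated, so hashes from one step cannot be compared directly with hashes from another.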

agropper commented 8 years ago

It sounds like a feature but, even as a 'professional' privacy advocate, I'm at a loss to think of a real-world use-case.

Adrian

In reply to Justin Richer's comment of Thu, May 26, 2016 at 3:38 PM.

xmlgrrl commented 8 years ago

Let's say that Alice wants Bob to prove that he controls the email address bob@gmail.com. In the current trust elevation subprotocol, there's a means for the AS to convey to the client that a claim of the required type with the required issuer is needed to satisfy the policy.

It would be handy for the AS to tell the client what values would be valid, in case the client already has matching values for the requesting party using that client.

However, if the AS were to provide information such as "We're looking for you to supply a claim value of 'bob@gmail.com' for this claim", and it happens to be the case that the requesting party attempting access is actually charlie@live.com, then the client will be able to infer a fact about Alice's policy that they shouldn't have seen (and could even leak that fact to Charlie himself), along with flinging an attribute about Bob around the interwebs.

The method being suggested is a way to allow the client to confirm whether a claim value already in its possession is a match for the claim value being sought, without exposure of the value being sought in the case that they don't match. In this case, if it's Charlie behind the client, then the two hashes won't match and the client will still never have seen the string "bob@gmail.com". But if it's Bob with the right email address behind the client, then the hashes will line up and the client will know it's worth it to push that claim.

(All this is applicable only in the case where pushed claims are being used in concert with the in-band claim requirements need_info stuff, vs. out-of-band negotiations for initial claim pushing. I expect the latter will be done quite often, but if the former is ever to be used, then I suspect we owe that part of the spec a privacy-sensitive solution for this challenge...)
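The Bob and Charlie scenario can be sketched from the client's point of view as follows. This is only an illustration under assumptions: HMAC-SHA256 with the ticket as salt stands in for whatever hash would actually be specified, and the dictionary layout, the "value_hash" field, the issuer URL, and the function names are hypothetical.

```python
import hashlib
import hmac


def hash_claim_value(ticket: str, claim_value: str) -> str:
    """Salted hash of a claim value (HMAC-SHA256 is assumed for this sketch)."""
    return hmac.new(
        ticket.encode("utf-8"),
        claim_value.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def claims_worth_pushing(ticket: str, required_claims: list, held_claims: list) -> list:
    """Return only the held claims whose hashed value matches a requirement."""
    matches = []
    for required in required_claims:
        for held in held_claims:
            same_kind = (
                held["issuer"] == required["issuer"]
                and held["name"] == required["name"]
            )
            if not same_kind:
                continue
            candidate = hash_claim_value(ticket, held["value"])
            if hmac.compare_digest(candidate, required["value_hash"]):
                matches.append(held)
    return matches


ticket = "ticket-for-this-step"  # hypothetical ticket value
issuer = "https://idp.example.com"  # hypothetical issuer

# What the AS would send: issuer, claim name, and a hash of the required value.
required = [
    {"issuer": issuer, "name": "email", "value_hash": hash_claim_value(ticket, "bob@gmail.com")}
]

bob_claims = [{"issuer": issuer, "name": "email", "value": "bob@gmail.com"}]
charlie_claims = [{"issuer": issuer, "name": "email", "value": "charlie@live.com"}]

bob_to_push = claims_worth_pushing(ticket, required, bob_claims)
charlie_to_push = claims_worth_pushing(ticket, required, charlie_claims)
```

With Bob behind the client, his email claim is selected for pushing. With Charlie behind the client, nothing is selected, so his email address is never sent and the client has only seen a hash of the value being sought.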

agropper commented 8 years ago

Thanks, Eve. Now I understand.

Adrian

In reply to Eve Maler's comment of Friday, June 3, 2016.

jricher commented 8 years ago

I think it could help inform a client for the out-of-band stuff, since the client could in its UI hint to the user what claims it might need. The AS can do that through its UI though, and might be better at it. We definitely need to think through the privacy considerations of this.

xmlgrrl commented 7 years ago

This got discussed a bit in the ad hoc continuance of UMA telecon 2017-01-12, though not concluded.

xmlgrrl commented 7 years ago

Per UMA ad hoc telecon 2017-03-06: This could have implications for the whole need for protected discovery and so on. But on the other hand, is it an enabler for rainbow table discovery? So it needs some proper thinking and work. So this gets the close-without-action and extension labeling.
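To make the guessing concern concrete, here is a small Python sketch under the same assumptions as the proposal: the ticket is the salt, and HMAC-SHA256 is assumed as the hash. The function names and candidate list are hypothetical. A per-ticket salt defeats tables precomputed in advance, but the client necessarily knows the ticket, so it can still test any list of candidate values against the hash it was given.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def hash_claim_value(ticket: str, claim_value: str) -> str:
    """Salted hash of a claim value (HMAC-SHA256 is assumed for this sketch)."""
    return hmac.new(
        ticket.encode("utf-8"),
        claim_value.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def guess_required_value(ticket: str, value_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Try each candidate value against the hash returned by the AS.

    Returns the first candidate that matches, or None if none does.
    """
    for candidate in candidates:
        if hmac.compare_digest(hash_claim_value(ticket, candidate), value_hash):
            return candidate
    return None


ticket = "ticket-for-this-step"  # hypothetical ticket value
value_hash = hash_claim_value(ticket, "bob@gmail.com")  # what the AS would send

# A curious client holding a list of plausible addresses.
address_book = ["alice@example.com", "charlie@live.com", "bob@gmail.com"]
recovered = guess_required_value(ticket, value_hash, address_book)
```

In this sketch the client recovers the required value whenever it appears in its candidate list, which is why low-entropy claim values such as email addresses need particular care under any hashed-hint scheme.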

KantaraInitiative / wg-uma

Hashed claims discovery #254