w3ctag / design-reviews

W3C specs and API reviews
Creative Commons Zero v1.0 Universal

Trust Token API #414

Closed csharrison closed 2 years ago

csharrison commented 4 years ago

Hello TAG!

I'm requesting a TAG review of:

Further details:

We recommend the explainer to be in Markdown. On top of the usual information expected in the explainer, it is strongly recommended to add:

You should also know that...

We’re still very early stage here, just looking to get TAG review earlier rather than later.

We'd prefer the TAG provide feedback as (please select one):


Please preview the issue and check that the links work before submitting. In particular, if anything links to a URL which requires authentication (e.g. Google document), please make sure anyone with the link can access the document.

¹ For background, see our explanation of how to write a good explainer.

hadleybeeman commented 4 years ago

Hello! @hober and I discussed this at our face to face in Cupertino.

Two main points from us:

  1. What happens if the issuer is a bad actor?

This design only works in the way you've intended if the issuer is properly anonymising and randomising the tokens. What happens if the issuer isn't a trustworthy organisation?

And since the user has no role in selecting the issuer, the user then gets no say in who that might be. If we end up with an ecosystem of dodgy issuers, can the user protect themselves?

It seems like this could be mitigated by an approach like the one in Web Payments, where the browser keeps a set of payment methods that the user is happy with. The shopping site has a list of payment methods it supports. At purchase time the site supplies its options; the browser picks from those. This is a nice quality: the user agent has a role in choosing. This is the role of the user agent.

We recognise that users can't express their preferences on advertisers at all. Could a similar approach work here?
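A minimal sketch of what that Web Payments-style selection might look like if applied to issuers, assuming the site lists the issuers it supports and the user agent holds a set the user is happy with. The function name and the example origins are hypothetical and are not part of the Trust Token proposal.

```python
from typing import Optional, Sequence, Set


def pick_issuer(site_supported: Sequence[str], user_trusted: Set[str]) -> Optional[str]:
    """Return the first issuer offered by the site that the user agent also trusts.

    The site's list is treated as ordered by the site's preference; the user
    agent only ever picks from the overlap, so it keeps a role in choosing.
    """
    for issuer in site_supported:
        if issuer in user_trusted:
            return issuer
    return None  # no overlap: the user agent declines to use any issuer


# Hypothetical origins, for illustration only.
site_options = ["https://issuer-a.example", "https://issuer-b.example"]
trusted_by_user = {"https://issuer-b.example", "https://issuer-c.example"}
chosen = pick_issuer(site_options, trusted_by_user)
```

Here `chosen` is the one issuer both sides accept; if the sets do not overlap, no issuer is used at all.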

  2. We're concerned about the potential for trust tokens to be used as categories to identify or describe the users.

You've written in the explainer:

The issuer can store a limited amount of metadata in the signature of a nonce by choosing one of a set of keys to use to sign the nonce and providing a zero-knowledge proof that it signed the nonce using a particular key or set of keys.

You say it's a limited amount of metadata: how many bits? Even a small number of bits could be risky with certain bad issuers.
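To make the "how many bits?" question concrete: if the only metadata channel is the issuer's choice of signing key, the capacity is the base-2 logarithm of the number of keys. The sketch below assumes exactly that simplified model; the function name is hypothetical.

```python
import math


def metadata_bits(num_keys: int) -> float:
    """Bits of metadata an issuer can encode by choosing one of num_keys keys."""
    if num_keys < 1:
        raise ValueError("an issuer needs at least one key")
    return math.log2(num_keys)


one_key = metadata_bits(1)    # a single key carries no metadata
two_keys = metadata_bits(2)   # two keys carry one bit
six_keys = metadata_bits(6)   # six keys carry a little over two and a half bits
```

Under this model, every doubling of the key set hands the issuer one more bit per token.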

We'll open issues in your github repo; these notes are here so that we have them.

dvorak42 commented 4 years ago

(Continuing thread on the trust-token-api issues).

1) The crypto in this scheme is resilient against a bad actor on either side (preventing token forgery from the client and preventing loss of anonymisation from the issuer). The issuer would only be able to subdivide the users of that issuer based on the presence or absence of the token (and in the private metadata case, the value of that bit of information).
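The "issuer cannot de-anonymise" property can be illustrated with a toy blind-signature round trip. This is a sketch under loud assumptions: the thread says the real protocol builds on Privacy Pass and OPRF primitives, whereas the code below swaps in textbook RSA blind signatures with small, insecure parameters purely because they fit in a few lines. All names are hypothetical.

```python
import hashlib

# Toy textbook-RSA key built from two Mersenne primes. Far too small and
# unpadded to be secure; illustration only, not the Trust Token construction.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
E = 65537                            # issuer's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # issuer's private exponent (Python 3.8+)


def nonce_to_int(nonce: bytes) -> int:
    """Map a client-chosen nonce to an integer modulo N."""
    return int.from_bytes(hashlib.sha256(nonce).digest(), "big") % N


def blind(message: int, r: int) -> int:
    """Client: hide the message with blinding factor r before sending it."""
    return (message * pow(r, E, N)) % N


def issuer_sign(blinded: int) -> int:
    """Issuer: sign the blinded value; it never sees the underlying message."""
    return pow(blinded, D, N)


def unblind(blind_signature: int, r: int) -> int:
    """Client: strip the blinding factor, leaving a signature on the message."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(message: int, signature: int) -> bool:
    """Anyone: check the signature against the public key (N, E)."""
    return pow(signature, E, N) == message


message = nonce_to_int(b"example-nonce")
r = 123456789                        # in practice a fresh random value coprime to N
token_signature = unblind(issuer_sign(blind(message, r)), r)
```

The issuer only ever handles the blinded value, so at redemption time it can check that `token_signature` is valid without being able to match it to the issuance request it signed.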

There are some issues that can occur if a large number of issuers are attempting to be malicious, where each issuer uses the bit of information it has to divide its userbase via different non-trust-related metrics. Having an allow/block list that the UA supports would help mitigate this issue.
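A back-of-the-envelope sketch of the colluding-issuer concern, assuming each issuer contributes a fixed number of independent bits and that a single party can read all of them. The model ignores the extra presence/absence signal, and the function names are hypothetical.

```python
def distinguishable_groups(num_issuers: int, bits_per_issuer: int = 1) -> int:
    """Number of distinct labels that colluding issuers can jointly assign."""
    return 2 ** (num_issuers * bits_per_issuer)


def issuers_needed(population: int, bits_per_issuer: int = 1) -> int:
    """Smallest number of colluding issuers able to give each user a distinct label."""
    bits_required = (population - 1).bit_length()   # ceil(log2(population))
    return -(-bits_required // bits_per_issuer)     # ceiling division


needed_one_bit = issuers_needed(8_000_000_000)       # one bit per issuer
needed_two_bits = issuers_needed(8_000_000_000, 2)   # two bits per issuer
capped_groups = distinguishable_groups(2)            # UA allows only two issuers
```

Under these assumptions a few dozen one-bit issuers would be enough to label a population of eight billion individually, while a UA list that caps the number of usable issuers at two limits the split to four groups.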

2) Depending on the use case, different numbers of bits may be reasonable. For the web anti-fraud use case, there are compelling arguments for having one bit (to avoid the presence of a token telling a malicious actor they've successfully passed the fraud system/CAPTCHA/etc.); beyond that, each UA would need to consider the privacy/use-case tradeoffs carefully. This may interact with ideas such as a privacy budget.

hadleybeeman commented 4 years ago

Also, it looks like your use case might be similar to the Verifiable Credentials work. It would be useful to talk to them and determine if you think this is a competing proposal, or where the overlaps/differences are. @burnburn @stonematt are the chairs.

dvorak42 commented 4 years ago

There's a bit of overlap, but for Trust Tokens we are only looking to propagate a tiny amount of trust information (1 or 2 bits), and the protocol needs to be resilient against bad actors where the issuer tries injecting more information into the token/claim or doing other forms of watermarking/fingerprinting of the token issuance/redemption. Given the breadth of information that a Verifiable Credentials claim contains, and the trust placed in the credential/claim issuer there, it's probably reasonable for these to remain separate proposals with different threat models.

It's possible that the redemption attestation portion of Trust Tokens might be adaptable to look like Verifiable Credentials, though a simple public key signature scheme works fine for that.

torgo commented 4 years ago

@dvorak42 @csharrison we're just trying to make some progress on this issue. While we're doing that, can you let us know if there have been developments recently on the spec, and especially if there is any information on implementations and use of this? Also it looks to us like there is currently no requirement for asking for user permissions. If this is the case, can you expand on the rationale here? It looks to us like this is a very powerful API that cuts across origins, and that potentially violates the same origin principle. We are concerned that users would not be expecting information from one domain to be available to another domain.

dvorak42 commented 4 years ago

There has been some work on the spec side for the underlying protocol (Privacy Pass), which is going through the IETF standardization process, and as that updates we'll be updating the Trust Token API design. Initial work has begun on implementing this API in Chromium, and we hope to run small-scale experiments with it soon to verify the feasibility of this API and whether it is sufficient for the use cases that might need it. We currently don't require user permissions, as the capabilities of this API are currently substantially less than for ordinary 3P content within a page which don't require permissions, we'll likely need a new model if we try to move a lot of these 3P-esque capabilities behind permissions, as prompting on every new page visit that uses these capabilities (even just in the CAPTCHA case, where you'd need to accept a user permission before using the CAPTCHA or be forced through a longer flow) would cause user fatigue.

hober commented 4 years ago

See also Mozilla's position on Privacy Pass.

hober commented 4 years ago

the capabilities of this API are currently substantially less than for ordinary 3P content within a page which don't require permissions, we'll likely need a new model if we try to move a lot of these 3P-esque capabilities behind permissions

@atanassov and I took another look at this during our Wellington F2F. We had a bit of trouble parsing your comment; could you try to clarify this bit for us? Specifically, when you say "ordinary 3P content within a page which don't require permissions," could you give us a concrete example? You say that "we'll likely need a new model if we try to move a lot of these 3P-esque capabilities behind permissions". If you look at the current browser landscape, do you think it's reasonable to expect "a lot of these 3P-esque capabilities [to move] behind permissions"? That is, maybe the time to look into finding a new model is now?

hober commented 4 years ago

Hi @dvorak42!

@plinss and I took another look at this in this week's TAG F2F, and we're hoping you could answer some of the questions I asked in my last comment.

dvorak42 commented 4 years ago

Sorry, missed the original response.

3P cookies/storage are the current type of content that isn't primarily behind active user permissions. I agree that as the browser landscape moves towards limiting 3P content we need some sort of model, but I'm not sure that using permissions as they currently exist is the right approach here. Requiring the user to click through permissions on every page that wants to mitigate fraud/DoS/etc. would end up with user fatigue. There's also the question of whether having a new model for these sorts of 3P-esque capabilities should be done on an API-by-API basis or if there should be a more holistic approach to sorting out how to handle these types of capabilities.

davidvancleve commented 3 years ago

Guten TAG,

Motivated by a likely paucity of tokens available on mobile, we're thinking through ways to expand trust token coverage by supporting on-device token issuance; we'd appreciate expanding the scope of this TAG review to include any more concrete subsequent design for on-device token issuance, too. (Updates to follow in the linked bug, and in edits to docs in the Trust Tokens repository.)

Thanks!

torgo commented 3 years ago

Hi @davidvancleve @csharrison - We're just coming back to this in our virtual f2f. @plinss will potentially open up an issue with you about the tracking potential we see. In the meantime, could you give us a brief update on where things are with your experimentation using this technology?

dvorak42 commented 3 years ago

Currently we're running an origin trial in Chrome to see whether the signal in a token is enough to be a suitable signal for anti-fraud purposes. We've been reaching out to folks to try getting more participants in the origin trial to see what use cases the API can be useful for, but due to the complexity of spinning up an issuer/redeemer setup, we haven't gotten too many external participants running their own code. We're in the process of rolling out demo sites to test the redemption side of the API, and a library to support folks running their own issuer during the OT, which will hopefully allow more folks to experiment with the API.

torgo commented 3 years ago

Hi @martinthomson - we are just reviewing this in the TAG f2f this week and we were wondering if there was any updated research or position from Mozilla beyond what Tess pointed to from March of last year?

hadleybeeman commented 3 years ago

Hi @dvorak42 @csharrison. We're just wondering how the origin trial is going? Have you learned anything that is changing your approach?

rhiaro commented 3 years ago

I see in the privacy considerations:

At issuance, we require user activation with the issuing site.

and was wondering if you can go into more detail about what this looks like from the user's perspective?

dvorak42 commented 3 years ago

While we don't have concrete numeric data yet on how effective the API is, the OT and external feedback have indicated that some parts of the API need a few more toggles to support various use cases. The largest change has been making the redemption record a free-form blob that issuers can structure however makes the most sense for them. This change also introduces the possibility of merging the various redemption flows into one API (the issuer can decide whether or not to return a redemption record, which matches either the previous 'raw-token-redemption' or 'srr-token-redemption' flow).

We also need to add more explicit support for issuers not necessarily being the first party on an issuing site. For the CAPTCHA use case, the CAPTCHA issuance logic might be embedded in sites as 3P content, and not be the same as the top-level page the user is visiting. This goes along with potentially optimizing those paths (allowing an issuance to also be a redemption in the case that you want a redemption record at the time the issuer is issuing tokens, or you want to use the presence of a redemption record to guide the decision to provide more tokens).

This ties in a bit with the user activation question. The actual mitigation is that we want to have a signal that the user is intentionally navigating to/interacting with the page, rather than this page being loaded in the background or via a long redirect chain through a ton of sites that are issuing tokens. From a user's perspective, the user activation signal is implicit in their use of a web page using the API, rather than an explicit pop up they have to click or prompt they have to interact with.

martinthomson commented 3 years ago

I don't really have a lot to add here. There has been some activity that I haven't been following closely, but I'm not seeing any concrete progress on the truly thorny pieces of this.

Many of the privacy properties of the underlying Privacy Pass work depend on the client having a clear understanding of what information it is propagating across privacy boundaries. As a generic mechanism, this becomes essentially impossible to validate without knowledge of the application context and the information that is being exchanged. I don't think that we are in any position to say that a generic framework like the one proposed is workable.

There are things that might be OK to enshrine in the platform with only limited safeguards (those safeguards might extend to including explicit consent, though opinions on what is appropriate here differ widely). Steven is talking here about using this for CAPTCHA, in which case the information being carried might be "X believes that this client is not a robot", which is one of the best example applications of this that we currently have. Even there, there are difficult caveats to work through. That includes those issues Steven mentions, but larger questions too.

I haven't seen progress (though, again, I'm not paying close enough attention, sorry) to suggest that the embedding of information through the choice of token issuer keys has been adequately addressed, nor the corresponding issue of centralization that the solutions to that problem generally lead to. These are really difficult problems, even for the relatively narrow space of making assertions about the difference between natural and artificial intelligence.

I don't know if the TAG has any established policy with respect to research projects. The IETF is generally careful to identify and avoid projects that include a significant exposure to questions unanswered in science. This is one of those cases where you might be best deferring any concrete resolution until those central questions are answered.

My intent here is not to discourage the research (this could be a really useful technology), but to ensure that it is better understood. Again, if there have been results regarding these questions and I simply missed them, I apologize and hope that Steven or Charlie can enlighten us all. (I will read that work with great interest.)

torgo commented 3 years ago

Hi @csharrison @dvorak42 - we're picking this up again at our virtual f2f this week. It looks like this work is ongoing in WICG. Can you provide any further updates? Any response to @martinthomson's message above? Should we be re-reviewing? If so, can you let us know what's recently changed in your design?

dvorak42 commented 3 years ago

We're doing some work in the Privacy Pass IETF working group to try to more explicitly handle some of these issues (being more explicit about the boundaries/contexts operations are being done in, trying to pull in and articulate the centralization concerns to try mitigating them in the solutions/protocol changes).

Generally I agree; I think the API will need to have more explicit mitigations/safeguards around the use of issuance/redemption in different contexts/origins/etc. to protect against cross-site tracking/fingerprinting, rather than being reliant on having an understanding of the sort of information being embedded.

I can write up a doc gathering safeguards and boundaries included to try mitigating some of the cross-site tracking concerns to get a review over that model/framework and related concerns that have come up from the Privacy Pass side.

torgo commented 3 years ago

That sounds great! Let us know when that doc is ready and we can have a look at that point.

hadleybeeman commented 3 years ago

Hi @dvorak42 @csharrison! We're just checking in on this. Any progress on that safeguards and boundaries document? Or is there anything else we can do to be helpful here?

dvorak42 commented 3 years ago

Sorry, missed the message. Not a ton of progress yet. Some of the requisite framing has been merged into the Privacy Pass draft (https://github.com/ietf-wg-privacypass/base-drafts/blob/master/draft-ietf-privacypass-architecture.md#redemption-contexts) and we'll have a quick update with the WG at IETF next week for that; hopefully we can get the Trust Token side document out by mid-August. We're also currently trying to finish up another update to the explainer to help articulate some of the ecosystem/deployment shapes that have turned up; that should hopefully land in the next couple of weeks.

dvorak42 commented 2 years ago

As a quick update, we've landed the ecosystem/deployment document at https://github.com/WICG/trust-token-api/blob/main/DEPLOYMENTS.md. We expect to have the framework doc published in the next few weeks (it's taking a bit longer to get everything worked out), and will ping this thread once that's landed.

hober commented 2 years ago

Hi,

As we wait for the updated framework doc, I wanted to make sure we mentioned in this thread that (assuming all of the underlying issues can be worked through) we're interested in declarative integration of trust tokens into HTML forms, see #558 for more.

dvorak42 commented 2 years ago

We've finally landed the privacy framework document (https://github.com/WICG/trust-token-api/blob/main/PRIVACY_FRAMEWORK.md). There are a number of parameters that UAs will need to set based on their privacy model/principles; at some future point they could be tied into a site's privacy budget to allow for more issuances/redemptions to happen on a site if it's not using many other privacy-budget-impacting features.

I've opened up a couple of issues for additional ways to trigger the Trust Token API (form-based triggering could help deal with most of the requirements of issue #558). We're also looking at ways that this can be triggered through HTTP headers, potentially via HTTP Auth requirements for visiting a resource.
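As a rough illustration of the kind of UA-set parameters mentioned above, the sketch below tracks per-site limits on distinct issuers and on redemptions. The class, its field names, and the default values are all hypothetical and are not taken from the framework document; a real UA would choose its own parameters and reset policy.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class SiteTokenLimits:
    """Per-top-level-site limits a user agent might enforce (hypothetical)."""

    max_issuers: int = 2        # hypothetical default, chosen for illustration
    max_redemptions: int = 3    # hypothetical default, chosen for illustration
    issuers_seen: Set[str] = field(default_factory=set)
    redemptions_used: int = 0

    def allow_issuer(self, issuer: str) -> bool:
        """Admit an issuer if already associated with the site or if there is room."""
        if issuer in self.issuers_seen:
            return True
        if len(self.issuers_seen) >= self.max_issuers:
            return False
        self.issuers_seen.add(issuer)
        return True

    def allow_redemption(self, issuer: str) -> bool:
        """Permit a redemption only within both the issuer and redemption limits."""
        if not self.allow_issuer(issuer):
            return False
        if self.redemptions_used >= self.max_redemptions:
            return False
        self.redemptions_used += 1
        return True


limits = SiteTokenLimits(max_issuers=2, max_redemptions=2)
first = limits.allow_redemption("https://issuer-a.example")
second = limits.allow_redemption("https://issuer-b.example")
third = limits.allow_redemption("https://issuer-a.example")   # redemption limit hit
```

Tying this to a privacy budget would amount to raising or lowering `max_issuers` and `max_redemptions` depending on what else the site is doing.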

plinss commented 2 years ago

For tracking purposes: https://github.com/WICG/trust-token-api/issues/88 https://github.com/WICG/trust-token-api/issues/89

hober commented 2 years ago

Hi,

Mozilla's position on Privacy Pass says:

[W]e will defer making a firm position until the protocol and the novel cryptographic primitives it relies on have had more thorough security analysis.

Has there been review from independent cryptography experts? Could you point us to it, if so? If not, are you making any effort to get independent review of the cryptography?

torgo commented 2 years ago

@dvorak42 further question: Where does this work go after WICG?

dvorak42 commented 2 years ago

The crypto and protocols are in the process of getting standardized in the IETF and getting reviewed via CFRG (for the OPRF crypto primitives).

I'm not sure we know what the best home is post-WICG. Laterally, for ecosystem/broader discussion, the antifraud CG might be a good place for further discussions; for standardizing and moving down the standardization path, I'm not sure what the best home would be.

torgo commented 2 years ago

Ok thanks @dvorak42 – can you address the question raised by Tess regarding take-up and reception of the IETF specs (and the general issue of multi-stakeholder reception)? Also can I encourage you to have a discussion about where this will go after incubation? WebAppSec maybe? I'm just trying to get an idea.

dvorak42 commented 2 years ago

The CFRG spec is on track and seems to have positive support, though the exact parameterization and knobs that will end up getting standardized are still shifting. The PrivacyPass spec recently updated its charter timeline and focus on a specific instantiation of the protocol; we're hoping that the focused approach there might simplify the scope of the work to get more positive signals. On the browser side we've mostly seen experimentation and analysis happening in Chromium and Edge.

We're starting up the discussions and seeking advice, but yeah, given the nature of the API, WebAppSec seems like a potentially good home.

torgo commented 2 years ago

Hi @dvorak42, thanks for the chance to give this important work an early review. We're largely happy with the design and approach. We're still concerned about the multi-stakeholder issue and the dependency on PrivacyPass. We'd like the opportunity to review again when the spec is more concrete. Can you please either open a new issue, or ping us so we can re-open this one? In the meantime we're closing this.