credential-handler / credential-handler-polyfill

Credential Handler API polyfill
https://chapi.io
BSD 3-Clause "New" or "Revised" License

Allow installHandler to include a url to open after success #33

Open brianorwhatever opened 1 year ago

brianorwhatever commented 1 year ago

I'm not sure if this feature would be valuable to others, but here is the scenario I was looking at: a user wants to navigate from their wallet to an issuer's website that is linked within it. They do not yet have CHAPI installed, so there are two possible strategies for the wallet to allow this.

  1. The wallet has a button that installs the handler and then, after success, a second button to navigate to the issuer (non-ideal UX)
  2. The wallet has a button that installs the handler, which subsequently navigates to the issuer (doing this today as two separate function calls with a _blank navigation results in the browser blocking the page because it treats it as a popup)

I am hoping there is a way we could add an optional page to open after success to the installHandler method. I'm not 100% sure this would even solve the issue, but if it does, it would allow the smoother user experience of option 2.

dlongley commented 1 year ago

To consider this, we need to model (write down) how we would expect this to work for all major browsers and the different (1p vs. 3p) modes we have to support. It's important to understand what popup windows would be opened when and where "transient user activations" will be consumed. Browsers generally gate various APIs on such a user interaction and "consume" the activation for each API call, meaning you need one user activation per call.
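To make the "one user activation per call" point concrete, here is a minimal sketch of a simulated activation. It is a toy model, not a browser API: the class name `SimulatedActivation` and its methods are hypothetical, and real browsers also expire activations over time and differ in which calls consume them.

```typescript
// Simplified model of transient user activation; this is NOT a browser API.
// Assumption taken from the comment above: each gated call consumes the
// activation, so one user gesture pays for exactly one gated call.
class SimulatedActivation {
  private active = false;

  // A click or tap makes an activation available.
  userGesture(): void {
    this.active = true;
  }

  // A gated call (for example, opening a popup) succeeds only while an
  // activation is available, and uses it up.
  tryGatedCall(): boolean {
    if (!this.active) {
      return false;
    }
    this.active = false;
    return true;
  }
}

// One click followed by two gated calls in a row.
const activation = new SimulatedActivation();
activation.userGesture();
const firstCallAllowed = activation.tryGatedCall();  // activation is consumed here
const secondCallAllowed = activation.tryGatedCall(); // nothing left to consume
```

In this model the second call is refused, which mirrors the popup blocking described in the original request when a single click is followed by two gated operations.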

In 3p-mode browsers, calling installHandler will consume any user activation if a call to the Storage Access API is made. In the current implementation, I believe we've totally abandoned trying to use that API for a variety of reasons, but it's good to note this in case we have a need to use it in the future. Regardless, in this mode there is no additional popup window so another user activation would be required to open a new popup window if that's what's desired here.

In 1p-mode browsers, calling installHandler will, today, open an iframe that needs another user activation to open the popup window to show the mediator's UI in a 1p window so it can access the list of registered credential handlers. This extra call was needed for backwards compatibility and could perhaps be removed in the future (unknown). Once the 1p window is open, it is potentially a window that could be reused for various navigations away from the credential handler provider site (avoiding the need for another window and another user activation), but the current implementation may depend on detecting navigations away from the wallet / dealing with close events to ensure security or privacy features (or similar). I don't recall. So there may be some significant complexities or barriers to enabling this to work.

There are probably more details here to work through to determine the viability of this feature.

brianorwhatever commented 1 year ago

Ok, thanks for the great explanation. Can you clarify for me what 1st vs 3rd person mode means in this context? I think I've seen two different interaction patterns when I tested using an app's in-app browser; is that 3rd?

dlongley commented 1 year ago

So 1p means "first party" and 3p means "third party".

Here, first party refers to the origin that a user directly visits via the URL bar and third party refers to an origin the user may interact with via embeds (usually an iframe) in the first party website.

Browsers will implement or expose a number of features very differently depending on which of these contexts the user is interacting in.

For example, in order to prevent cross-site tracking, a number of browsers will "partition" local storage (including storage APIs for localStorage, IndexedDB, cookies, caching, etc.) in third party contexts.

That means that if you visit site A and interact with an iframe that loads site B, the Web application loaded from B will "see" a storage bucket for the combined sites A+B. You can think of this as a sort of double-keyed Map. If the application writes something to local storage in this scenario, like a cookie or using an API like localStorage, etc., then it will only be readable when the user is using an embed of B on site A.

If the user visits site B directly (e.g., by typing its origin into the URL bar), the written value will not be present when an attempt is made to read it. The same is true in reverse -- values written by a Web application on site B in a "first party context" will go into a different storage bucket and not be seen when visiting site B via an iframe on site A.

This approach prevents some unwanted tracking behavior by, for example, ensuring that if there's a tracker loaded into an iframe on site A (potentially even invisibly), it cannot create a value that can then be read back on some other site, let's say C, if it embeds an iframe for the same tracker origin. This is what it means to have "partitioned storage" -- every different scenario (site B on site A, site B directly, site B on site C, so on) gets a different storage bucket.
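The "double-keyed Map" idea can be sketched as a toy model. This is only an illustration of the bucketing described above, not how any browser implements storage; the class name `PartitionedStore` and its methods are hypothetical, and the site names A, B, and C follow the example in the text.

```typescript
// Toy model of partitioned storage: one bucket per (top-level site, embedded
// site) pair. This is an illustration, not a browser implementation.
class PartitionedStore {
  private buckets = new Map<string, Map<string, string>>();

  // When a site is visited directly, the top-level and embedded site match.
  private bucketFor(topLevelSite: string, embeddedSite: string): Map<string, string> {
    const bucketKey = JSON.stringify([topLevelSite, embeddedSite]);
    let bucket = this.buckets.get(bucketKey);
    if (bucket === undefined) {
      bucket = new Map<string, string>();
      this.buckets.set(bucketKey, bucket);
    }
    return bucket;
  }

  write(topLevelSite: string, embeddedSite: string, key: string, value: string): void {
    this.bucketFor(topLevelSite, embeddedSite).set(key, value);
  }

  read(topLevelSite: string, embeddedSite: string, key: string): string | undefined {
    return this.bucketFor(topLevelSite, embeddedSite).get(key);
  }
}

const partitionedStore = new PartitionedStore();
// Site B, embedded in an iframe on site A, writes a value.
partitionedStore.write("A", "B", "id", "123");

const seenOnA = partitionedStore.read("A", "B", "id");      // same bucket (B on A)
const seenDirectly = partitionedStore.read("B", "B", "id"); // B visited directly
const seenOnC = partitionedStore.read("C", "B", "id");      // B embedded on site C
```

Only the first read finds the value; the direct visit and the embed on site C each get their own empty bucket.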

Some browsers, like Brave, have partitioned storage like this -- and -- it's ephemeral. So it gets destroyed quite frequently, perhaps even when you just leave the site (this is up to the manufacturer's policy, which can change at any time). Other browsers persist partitioned storage, but perhaps only for a short period of time. Some browsers, like Chrome, do not have partitioned storage and use the same storage bucket everywhere. This can also be controlled by settings -- and the defaults for Chromium-based browsers may differ based on the OS you're using. Chrome (and Chromium generally) is working toward making partitioned storage the default everywhere, but can't do so yet without breaking a lot of valid use cases (i.e., not unwanted tracking) on the Web.

Storage Access API

Additionally, some browsers implement the "Storage Access API" -- to try and help enable use cases where a user actually wants some service to keep track of things for them. This API allows, so long as the user interacts with an iframe on a page, the 3rd party Web application running in that iframe to request to use the 1st party storage bucket. The user may or may not be prompted by a browser-rendered dialog when this happens ... depending on whether you're using, for example, Firefox or Safari. Whether or not they render that dialog can be based on whatever the browser manufacturers feel is the best policy, and it's open to change at any time.

Some browsers, like Safari, will require the user to have visited whatever site is rendered in an iframe in a first party context before using the iframe. It also requires that the Web application at that time have received transient user interaction coupled with writing a cookie to the 1p storage bucket -- before the Storage Access API can ever be called successfully in a 3p context (iframe) later. Additionally, the storage access and persistence is highly constrained, limiting the storage of cookies to only seven days, if they have been set by JavaScript as opposed to the server. This is problematic for applications (such as the CHAPI polyfill mediator!) that are only intended to do everything on the client. Furthermore, Safari does not provide shared storage access to anything other than cookies, so localStorage and IndexedDB are non-options using this API.
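The Safari constraints described above can be written down as a small checklist. This is a sketch of the conditions as stated in this comment only; the interface and function names are hypothetical, and actual browser policy may differ or change at any time.

```typescript
// Model of the conditions described above; not a browser API.
interface EmbeddedSiteHistory {
  visitedAsFirstParty: boolean;    // the user has opened the site in a 1p context
  interactedAsFirstParty: boolean; // the user interacted with it there
  wroteFirstPartyCookie: boolean;  // a cookie was written to the 1p storage bucket
}

// A storage access request from an iframe could only succeed if all of the
// prior 1p conditions hold and the user interacts with the iframe now.
function storageAccessCouldSucceed(
  history: EmbeddedSiteHistory,
  userInteractedWithIframe: boolean
): boolean {
  return (
    history.visitedAsFirstParty &&
    history.interactedAsFirstParty &&
    history.wroteFirstPartyCookie &&
    userInteractedWithIframe
  );
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Cookies set by JavaScript (rather than by the server) are kept for only
// seven days under the policy described above.
function scriptSetCookieStillPresent(setAtMs: number, nowMs: number): boolean {
  return nowMs - setAtMs < SEVEN_DAYS_MS;
}

// A user who has never visited the embedded site directly.
const neverVisited: EmbeddedSiteHistory = {
  visitedAsFirstParty: false,
  interactedAsFirstParty: false,
  wroteFirstPartyCookie: false,
};
const accessForNewUser = storageAccessCouldSucceed(neverVisited, true);
```

For a site that users never visit directly, the first check fails no matter how many times the user clicks inside the iframe.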

Some browsers, like Brave, have partitioned storage -- but do not implement the Storage Access API and currently say they do not plan to.

In other words -- all of this gets really complex :).

Something to note about using the Storage Access API in the context of CHAPI -- we need user activation for both the CHAPI mediator polyfill (to access the registered credential handlers) and then we'll almost always need it again at the credential handler (digital wallet) provider site -- so the digital wallet can access local storage. That is messy on its own, but asking the user to click even more times to open popup windows to visit the mediator site in a 1p context ... and then asking them to click again to interact with it to store a cookie is just not workable. This can even get worse if you just need to check whether you've got storage access (to the 1p bucket) in the first place.

So, we've found that for the use case we have with the credential handler mediator polyfill, the Safari implementation of the Storage Access API is essentially unworkable. The whole point of the mediator polyfill is to invisibly store the credential handler registrations that the user wants stored -- without putting some "special mediator storage site" in their face or requiring them to visit it on their own. We also don't want any cookies going anywhere if we can avoid it -- all storage should be local. Given this problem and other complexities we encountered when we tried to make the Storage Access API work in the past, we've opted to just use 1p windows on browsers with partitioned storage.

In short, that means creating popup windows so that the user is visiting a window in a 1p context (with the origin in the URL bar), as opposed to using an embedded iframe in the site they were on. You'll notice that if you use CHAPI in Chrome, you'll get a more integrated and better UX on desktop than if you use CHAPI in another browser -- because that other browser will generate popup windows to ensure that the same storage bucket (always the 1p one) is used. On mobile, the problem is less evident because popups fit more seamlessly into the UX on a small screen where only one window is ever shown.

Getting all of the above to work properly -- and switch properly -- across different browsers is quite challenging. Additionally, there are other complexities around message passing and performance that we have to deal with. This means that we actually always use iframes that minimally load the mediator site (authn.io) on relying party websites. This iframe is always present so when a call is made to CHAPI, we can quickly receive it and decide whether to open an additional 1p window (or not) to process the request. Otherwise, there would be a (user experienced) delay in loading the authn.io mediator site when the API was called.
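The routing described above (a preloaded iframe that receives the request, then an optional 1p window) can be sketched roughly as follows. All names here are hypothetical, and how the real polyfill detects partitioned storage is not described in this thread; the sketch simply takes that as an input.

```typescript
// Hypothetical names throughout; this is not the polyfill's actual code.
interface BrowserStorageProfile {
  partitionsThirdPartyStorage: boolean;
}

type MediatorAction =
  | "receive request in preloaded mediator iframe"
  | "show mediator UI in the iframe"
  | "open 1p popup window for mediator UI";

function planMediatorActions(profile: BrowserStorageProfile): MediatorAction[] {
  // The minimal iframe is always present, so the request is received there first.
  const actions: MediatorAction[] = ["receive request in preloaded mediator iframe"];
  // With partitioned storage, only a 1p window sees the 1p storage bucket.
  actions.push(
    profile.partitionsThirdPartyStorage
      ? "open 1p popup window for mediator UI"
      : "show mediator UI in the iframe"
  );
  return actions;
}

const unpartitionedPlan = planMediatorActions({ partitionsThirdPartyStorage: false });
const partitionedPlan = planMediatorActions({ partitionsThirdPartyStorage: true });
```

Both plans start in the iframe, which is why the iframe is loaded ahead of time; only the second step differs by browser.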

EDIT: I should note that some have said that using the terms 1p and 3p is problematic (and they are right) and more specific references should be made in specs (I agree), but they are so ingrained and simple to use when discussing common problems and how browsers work that it may be a while before some different or more precise language is adopted in conversation.

brianorwhatever commented 1 year ago

This is a very good explanation and I appreciate you spending the time to educate. It definitely shows how this gets complex when supporting multiple browsers while also creating a seamless user experience. Thanks!

johannhof commented 1 year ago

Hi @dlongley, thank you for writing up your challenges with the Storage Access API. Trying to summarize the main challenges:

I think it would be great if there could be support through FedCM as outlined in https://github.com/fedidcg/FedCM/issues/374 but I'd still be interested in these challenges with SAA (I think they also have implications on the utility for FedCM).

dlongley commented 1 year ago

@johannhof,

Thanks for your response!

  • Need user interaction twice (this seems like it should be possible because SAA should not consume the user interaction, unless you found a bug)

Well, when attempting to make this work with Safari, we end up needing more than just two user interactions. You may be referring to a specific API interaction, e.g., whether a call to SAA to check for current storage access followed by one to request it may not consume a user gesture / activation -- or something along those lines.

I don't know where the behavior is with the SAA these days, but to give some historical background, we've had the CHAPI polyfill for something like seven years now -- and have adapted it over time to whatever browser primitives offer the best UX. This includes implementing several different variants against the SAA over the past two years or so. I'm sure a number of things have changed with how user interactions are consumed (or are not) in that time period, but the totality of what the user must go through still certainly requires more than two on Safari. These challenges ultimately led us to dropping trying to use it. I detail more of that later.

  • Requires prior 1P user interaction (currently only on Safari). This part is slightly confusing to me as I would assume that the user should have previously registered their identity in some way in order for you to use it in an embedded context. Does the user never visit your site in a 1P context? Apologies for my lack of knowledge of CHAPI mechanics :)

TL;DR: Users directly interact with digital wallet websites, not the polyfilled "mediator website", authn.io, that needs to store the origins of registered digital wallet websites. This "mediator" component is supposed to function as a seamless "part of the browser", not as a website that users intentionally visit as a first party outside of any CHAPI-related flow. SAA wasn't really designed for the CHAPI mediator polyfill use case and it doesn't work well as a solution to it.


So, if you're still reading, here's the long version:

The CHAPI polyfill provides two components to enable the "Credential Handler API" to work properly.

The first is provided via a JavaScript polyfill library that other sites (both relying parties and credential handler providers, aka digital wallet providers) include in their Web applications. This library polyfills the navigator.credentials.get() and navigator.credentials.store() methods to allow for a new WebCredential type to be requested, i.e., a Credential that is provided by another Web application (a "digital wallet") instead of the browser directly.

When a call is made to get or store such a credential, a message is sent, via postMessage, to an iframe that loads code from the polyfilled "Credential Mediator", the second component. This component is intended to be part of the browser -- not having any "site" of its own, but providing local storage for the user's registered credential handlers, aka "digital wallets". To polyfill this component and ensure a user's registrations are available on any other site, however, we have to run code in a polyfill mediator website, authn.io.

The mediator is responsible for:

  1. Storing the registered credential handlers (i.e., digital wallet origins) that a user has previously registered, in local storage. This storage needs to be accessible in the mediator from any relying party website that makes a request.
  2. Showing the mediator UI to allow users to authorize a digital wallet to get / store credentials on their behalf. This UI is shown at registration time when a digital wallet website calls the registration API.
  3. Showing the mediator UI to allow users to select a digital wallet to fulfill the get / store request. This UI is shown when a relying party requests to get or store a credential. Relying parties may make suggestions of digital wallets for the user to use if they haven't pre-registered any, but it's also important that the user can "bring their own" via 2.
  4. Opening an iframe or window to the digital wallet website and passing a get or store request to it so that the digital wallet can render its own UI for the user to interact with. The user uses this digital-wallet-specific UI to complete the get or store request.
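The four responsibilities listed above can be sketched as a small in-memory model. This is an illustration only: `MediatorModel` and its methods are hypothetical names, there is no UI or real storage here, and the actual mediator code at authn.io is not shown in this thread.

```typescript
type CredentialOperation = "get" | "store";

interface WalletRequest {
  walletOrigin: string;
  operation: CredentialOperation;
}

// Hypothetical in-memory stand-in for the mediator; not the polyfill's code.
class MediatorModel {
  private registeredWallets = new Set<string>();

  // Responsibilities 1 and 2: record a wallet origin only if the user
  // authorized it in the mediator UI.
  register(walletOrigin: string, userAllowed: boolean): boolean {
    if (!userAllowed) {
      return false;
    }
    this.registeredWallets.add(walletOrigin);
    return true;
  }

  // Responsibility 3: the choices shown to the user are the registered
  // wallets plus any suggestions from the relying party.
  walletChoices(suggestedByRelyingParty: string[] = []): string[] {
    const choices = new Set<string>(this.registeredWallets);
    for (const suggestedOrigin of suggestedByRelyingParty) {
      choices.add(suggestedOrigin);
    }
    return Array.from(choices);
  }

  // Responsibility 4: build the request that would be passed to the chosen
  // wallet; returns null if the wallet is not among the available choices.
  buildWalletRequest(
    operation: CredentialOperation,
    chosenWallet: string,
    suggestedByRelyingParty: string[] = []
  ): WalletRequest | null {
    if (this.walletChoices(suggestedByRelyingParty).indexOf(chosenWallet) === -1) {
      return null;
    }
    return { walletOrigin: chosenWallet, operation };
  }
}

const mediatorModel = new MediatorModel();
mediatorModel.register("https://digitalwallet.example", true);
const walletRequest = mediatorModel.buildWalletRequest("get", "https://digitalwallet.example");
```

The key constraint from the list is that the registered set must be the same no matter which relying party site triggers the request, which is what the storage discussion above is about.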

Note that this "mediator" component is never meant to be a website that the user visits directly, it is polyfilling a missing piece of the browser. The whole point is for it to function as if it provides browser-native selection menus and local storage of the user's registered digital wallets. Also note that the user registers their digital wallets by visiting the digital wallet website(s) of their choice -- not by visiting the polyfill mediator "authn.io" website.

This means that, in order to use the SAA with Safari previously, the UX for registering a digital wallet was like this:

  1. The user visits https://digitalwallet.example. The website loads the polyfill, which includes creating an invisible iframe to the mediator site: authn.io, preparing it to receive any messages triggered from the polyfilled API.
  2. The user clicks "Register my wallet for use on other websites!". This click causes the website to call the registration API, which in turn sends a message to the authn.io iframe. The authn.io iframe renders a permission dialog that asks the user if they want to authorize https://digitalwallet.example to manage credentials.
  3. The user clicks "Allow". Now we have our first user interaction with authn.io. On Safari we try to get storage access, but we fail, because the user has never visited authn.io in a 1p context -- and, indeed, it never makes sense for them to do so anyway. At the time of implementation, this action also consumed the user activation.
  4. Next, we render a dialog letting the user know that we need to get permission to access their list of digital wallets so we can add a new one to it. I'm sure we've already lost most users at this point. But if they click to continue (another user interaction), we can open a popup window to authn.io in a 1p context.
  5. We show some more text and another authorize button in the 1p window. We need to get another user interaction here on authn.io directly in a 1p context -- and store a cookie in the 1p storage bucket. Note that we never wanted the user to be here in the first place, nor did we want to use cookies at all. In fact, this is less private than what we desired for them since we never wanted to send cookies to authn.io, but it's not terrible. So maybe some users still click through so we can get this interaction, store the cookie, and then close the popup window.
  6. Now we're back in the authn.io iframe where we wanted to be able to store the digital wallet registration. We have to get another user interaction to request storage access again. So, we get the user to click another authorize / allow button.
  7. Now the user is faced with a confusing prompt, directly from the browser, about allowing tracking by authn.io and so on. The user must click through this to enable storage access.
  8. Finally, we've stored the user's digital wallet registration, i.e., a URL that contains the origin to the digital wallet website. Of course, we have stored it as a cookie here in Safari, because we can't store it using IndexedDB or localStorage, which is unfortunate because we don't ever need this data to hit the server in the first place. But -- now we're hit with the seven day retention policy and then the user's digital wallet registration will be deleted. So we're back to having to make the server set the cookies if we want to avoid this.
  9. Done!
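To tally the user interactions in the steps above, the flow can be written down as data and counted. The step descriptions are paraphrased from the list, and the `userClick` flags are a reading of where the text says the user must click; the type and variable names are hypothetical.

```typescript
interface RegistrationStep {
  description: string;
  userClick: boolean; // true when the step requires the user to click something
}

// Paraphrase of the Safari + SAA registration flow listed above.
const safariSaaRegistration: RegistrationStep[] = [
  { description: "Wallet site loads the polyfill and an invisible authn.io iframe", userClick: false },
  { description: "User clicks the wallet's register button", userClick: true },
  { description: "User clicks Allow in the iframe permission dialog", userClick: true },
  { description: "User clicks to continue, which opens a 1p popup to authn.io", userClick: true },
  { description: "User clicks authorize in the 1p popup so a cookie can be stored", userClick: true },
  { description: "User clicks allow again in the iframe to request storage access", userClick: true },
  { description: "User clicks through the browser's own storage access prompt", userClick: true },
  { description: "Registration is stored as a cookie", userClick: false },
];

// Count how many of the steps need a click from the user.
const clicksRequired = safariSaaRegistration.filter((step) => step.userClick).length;
```

By this reading the flow needs six clicks to register one wallet, five of them after the user's initial click on the wallet site.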

So, that's just digital wallet registration on Safari using SAA. I'm not going to detail processing a get / store request, but suffice it to say, it's similarly bad UX -- and the digital wallet UI the mediator renders may then need to jump through similar hoops for the digital wallet site to get storage access if needed. But... once it's been done once, you can get a much nicer, seamless experience when doing get / store requests like you do today with Chrome. However, I suspect most users would never get that far before thinking someone was trying to scam them in some terrible way.

Now, alternatively -- we could just not use SAA and when the user needs to register a digital wallet, open a popup to authn.io directly and render the permission dialog there and use 1p storage. That's what we do today. You don't get a seamless experience (there are popup windows), but that's not so bad on mobile and it's still clearly better than some kind of "authorization wizard scavenger hunt".

Put simply, SAA was not designed for the use case we have. It was designed for embeds like Twitter, Facebook, and so on -- where users have some relationship with the embedded site. Users are not meant to have a relationship with a polyfill mediator component rendered by authn.io. So it just doesn't work right.

If you want to learn more about the CHAPI polyfill or see some animated GIFs of it in action, you can visit these sites:

https://chapi.io/
https://github.com/credential-handler/credential-handler-polyfill#features

johannhof commented 1 year ago

I see, thanks for detailing this, @dlongley! I agree that this isn't great for your use case and it doesn't seem like the web platform can really seamlessly support this kind of integration post 3rd party cookies right now. Maybe we can make progress in https://github.com/fedidcg/FedCM/issues/374 :)

I'll definitely keep this as a reference of how 1p relationship/interaction can be challenging for some use cases!

cc @cfredric @helenyc