w3ctag / design-reviews

TAG review for web app `scope_extensions` #875

Closed diekus closed 2 months ago

diekus commented 1 year ago

Hola TAG!

I'm requesting a TAG review of scope-extensions.

This document describes a new scope_extensions manifest member that enables web apps to extend their scope to other origins. This allows sites that control multiple subdomains and top level domains to behave as one contiguous web app and also enables web apps to capture user navigations to sites they are affiliated with.

We'd prefer the TAG provide feedback as:

💬 leave review feedback as a comment in this issue and @-notify [luhuangmsft and diekus]

alancutter commented 1 year ago

The link to the security review needs updating: https://github.com/WICG/manifest-incubations/blob/gh-pages/scope_extensions-security-privacy-questionnaire.md

torgo commented 1 year ago

Hi @diekus – Thanks for sending us this. Briefly, we're concerned about the way that this proposal changes the same-origin model, which is a fundamental part of the security apparatus of the web. Hence we think we need to tread very carefully. We think the explainer should be very explicit about what the expanded scope does and does not allow access to. We'd also like to see some specific use cases and discussion of abuse cases (and how those abuse cases are mitigated). E.g. if you are tricked into visiting or downloading a malicious app that is spoofing your bank, and it includes your bank's origin in its scope_extensions field, are there additional exploits that the malicious party could use (e.g. obtaining credentials or capturing links)? Are there any implications for access to local storage from different origins?

LuHuangMSFT commented 1 year ago

@torgo Discussed with @diekus and below are our thoughts. I'll organize the important parts and add to the explainer. Thanks for your feedback.

To prevent spoofing attacks, the implementation in Chromium will flash the web origin of the content in the window title bar after every top level navigation. The origin information will also be visible in the app's main menu.

Regarding the case where the user is tricked into visiting or installing a malicious app that is spoofing their bank:

To use scope_extensions, the owner of the app should either also directly own/control the listed origins in scope_extensions or monitor them closely if working by agreement with parties that own them.

Browser security tools such as Microsoft Defender SmartScreen should still identify unsafe origins that are navigated to from the app window.

ylafon commented 1 year ago

Hi, we discussed the issue during our breakout today. Could there be cryptographic proof that the added origins are agreeing to being embedded in that web app? Pretty much like Universal Links or App Links?

LuHuangMSFT commented 1 year ago

The apple-app-site-association file used by Universal Links references apps by an appID string of the format <Application Identifier Prefix>.<Bundle Identifier>. [1] I don't see usage of a cryptographic hash. Please correct me here if I'm missing something.

assetlinks.json, used by Android App Links, refers to apps by an app id and SHA256 fingerprints of the app's signing certificate. [2]

Use of a unique app id [3] should be sufficient evidence that the added origins are agreeing to being embedded in that uniquely identified web app. In the scenario where the app is signed or delivered as an immutable package, use of a cryptographic hash would be useful to further specify that the association is only valid when the app is unchanged. Being able to specify that the app remain unchanged doesn't seem like a useful feature for web apps with frequently changing content served through the web.

One scenario we should consider: if the web app is taken over by another party which does not have access to the original signing certificate, they would be unable to modify the app and produce cryptographic evidence matching the original - thus the origin association would become invalid.

The dominant method of delivery of web apps is over the web and managed by a browser without signing or packaging/bundling. Referencing web apps by unique app id is an acceptable solution that doesn't significantly complicate the steps developers need to take to set up the association.

To mitigate app takeover issues (where app ownership changes), we recommend that the web app and associated origins are owned and controlled by the same entity. Failing that, both the app and associated origins are advised to monitor ownership and condition of their counterparty.

[1] https://developer.apple.com/documentation/xcode/supporting-associated-domains [2] https://developer.android.com/training/app-links/verify-android-applinks#web-assoc [3] https://github.com/philloooo/pwa-unique-id/blob/main/explainer.md#requirements

alancutter commented 1 year ago

Could there be cryptographic proof that the added origins are agreeing to being embedded in that web app.

Is it sufficient that the added origin explicitly grants the web app permission in a file hosted on the origin served over HTTPS?

ylafon commented 1 year ago

In the case of apps, the verification of app links is done by the store owner; in the case of a web app, you need to have verification done in another way. This is similar to the origin issue in MiniApps. Having something hosted on the origin server might work, but it would require network access.

To mitigate app takeover issues (where app ownership changes), we recommend that the web app and associated origins are owned and controlled by the same entity.

Controlled in legal terms (in which jurisdiction?) or in technical terms (as in same hosting)?

LuHuangMSFT commented 1 year ago

Controlled in practical terms - as it requires some level of control over the domain to be able to add /.well-known/ configuration. What's an appropriate way to phrase that in spec language?

ylafon commented 12 months ago

Well, the site has to agree that a web app can impersonate its content, so it is more than putting one file in place and being done (as other web apps could rely on its presence and do the same). There needs to be something along the lines of what is done in the ACME protocol, or else, in the case of offline web apps, a way to check this offline.

alancutter commented 11 months ago

as other web apps could rely on its presence

The site identifies a particular web app to associate with via the web app's manifest id, other web apps will be excluded by this.

in the case of offline web apps, a way to check this offline.

Does "offline" mean offline at all stages even during installation? A fully offline web app should probably be built as an IWA rather than use any HTTPS.

torgo commented 11 months ago

Hi @LuHuangMSFT can you update us and let us know your thoughts on @ylafon's question above?

Also – I don't think you want to draw a dependency to IWA from this proposal - can you confirm? Regardless of anything else it feels like this should be operating in secure contexts.

LuHuangMSFT commented 9 months ago

@ylafon The site identifies the web app by its unique id (*) in the site's .well-known/web-app-origin-association file. Other web apps cannot take advantage of this association, as they have different unique ids.

Validation of associations should take place after the installed app is first launched. Validation should not succeed if the app is never online. If validation succeeds and the app afterwards remains offline, the associations will remain valid. Periodic revalidation takes place if the app is online. This is similar to how comparable schemes are implemented for Windows, Android, and iOS.
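The validation lifecycle described above can be sketched as a small state transition. This is only an illustration: the state names, the `afterCheck` helper, and the rule that a failed online revalidation invalidates the association are assumptions, not something the explainer or Chromium specifies here.

```typescript
// Hypothetical states for one app-to-origin association.
type AssociationState = "unvalidated" | "valid" | "invalid";

// Returns the state after one validation opportunity.
// Assumption: when the app is online, the outcome of fetching the
// association file replaces the previous state; when it is offline,
// nothing is checked and the previous state is kept.
function afterCheck(
  current: AssociationState,
  online: boolean,
  fileListsApp: boolean,
): AssociationState {
  if (!online) {
    return current; // never-online apps stay unvalidated; valid ones stay valid
  }
  return fileListsApp ? "valid" : "invalid";
}
```

Under these assumptions an app that has never been online stays "unvalidated", and an app that validated once keeps its associations while offline.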

@torgo No dependency on IWA. I agree with @alancutter 's comment - i.e. there is no support here for fully offline apps that have no opportunity to do online validation with other sites.

(*) - https://github.com/philloooo/pwa-unique-id/blob/main/explainer.md

cynthia commented 9 months ago

Thanks for the response - who/what guarantees the uniqueness of the identifier? Without a single source of truth there isn't a way to guarantee uniqueness, and without that guarantee, it seems like we might have issues, such as enabling spoofing. Are we missing something?

LuHuangMSFT commented 9 months ago

Is the concern that https://siteb.com can spoof https://sitea.com by using the same app id? App IDs as described here and implemented in Chromium on desktop platforms are tied to app origin.

For an app from https://site-a.com to be installed with the same App ID as https://site-b.com, it would need to appear to the UA to have originated from https://site-b.com. Another possibility is that https://site-a.com changes ownership without the participating content origin's owner noticing.

I think it's possible for 2 separate apps from the same origin to claim the same app ID (start_url origin + specified ID) but we can understand this to mean there is only 1 app. Only one should be installable at a time.
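A minimal sketch of why an app ID is tied to the app origin, assuming the ID is the manifest `id` resolved against the origin of `start_url`. The helper name `computeAppId` and the fallback behaviour are illustrative assumptions, not Chromium's actual code.

```typescript
// Hypothetical helper: derive an app ID from the manifest's start_url and id.
// Assumption: the ID is `id` resolved against the origin of `start_url`, and
// an `id` that resolves to a different origin is ignored.
function computeAppId(startUrl: string, id?: string): string {
  const start = new URL(startUrl);
  if (id === undefined) {
    return start.href; // no id given: fall back to start_url
  }
  const resolved = new URL(id, start.origin);
  // A manifest cannot claim an ID that lives on another origin.
  return resolved.origin === start.origin ? resolved.href : start.href;
}
```

With this rule, a manifest served from https://site-a.com cannot produce an ID on https://site-b.com, while two manifests on the same origin that specify the same `id` end up with the same app ID.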

torgo commented 8 months ago

Hi @LuHuangMSFT - just coming back to this now. I think the risk is not necessarily that https://siteb.com can spoof https://sitea.com, but rather that, since https://sitea.com can serve content from https://siteb.com as if it comes from https://sitea.com, content from https://siteb.com can completely hijack https://sitea.com without the user's knowledge. If that understanding is correct, then that breaks the security guarantee for web content. The proposal would need to mitigate this risk in some way - for example, scoping this very tightly so that it cannot be abused in this way. We also think the spec needs a privacy review and we would suggest that you request that separately. Also, as Sangwhan mentioned, we remain concerned about the uniqueness of the identifier.

LuHuangMSFT commented 7 months ago

since https://sitea.com/ can serve content from https://siteb.com/ as if it comes from https://sitea.com/ ...

I want to push back against saying that https://sitea.com/ would serve content from https://siteb.com. I would describe it as: both https://sitea.com/ and https://siteb.com/ can be displayed in an app window with the same app window treatment [1]. Either site would have its web contents displayed at the top level. Neither is enclosed in a frame or webview.

[1] In our implementation in Chromium, both https://sitea.com/ and https://siteb.com/ have their origin briefly displayed in the title bar. Also, origin information is clearly displayed in the window options menu. Do these UI treatments sufficiently mitigate the issue that users may be unaware of where the currently viewed content comes from?

There is a question of whether to users, the installed web app (which includes the recognizable app window, taskbar pins, shortcuts, and other assets) is equivalent to the site from which it was installed. If the sitea app is distinct from https://sitea.com, it wouldn't necessarily seem strange to also sometimes display content from https://siteb.com (provided users are sufficiently informed of the origin transition during navigations.)

In the same example, https://sitea.com explicitly says in its web app manifest that when installed, it allows the app window to host content from https://siteb.com. This is also an explicit endorsement of the content of https://siteb.com/ to the user.

What does "hijack" mean in this context? How is the security guarantee for web content broken?

dmurph commented 7 months ago

Voicing support for scope_extensions here from Google. This is a very important use-case for web developers trying to enable app experiences on the web platform, and comes from how organizations often don't use just a single origin for their product.

Feedback from @torgo: I believe this is already scoped very tightly:

Illustrative example:

Feedback for @cynthia:

Hopefully those constraints are enough to make this reasonable?

More abstractly - the feature is communicating & verifying 'what other origins are part of this app', so the user agent can work for the user accordingly.

torgo commented 7 months ago

Hi folks - I think we can best progress this review with a call where we can talk through the proposed use cases a bit more interactively. Given that we have the W3C Advisory Committee meeting coming up, I'd like to suggest that we hold this call in one of our TAG breakouts the week of the 22nd of April. I'll follow up by email to arrange.

hober commented 7 months ago

  • This only affects logic that asks if a given url is "in-scope" of an app. This is done currently in one place - when we choose whether or not to show the toolbar in web apps showing an 'out of scope' origin. Please see how this works with this site - if you install this site in Chrome or Edge, clicking on either of those links will demonstrate this toolbar. With scope extensions, the 'secondary' link will no longer show that toolbar.

How relevant is this feature outside of the Chromium project? When I add the example site to the Dock in Safari on macOS, I don't see whatever toolbar you're talking about.

LuHuangMSFT commented 7 months ago

Thank you to all who gave feedback at the breakout session.

I'm pleased we were able to address concerns about storage/cookies/permissions/preferences remaining separate by origin. We can and will call that out more clearly in the explainer.

We will also look into whether the information conveyed through the existing manifest field + web app origin association file can be expressed using feature-restricted URLPattern.

We will do some research on CORS extensions and reply with pros and cons of using this instead of web app origin association. Associated Website Sets is unsuitable as it operates on full domains instead of origins and does not have the primitives to select between different web app IDs on the same domain for association.

I do support the high-level goal of not introducing multiple ways of configuring similar things to the web platform. I'm ready to consider and do due diligence on any alternative association mechanisms that fulfil our requirements for web apps. I want to reiterate that it seems reasonable to me for a web-first app-to-website association format to exist (similar to windows-web-app-link, assetlinks.json, etc.) as app-to-website association has some different requirements from website-to-website association.

alancutter commented 7 months ago

How relevant is this feature outside of the Chromium project? When I add the example site to the Dock in Safari on macOS, I don't see whatever toolbar you're talking about.

UI when navigating out of scope is mentioned in the manifest spec: https://www.w3.org/TR/appmanifest/#nav-scope

If the application context's active document's URL is not within scope of the application context's processed manifest, the user agent SHOULD show a prominent UI element indicating the URL or at least its origin, including whether it is served over a secure connection. This UI SHOULD differ from any UI used when the URL is within scope of the application context's processed manifest, in order to make it obvious that the user is navigating off scope.

tomayac commented 7 months ago

How relevant is this feature outside of the Chromium project? When I add the example site to the Dock in Safari on macOS, I don't see whatever toolbar you're talking about.

macOS Safari just opens out-of-scope links in the default browser (nice touch!). It's relevant on iOS and iPadOS, where you do see an in-app browser with the Done button in the upper left corner.

torgo commented 6 months ago

Thanks @LuHuangMSFT - let us know when you have updated the explainer and we'll re-review. Much appreciated. I'd like to encourage you to include more info on abuse cases - and mitigations against these. Can you also tighten up the scope to the problem you're trying to solve and explicitly exclude things like permissions, local storage sharing, etc... Can you also please add an "alternatives considered" section of the explainer with some of the alternatives that we discussed in the call?

diekus commented 5 months ago

Hola TAG,

We've updated the explainer to provide a better sense of the feature we are developing:

We hope this addresses the concerns from the previous review and thank you for all the help getting to this current revision.

cc @dmurph @LuHuangMSFT

martinthomson commented 4 months ago

Those people who have been involved asked me to thank you for being responsive to feedback. This is very much appreciated.

Some questions from just me, from looking at this with somewhat fresh eyes:

Scopes are a path prefix on the current origin. This adds two types of scope extension: origin and registrable domain, but it is not clear to me what the concrete test is. Presumably, when a navigation link is presented in the "app", the container needs to make a call about whether this is an "in-app" navigation or it is "leaving the app" (which might result in a warning or opening the URL in the "main browser"). Currently, the test is url.startsWith(origin + scope). This extends that to include some number of additional tests, but these are not immediately obvious.

origin: is this the ENTIRE origin? If you imagine an external service that provides a limited service, it might be preferable to specify a limited scope on that origin. (Obviously, you can't just use the scope specified in the manifest, because it won't make sense on a different origin.)

registrable domain: I think that you want "site" here and want to use a same-site test for testing. This creates a broader match set that needs to be authorized on a per-origin basis, which is fine, but it has the same problem regarding scope on a per-origin basis. (Naming this "site" might be a better naming choice, with the value being a host rather than what appears to be an origin in the explainer. The scheme can be implicitly HTTPS.)

Your future extensions mentions:

More specific scoping e.g. scope suffix or include/exclude lists or URL patterns.

We tend to think that this is necessary - in two directions. That is, both the app and the new origin should be able to specify which resources are included, with the end result being the intersection of those sets (each could leave it unspecified if that is their choice).

@ylafon suggests URL patterns for this rather than path prefixes, though I'm personally less firm about that, because prefixes are easier to understand. The manifest has a path prefix in scope, which makes sense there, mostly because we didn't have URL patterns at the time. A URL pattern is pretty reasonable if you think it is adequately comprehensible.

Also, we observe that a single origin can include multiple apps with different manifests, but the authorization from the new site does not -- and cannot -- identify a single app. It's reasonable to say "I authorize any app", but it might be better to have the authorization be scoped to specific apps, which might have different authorized scopes.

That might lead to an authorization like this, for a hypothetical payment provider:

{
  "web_apps": [
    {
      "site": "foo.example",
      "scope": "/payments/*"
    },
    {
      "origin": "https://bar.example.net",
      "scope": "/payments/*"
    },
    {
      "manifest": "https://example.com/app/manifest.json",
      "scope": "/payments/*"
    }
  ]
}

Another nit, but the whole { type: "foo", value: "bar" } construction is a bit redundant, how about making this simpler?

{
  "scope_extensions": [
    { "site": "example.com", "scope": "/*" },
    { "site": "foo.example", "scope": "/that_app/*" },
    { "origin": "https://bar.example.net", "scope": "/that_app/*" }
  ]
}

LuHuangMSFT commented 4 months ago

Thank you for the fresh review.

Current test: url.startsWith(origin + scope)

New test: url.startsWith(origin + scope) OR url.startsWith(any origin in scope_extensions) OR registrableDomain(url) == registrable domain in scope_extensions
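As a rough sketch, the extended test could look like the following. The `ScopeExtension` shape follows the explainer's type/value style, but `isInScope`, `registrableDomain`, and the tiny `KNOWN_SUFFIXES` list are assumptions made for illustration; a real user agent would consult the Public Suffix List. The sketch also compares origins for equality rather than by string prefix, so that https://example.com.evil.test is not treated as part of https://example.com.

```typescript
// Hypothetical entry shapes, following the explainer's {type, value} style.
type ScopeExtension =
  | { type: "origin"; value: string }
  | { type: "site"; value: string };

// Assumption: stand-in for a Public Suffix List lookup. It only knows the
// suffixes listed here and is NOT a correct registrable-domain algorithm.
const KNOWN_SUFFIXES = ["co.uk", "com", "net"];

function registrableDomain(host: string): string {
  const suffix: string | undefined = KNOWN_SUFFIXES
    .filter((s) => host.endsWith("." + s))
    .sort((a, b) => b.length - a.length)[0]; // longest matching suffix wins
  if (suffix === undefined) {
    return host;
  }
  const labels = host.slice(0, -(suffix.length + 1)).split(".");
  const last = labels[labels.length - 1] ?? "";
  return `${last}.${suffix}`;
}

// appScope is the full scope URL, i.e. origin + scope from the manifest.
function isInScope(url: string, appScope: string, extensions: ScopeExtension[]): boolean {
  const target = new URL(url);
  if (target.href.startsWith(appScope)) {
    return true; // the existing test
  }
  return extensions.some((ext) =>
    ext.type === "origin"
      ? target.origin === new URL(ext.value).origin
      : registrableDomain(target.hostname) === registrableDomain(new URL(ext.value).hostname),
  );
}
```

Validation against the origin association file is deliberately left out; the sketch only covers the URL comparison.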

origin: This is the entire origin where there is no additional scoping filter.

registrable domain: replacing this with "site" and using the same-site test seems like a viable option. We still want to allow the developer to be able to provide a single origin association file at the manifest-provided site to validate the scope extension. @dmurph what do you think?

Scope filtering

Allowing one or both sides (app manifest and origin/site) to filter the scope using one or more kinds of filtering syntax is a good idea. "...each could leave it unspecified if that is their choice" is what we want to start with. This is restrictive for app devs initially but we believe it will unblock many developer scenarios and allow us to better understand developer needs and design requirements for filtering.

I think scope suffix in the origin association file is the most suitable first filtering syntax to implement as developers are already familiar with how this works and it is a fast test to perform. What do you think about specifying this even if it is not implemented initially, or would it be better to leave the specification to future work?

There is a shared problem with a.) filtering from both sides and calculating the intersection and b.) using URLPattern: performance when matching URLs and the ability to convert the filter to OS-specific filtering syntax. The latter is necessary for OS integration features like URL handling where the OS performs the URL filtering. Both a.) and b.) could lead to slower filtering due to their complexity. Both a.) and b.) could be difficult to convert to OS-specific filtering syntax.

I still think specifying no filtering syntax or just suffix scope match initially is the best way forward until we do the work to study the performance and OS compatibility of filtering syntax options (URLPattern, etc).

Validation from origin/site configuration

The .well-known/web-app-origin-association file hosted by the participating origin identifies individual apps by their app id (web_app_identity in the explainer example):

{
  "web_apps": [{
    "web_app_identity": "https://example.com/"
  }, {
    "web_app_identity": "https://associated.site.com/"
  }]
}

Each object can be extended to apply different scope filtering for each app independently.

Type: Value

This isn't strictly necessary but makes parsing slightly more convenient as type is a required string field and the whole object can be skipped if type is not found or it is not a recognized type in this UA.

dmurph commented 4 months ago

Hello!

A few thoughts:

@martinthomson said:

origin: is this the ENTIRE origin? If you imagine an external service that provides a limited service, it might be preferable to specify a limited scope on that origin. (Obviously, you can't just use the scope specified in the manifest, because it won't make sense on a different origin.)

(later)

Another nit, but the whole { type: "foo", value: "bar" } construction is a bit redundant, how about making this simpler?

AFAIK this has been the only request from our partners. I think it would be reasonable to add a 'scope' parameter to the 'origin' type, or a new type. I worry about using a "bag of parameters" here as it can lead to complications if more options are added & trying to figure out how they might combine. (AKA - what if you specify both origin & site? we would have to specify what this behavior means, and future options increase the cross-product complexity). It's clearer to have this format:

"scope_extensions": [
    { "type": "site", "value": "https://example.co.uk" },
    { "type": "origin", "value": "https://helpcenter.example-help-center.com" },
    { "type": "url_prefix", "value": "https://my_github_project.github.io/production/" }
]

(or - add a 'path_prefix' member for 'origin' types). Feel free to bikeshed "url_prefix". But this has not been a request from developers. If you feel strongly about needing this, that type is not complex to implement. Lu also mentioned that this format is better for cross-user-agent capabilities if more is needed - the functionality is based on the type, not on arbitrary options.

@martinthomson said:

registrable domain: I think that you want "site" here and want to use a same-site test for testing. This creates a broader match set that needs to be authorized on a per-origin basis, which is fine, but it has the same problem regarding scope on a per-origin basis. (Naming this "site" might be a better naming choice, with the value being a host rather than what appears to be an origin in the explainer. The scheme can be implicitly HTTPS.)

This sounds good to me - the important part is that we use the public suffixes list, which that algorithm uses through obtaining the site.

@martinthomson

We tend to think that this is necessary - in two directions. That is, both the app and the new origin should be able to specify which resources are included, with the end result being the intersection of those sets (each could leave it unspecified if that is their choice).

As far as I know, this has definitely not been a request from partners. I worry this isn't necessary and increases the complexity of this feature. It also seems not necessary - the web-app-origin-association file is used to convey ownership / two way handshake of the site. With this established, the manifest is trusted to specify what it needs.

@martinthomson said:

Also, we observe that a single origin can include multiple apps with different manifests, but the authorization from the new site does not -- and cannot -- identify a single app. It's reasonable to say "I authorize any app", but it might be better to have the authorization be scoped to specific apps, which might have different authorized scopes.

As Lu pointed out above, this is not the case. The web app identities are listed explicitly in the web-app-origin-association file.

martinthomson commented 4 months ago

I worry about using a "bag of parameters" here as it can lead to complications if more options are added & trying to figure out how they might combine.

That's easy. They all apply. If you have { origin: "https://example.com", site: "example.co.uk" }, then it doesn't match. But it leaves you the option of saying { origin: "https://example.com", scope: "/payments" } or similar.
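To make "they all apply" concrete, here is a minimal sketch in which every field present in an entry has to hold for a URL to match. The `ExtensionEntry` shape and `entryMatches` helper are hypothetical, `scope` is treated as a plain path prefix, and the same-site check is a naive host/subdomain comparison standing in for a real Public Suffix List lookup.

```typescript
// Hypothetical "bag of parameters" entry; every field is optional.
interface ExtensionEntry {
  origin?: string;
  site?: string;
  scope?: string;
}

// Every field that is present must hold; a missing field imposes nothing.
function entryMatches(entry: ExtensionEntry, url: string): boolean {
  const target = new URL(url);
  // An entry that names neither an origin nor a site matches nothing.
  if (entry.origin === undefined && entry.site === undefined) {
    return false;
  }
  if (entry.origin !== undefined && target.origin !== entry.origin) {
    return false;
  }
  // Assumption: naive same-site test (exact host or a subdomain of it).
  if (
    entry.site !== undefined &&
    target.hostname !== entry.site &&
    !target.hostname.endsWith("." + entry.site)
  ) {
    return false;
  }
  // Assumption: scope is a simple path prefix.
  if (entry.scope !== undefined && !target.pathname.startsWith(entry.scope)) {
    return false;
  }
  return true;
}
```

Under this reading, an entry with both an origin of https://example.com and a site of example.co.uk can never match, because no URL satisfies both conditions.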

this has definitely not been a request from partners

Your partners are not the only stakeholders here. Or maybe they didn't think that this was a problem for them.

The concern here is that a manifest on a different site could cause content from a completely different site to be included in that app. If a service accepts being part of an app, it loses granular control over which resources are included in that way. Think about X-Frame-Options or CSP's frame-ancestors rules. These can prevent content from being included as part of someone else's site in frames. Those can be specified on a per-resource basis, so that you can protect specific resources as necessary. Here, the choice is origin-wide, removing that option. I have no doubt that providing a unique origin for each partner is an option that some services will choose to exercise, but forcing that choice seems unnecessary when the fix is so trivial.

LuHuangMSFT commented 4 months ago

I'm flexible on:

martinthomson commented 4 months ago

Starting with a scope suffix per site/origin object (in the web-app-origin-association file) if this is sufficient to address concerns. Applying a scope to multiple origins that pass the same-site test places restrictions on the developer as the scope needs to make sense for all origins that pass the same-site test with their desired site.

I wasn't suggesting that the app manifest specify a scope. That is, you would still just say {"site": "service.provider.example"} in the manifest of the app that is trying to extend the app scope to the service provider origin. As you say, if you said {"site": "service.provider.example", "scope": "/foo"}, you are forcing multiple origins to all have the same structure. Then, though "app1.service.provider.example" might have "https://app1.service.provider.example/foo", "app2.service.provider.example" either has to accept the same conditions on the use of that scope or not use "/foo", which is bad either way.

But that is the app provider imposing its will on the service providers, which is basically an RFC 8820 violation. You want the service provider to speak for itself.

Instead, the opt-in from the service provider would list the apps that are authorized for use, plus a scope. That is naturally origin-scoped anyway. {"web_apps": [{"web_app_identity": "https://example.com/", "scope": "/foo"}]}, coming from "https://app1.service.provider.example/" would have the desired effect. And then "app1.service.provider.example" can make its own choice about what to include (or not), which will be nothing by default.

LuHuangMSFT commented 4 months ago

Instead, the opt-in from the service provider would list the apps that are authorized for use, plus a scope. That is naturally origin-scoped anyway. {"web_apps": [{"web_app_identity": "https://example.com/", "scope": "/foo"}]}, coming from "https://app1.service.provider.example/" would have the desired effect. And then "app1.service.provider.example" can make its own choice about what to include (or not), which will be nothing by default.

I think we're in agreement. The .well-known/web-app-origin-association file hosted by the origin/site should look like:

{"web_apps": [{"web_app_identity": "https://example.com/", "scope": "/foo"}]}

with scope being optional.
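A minimal sketch of how a user agent might evaluate such a file. It assumes `scope` is a path prefix on the hosting origin and that an entry without `scope` covers the whole origin; the `allowsUrl` helper and the interface names are illustrative only. It follows the per-origin reading, where the file speaks only for the origin that hosts it, and does not attempt the same-site case.

```typescript
// Shape of the .well-known/web-app-origin-association file discussed above.
interface WebAppEntry {
  web_app_identity: string;
  scope?: string; // optional; assumed to be a path prefix
}

interface OriginAssociation {
  web_apps: WebAppEntry[];
}

// Hypothetical check: does the file hosted on `hostingOrigin` allow `url`
// to be treated as part of the app identified by `appId`?
function allowsUrl(
  association: OriginAssociation,
  hostingOrigin: string,
  appId: string,
  url: string,
): boolean {
  const target = new URL(url);
  if (target.origin !== hostingOrigin) {
    return false; // the file only speaks for the origin that hosts it
  }
  // Apps that are not listed get nothing by default.
  return association.web_apps.some(
    (app) =>
      app.web_app_identity === appId &&
      (app.scope === undefined || target.pathname.startsWith(app.scope)),
  );
}
```

For the file above, hosted on https://app1.service.provider.example, URLs under /foo would be accepted for the app https://example.com/ and everything else rejected.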

Above, I wrote:

We still want to allow the developer to be able to provide a single origin association file at the manifest-provided site to validate the scope extension.

I was trying to point out that if the developer uses a single web-app-origin-association file for multiple origins that pass a same-site test, then any scope it specifies applies to all origins that pass the same-site test.

martinthomson commented 4 months ago

I was trying to point out that if the developer uses a single web-app-origin-association file for multiple origins that pass a same-site test, then any scope it specifies applies to all origins that pass the same-site test.

Just because this is a little murky still. My point is that only an origin can speak for itself in this regard. That is, "https://a.example.com/" (an origin) can't speak for "https://b.example.com/" (a different origin), even if they are the same site.

Of course, if you have operational practices that mean you put the same file at the well-known location on both of those origins, that's fine, but that's not our business.

LuHuangMSFT commented 4 months ago

In the scenario I'm trying to address, the file is at https://example.com/.well-known/web-app-origin-association and it would get applied for both https://a.example.com and https://b.example.com.

martinthomson commented 4 months ago

OK, I don't think that is a good goal. Every origin has a different set of resources. Having one origin able to speak for others, even if it is same-site with those origins, breaks the fundamental scoping properties that underpin origins.

Yes, we have a privacy boundary at the site level and a bunch of stuff sort of expands to fill that scope by virtue of having some interaction with cookies. This does not.

dmurph commented 4 months ago

I agree that it's not great, but this is unfortunately a requirement due to the existing way the web is set up. The Zoom use-case is the best example here, but other companies are set up similarly:

So unfortunately allowing an organization / entity to say "any urls in this site can be considered part of my app" is a required use-case here that we can't remove :(

This is one of the reasons the feature is very tightly scoped to just the 'scope' evaluation of a web app (e.g. what pages can be considered as part of this app, owned by the same company / entity). This matches what 'same-site' means, including the registrable domains check. In the last TAG meeting this was actually brought up as a potential issue: being too tightly scoped. However, the same-site check is one of the reasons the scope was kept this tight; no other feature should be able to use this.

So then there are two questions at the end of this:

LuHuangMSFT commented 4 months ago

Another reason not to require each origin that passes the same-site check to host a copy of the validation file is that the UA can only validate this information at navigation time and not when the web app is installed.

At the time the web app is installed, the UA needs to be able to validate the web-app-origin-association file from a finite list of origins in order to support behaviors such as:

reillyeon commented 3 months ago

@martinthomson, I'm just a bystander on this proposal but reading through your comments it sounds like you are generally comfortable with the idea of an origin opting-in to a scope extension from a cross-origin app but are specifically concerned with the ability for a site (registrable domain) to do so on behalf of the origins under it. This makes sense to me.

I am curious what you think of using a new pair of HTTP headers (e.g. App-Id and Allow-App-Scope-Extension) in a similar way to the Origin and Access-Control-Allow-Origin headers we have for CORS. This would enable browsers to "trust but verify" on a site-level scope extension by checking that the server for the particular origin being navigated to agrees it wants a resource to be considered part of the app. Compared with requiring each origin to host a copy of the .well-known/web-app-origin-association file, this has the advantage of providing the information in-line with the request that is already happening, so it is easier for browsers to implement and doesn't introduce additional navigation latency.

There are some details to work out but I think this would maintain the ability for the browser to do the necessary install-time setup for scope extensions while keeping ultimate control over how their resources are presented in the hands of the origin.
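A minimal sketch of the server side of this idea, assuming the header names suggested above (App-Id and Allow-App-Scope-Extension). Echoing the app id back in the response mirrors how CORS servers often echo the requesting origin; that choice, the `ALLOWED_APPS` set, and the function name are all assumptions for illustration, not part of any spec.

```python
# Hypothetical list of app ids this origin agrees to be part of.
ALLOWED_APPS = {"https://myapp.com/index.html"}

def scope_extension_response_headers(request_headers: dict) -> dict:
    """Return the extra response headers for a navigation request.

    ASSUMPTION: the server confirms participation by echoing the app id.
    An empty dict means the origin does not opt in for this app.
    """
    app_id = request_headers.get("App-Id")
    if app_id in ALLOWED_APPS:
        return {"Allow-App-Scope-Extension": app_id}
    return {}

confirmed = scope_extension_response_headers({"App-Id": "https://myapp.com/index.html"})
refused = scope_extension_response_headers({"App-Id": "https://evil.example/app"})
```

With this shape the origin being navigated to keeps the final say, even when the app's manifest lists the whole site.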

martinthomson commented 3 months ago

Interesting idea. How would that be realized though? You would send the target resource the App-Id, have it recognize that it is being framed by that app, then accept that with a field (Allow-App-Scope-Extension).

That would work in that it is sufficiently granular, but I have two concerns:

  1. Operationally, that seems more difficult to manage than the approach that @LuHuangMSFT and @dmurph are advocating for.

  2. As a practical matter, at the time that the request is made, a lot of stuff about a request has been predetermined (things like Sec-Fetch-Dest). I'm not sure whether a request for navigation in an app (i.e., what would happen if the resource is in the app scope) would differ from navigation in a browser (i.e., what would happen if it were not). Having to pick one and then back out if it fails would be good to avoid.

torgo commented 3 months ago

Hi @LuHuangMSFT - following on from what @martinthomson said above: as noted we are fine with the user need and we are generally fine with the design. @reillyeon has suggested a CORS or CORS-like approach. Whilst that has some advantages (in particular, not requiring a new well-known located file) it also has the disadvantage of requiring complex server configuration. Our main concern stands, however: the Association file should allow resources to be specified in a more fine-grained way. This could include wildcards to make it easy to create a very permissive Association file, but it should be possible to lock down the resources that are OK to share. Would you be OK with modifying the design to allow for such an approach? If so, we're happy to close this review as satisfied.

LuHuangMSFT commented 3 months ago

Update: I am investigating making 2 modifications to the design.

  1. For the case where the same-site format is used in the web app manifest to include a dynamic number of origins, add an additional requirement that response headers from an origin must include the manifest IDs of apps it agrees to be part of.

  2. Allow sites/origins to provide a 'scope' filter in the association files they host at the .well-known path. This allows controlling resources using the same syntax as what is used in manifest scope.

I will update with more details after my investigation is complete.

LuHuangMSFT commented 2 months ago

Specifying scope in the association file

I've looked at the following options to give the association file a way to specify resources explicitly. My preference is for the fallback list. It's more verbose but allows us to accept URLPattern (or other formats) in the future. UAs are able to fall back to a simpler format following a clear order if they are unable to process a more complicated format. If left unspecified, the whole origin/site is considered part of the identified app with no resource restrictions.

Bag of things 1

{
  "web_apps": [
    {
      "web_app_identity": "https://myapp.com/index.html",
      "scope": "/app",  // Can be string or array of strings.
      "scope_url_pattern": "https://this-origin.com/?"  // Can be string or object.
      // Extend here.
    }
  ]
}

Bag of things 2

{
  "web_apps": [
    {
      "web_app_identity": "https://myapp.com/index.html",
      "scope": {
        "prefix": "/app",  // Can be string or array of strings.
        "url_pattern": "https://this-origin.com/?" // Can be string or object.
        // Extend here.
      }
    }
  ]
}

Fallback list

{
  "web_apps": [
    {
      "web_app_identity": "https://myapp.com/index.html",
      "scope": [
        {
          "url_pattern": "https://this-origin.com/?"  // Can be string or object.
        },
        {
          "prefix": "/app"  // Can be string or array of strings.
        }
        // Extend here.
      ]
    }
  ]
}
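
As a rough illustration of how a UA might consume the fallback list, the sketch below walks the list in order and uses the first format it understands. It assumes a UA that only supports "prefix", treats prefix matching as a plain string prefix test, and fails closed when no listed format is understood; none of these behaviors are settled by the proposal, and the function names are hypothetical.

```python
import json

# ASSUMPTION: this hypothetical UA cannot process "url_pattern" yet.
SUPPORTED_FORMATS = ("prefix",)

def pick_scope_entry(scope_list, supported=SUPPORTED_FORMATS):
    """Return (format, value) for the first entry the UA can process, else None."""
    for entry in scope_list:
        for fmt, value in entry.items():
            if fmt in supported:
                return fmt, value
    return None

def path_in_scope(path, scope_list):
    """Decide whether a path on the associated origin is part of the app."""
    if not scope_list:
        return True  # scope left unspecified: no resource restrictions
    picked = pick_scope_entry(scope_list)
    if picked is None:
        return False  # ASSUMPTION: fail closed if no format is understood
    _, value = picked
    prefixes = [value] if isinstance(value, str) else value
    return any(path.startswith(p) for p in prefixes)

association = json.loads("""
{
  "web_apps": [
    {
      "web_app_identity": "https://myapp.com/index.html",
      "scope": [
        {"url_pattern": "https://this-origin.com/?"},
        {"prefix": "/app"}
      ]
    }
  ]
}
""")
scope = association["web_apps"][0]["scope"]
```

Here the UA skips the `url_pattern` entry it cannot process and falls back to the `/app` prefix, which is the ordering behavior the fallback list is meant to enable.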

LuHuangMSFT commented 2 months ago

Request and Response Header

For multiple sub-domains included from the web app manifest using a same-site entry, we can make use of a request and response header design with no preflight. This allows an origin that was included via the same-site manifest entry to confirm its participation. This doesn't require an association file to be fetched immediately before fetching the resource and would not slow down navigation. This is largely what @reillyeon described above.

Example:

App window navigates to https://foo.com

The resource can load in the app window without warning UI or being moved to a tab. The response can be App-Scope-Extension-Allow-Id: * to match all app ids. If the app id does not match or there is no matching response, the resource is not treated as part of the app.

The server can either configure a static list of app ids for simplicity or dynamically control the value of App-Scope-Extension-Allow-Id and use this to implement scoping.
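
A minimal sketch of the check a UA could run on the response, using the App-Scope-Extension-Allow-Id header named above. The comma-separated list syntax for multiple app ids is an assumption (only the `*` form appears in this thread), and the function name is hypothetical.

```python
def response_confirms_app(response_headers: dict, app_id: str) -> bool:
    """Return True if the response agrees the resource is part of app_id.

    ASSUMPTION: the header value is either "*" or a comma-separated
    list of app ids. A missing header means no confirmation.
    """
    # HTTP field names are case-insensitive, so normalize before lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    value = headers.get("app-scope-extension-allow-id")
    if value is None:
        return False
    allowed = [item.strip() for item in value.split(",")]
    return "*" in allowed or app_id in allowed

app_id = "https://myapp.com/index.html"
wildcard_ok = response_confirms_app({"App-Scope-Extension-Allow-Id": "*"}, app_id)
listed_ok = response_confirms_app(
    {"app-scope-extension-allow-id": "https://other.example/, https://myapp.com/index.html"},
    app_id,
)
missing = response_confirms_app({"Content-Type": "text/html"}, app_id)
```

If the check fails, the resource is simply not treated as part of the app, matching the behavior described above.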

torgo commented 2 months ago

Hi @LuHuangMSFT thanks for this - we're just reviewing in our TAG plenary call today and we're going to get back to you after we have a chance to think about / discuss more deeply.

dmurph commented 2 months ago

Question - do we need to have the initial App-Id header? Can we only have the second one? The implementation of that seems possibly infeasible.

martinthomson commented 2 months ago

So I think that the requirement needs to be that the browser needs to know whether something is in app scope before navigating. Any solution that involves finding out at the time is going to introduce latency penalties that are undesirable.

For me, that rules out anything that is exclusively CORS-like. Navigation is commonplace and adding an extra round trip to all navigations from an app isn't a great outcome, even if it is only for cross origin navigations (we don't need to add that barrier structurally, even if a lot of applications insist upon it).

However, it isn't quite that simple. The app lists what it thinks is in scope, so the point of this design is to confirm. We would start from an assumption that the app is making a correct representation, then confirm that with the app. That isn't CORS-like, that's something new (-ish, the introduction of Sec-Fetch-Dest is somewhat of a confirmation stage, as are some of the other CORS headers that determine whether a response is readable).

So I'm of two minds here. I hate the constant addition of stuff to HTTP requests. It's really starting to get out of hand (especially with very long field names, the risk that requests exceed the size of a packet is real and that has serious performance implications). But Reilly's suggestion has real merit. I also see the advantages of following that model of confirmation.

@dmurph's concern about two fields is easy to answer, I think: the site likely needs to know who is asking before they answer. Because the answer could depend. Sites could have resources that are acceptably included in multiple apps.

The manifest-like approach is also reasonable. A centralized location where you can interrogate an origin about the scopes that can extend to it is OK. I find myself horrified at the complexity of the proposals though. Please, if we go that way, can we focus on what is the minimum possible syntax that will achieve the desired outcome. This works:

{
   "https://example.com/this-is-an-app/": ["/payments", "/anti-fraud-stuff"]
}

(Yes, it is not extensible, but it is replaceable and that is enough.)
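
For comparison, a sketch of how little code the minimal syntax needs. It assumes the listed strings are path prefixes and that an app id missing from the file means the origin has not opted in; both are assumptions, and `app_may_include` is a hypothetical name.

```python
import json
from urllib.parse import urlsplit

association = json.loads(
    '{"https://example.com/this-is-an-app/": ["/payments", "/anti-fraud-stuff"]}'
)

def app_may_include(association: dict, app_id: str, url: str) -> bool:
    """Return True if the origin's file lets app_id extend its scope to url.

    ASSUMPTION: each listed string is matched as a plain path prefix.
    """
    prefixes = association.get(app_id)
    if prefixes is None:
        return False  # the origin never mentioned this app
    path = urlsplit(url).path
    return any(path.startswith(prefix) for prefix in prefixes)

app = "https://example.com/this-is-an-app/"
allowed = app_may_include(association, app, "https://pay.example.net/payments/checkout")
denied = app_may_include(association, app, "https://pay.example.net/account")
```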

However, throwing out ideas is not what we need here. What is needed is to ask one question: What is the form of this that site operators are best able to handle?

dmurph commented 2 months ago

From a chat with @LuHuangMSFT and a member of our security team, here is a proposal they were comfortable with. Not sure if this is something you're still OK with @LuHuangMSFT:

There are three levels of security for this association:

  1. None (not used)
  2. The .well-known/scope-extensions file acknowledgement on the extended origin/site.
    • If the origin/site doesn't have the manifest_id in that file, the association isn't created.
    • This can be re-checked periodically, allowing removal of associations but sometimes delayed.
  3. Both a .well-known/scope-extensions file acknowledgement on the extended origin/site AND a ScopeExtensionAllowed: <manifest_id_list> response header coming from 'extended' sites or origins.
    • If the user agent ever does not receive this header from the server, then it removes the association. (Recoverable through existing manifest update mechanism if this was a dev mistake).
    • This allows a site/origin owner to immediately remove this association.

For origin type scope extensions, this requires level 2 security. For site type scope extensions, this requires level 3 security.
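
The tiers above can be sketched as a single decision function. This is an illustration under stated assumptions, not an implementation: the inputs are taken to be an already-parsed list of manifest ids from the .well-known/scope-extensions file and an already-parsed ScopeExtensionAllowed header, and the function and parameter names are hypothetical.

```python
def association_is_valid(extension_type, manifest_id, well_known_ids, header_ids=None):
    """Apply the tiered check: origin needs level 2, site needs level 3.

    extension_type: "origin" or "site".
    well_known_ids: manifest ids listed in the .well-known/scope-extensions file.
    header_ids: manifest ids from the ScopeExtensionAllowed response header,
                or None if the response carried no such header.
    """
    # Level 2: the file on the extended origin/site must acknowledge the app.
    if manifest_id not in well_known_ids:
        return False
    if extension_type == "origin":
        return True
    if extension_type == "site":
        # Level 3: the response header must also list the app.
        return header_ids is not None and manifest_id in header_ids
    raise ValueError(f"unknown extension type: {extension_type!r}")

app = "https://myapp.com/index.html"
origin_ok = association_is_valid("origin", app, [app])
site_without_header = association_is_valid("site", app, [app])
site_with_header = association_is_valid("site", app, [app], header_ids=[app])
```

Note that in the proposal a missing header also causes the UA to remove a site-type association it already holds; the sketch only covers the yes/no decision for a single response.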

Reasoning:

Note this is separate from the filtering discussions, just for security here. Filtering can be handled by the current proposals with the .well-known/scope-extensions file and manifest file.

LuHuangMSFT commented 2 months ago

Thanks @dmurph. That's representative of what we discussed in that meeting. It is a good solution that does not require a request header, and I agree with the tiered approach for origin vs. site extension, as well as the reasoning bullet points.

App-Id is not necessary if the response header returns all manifest IDs that the origin recognizes. For completeness, I want to point out that without an App-Id request header the server cannot control filtering differently for different origins. The filtering information (paths, URLPattern, etc.) will be in the association file, and the same information will be applied to all origins matching a site extension.

Since

only requiring a response header with manifest IDs for site extension case is my preference.


Other issues

martinthomson commented 2 months ago

@dmurph that sounds encouraging. I'm a little unclear on the site vs. origin thing, but rather than continue the discussion here, can I suggest that you take this back to the explainer and update that?

LuHuangMSFT commented 2 months ago

I'll make an explainer PR that should offer more clarity.

martinthomson commented 2 months ago

Thanks for taking on the feedback. We understand that this is an early review and it will need some more work to integrate the changes. When you get things more settled, we'd be happy to do another review. Closing this for now.