w3ctag / design-reviews

W3C specs and API reviews
Creative Commons Zero v1.0 Universal

Wildcards in Permissions Policy Origins #765

Closed arichiv closed 1 year ago

arichiv commented 1 year ago

Wotcher TAG!

I'm requesting a TAG review of Wildcards in Permissions Policy Origins.

INITIAL PROPOSAL

The Permissions Policy specification “defines a mechanism that allows developers to selectively enable and disable use of various browser features and APIs.” One capability of this mechanism allows features to be enabled only on explicitly enumerated origins (e.g., https://foo.com/). This mechanism is not flexible enough for the design of some CDNs, which deliver content via an origin that might be hosted on one of several hundred possible subdomains.

This feature will add support for wildcards in permissions policy allowlists structured like SCHEME://*.HOST:PORT (e.g., https://*.foo.com/), where a valid origin could be constructed from SCHEME://HOST:PORT (e.g., https://foo.com/). This requires that HOST is at least eTLD+1 (a registrable domain). This means that https://*.bar.foo.com/ works but https://*.com/ won’t (if you want to allow all domains to use the feature, you should just delegate to *). Wildcards in the scheme and port sections will be unsupported, and https://*.foo.com/ does not delegate to https://foo.com/.

Before, a permissions policy might need to look like: permissions-policy: ch-ua-platform-version=(self "https://foo.com" "https://cdn1.foo.com" "https://cdn2.foo.com" "https://foo.cdn2.foo.com/")

With this feature, it could look like: permissions-policy: ch-ua-platform-version=(self "https://foo.com" "https://*.foo.com")
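The matching rule described above can be sketched in a few lines of Python. This is only an illustration, not the specification's algorithm: the helper names `origin_tuple` and `matches_subdomain_wildcard` are hypothetical, the eTLD+1 check is left out, and the sketch assumes the wildcard covers subdomains at any depth (as the before/after headers suggest, since https://foo.cdn2.foo.com/ is folded into https://*.foo.com).

```python
from urllib.parse import urlsplit

# Default ports used when an origin or pattern omits the port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_tuple(url):
    """Return (scheme, host, port), filling in the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


def matches_subdomain_wildcard(pattern, origin):
    """True if origin is matched by a pattern such as https://*.foo.com."""
    scheme, rest = pattern.split("://", 1)
    if not rest.startswith("*."):
        raise ValueError("pattern must look like SCHEME://*.HOST[:PORT]")
    # Build the base origin SCHEME://HOST:PORT by dropping the "*." prefix.
    base_scheme, base_host, base_port = origin_tuple(scheme + "://" + rest[2:])
    o_scheme, o_host, o_port = origin_tuple(origin)
    return (
        o_scheme == base_scheme
        and o_port == base_port
        # Strict subdomain match: the bare host itself is not delegated to.
        and o_host.endswith("." + base_host)
    )
```

Under these assumptions, https://cdn1.foo.com is matched by https://*.foo.com, while https://foo.com and http://cdn1.foo.com are not.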

EXPANDED PROPOSAL

Subdomain wildcards in allowlists provided some valuable flexibility, but differed from existing wildcard parsers and required novel code and spec work. This intent will reduce that overhead by reusing parts of the existing Content Security Policy spec and permitting ‘scheme + wildcard domain’ and ‘wildcard port’ in the allowlist.

Specifically, this intent would adopt the definition of host-source and scheme-source instead of origin in the Allowlist definition while requiring that the path-part is empty (as Permissions Policies apply to matching origins). This would change three things from the prior wildcard implementation (all of which expand the range of allowed wildcards and none of which add new restrictions):

(1) Removing the eTLD+1 requirement for subdomain wildcards. Previously, you could not have a subdomain wildcard like “https://*.com” but could have one like “https://*.example.com”. Now, you can have subdomain wildcards both like “https://*.com” and “https://*.example.com”.

(2) Allowing scheme restrictions on wildcard domains. Previously, you could allow “*” but not restrict to a specific scheme like “https://*” or “https:”. Now, you can still allow “*” but have the option of delegating to just a specific scheme like “https://*” or “https:” (the behavior of these is identical).

(3) Allowing port wildcards. Previously you could delegate to the default https port like “https://example.com” or “https://example.com:443” (the behavior of these is identical), but there was no way to explicitly delegate to all ports like “https://example.com:*”. Now, you can still delegate to “https://example.com” or “https://example.com:443” but delegation is also permitted to a wildcard port like “https://example.com:*”.
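As a rough illustration of the three changes, the following Python sketch matches an origin against a single allowlist entry. It is not the Content Security Policy matching algorithm: the names `split_origin` and `allowlist_entry_matches` are hypothetical, IPv6 hosts, paths, and CSP's scheme-upgrade rules are ignored, and “https://*” is treated as identical to “https:” because the proposal text says so.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def split_origin(origin):
    """Return (scheme, host, port), filling in the scheme's default port."""
    parts = urlsplit(origin)
    return parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme)


def allowlist_entry_matches(entry, origin):
    """True if origin is covered by one allowlist entry (simplified sketch)."""
    scheme, host, port = split_origin(origin)
    if entry == "*":
        return True
    if entry.endswith("://*"):
        entry = entry[:-3]  # "https://*" is treated like "https:"
    if entry.endswith(":") and "/" not in entry:
        return scheme == entry[:-1]  # scheme-only entry such as "https:"

    e_scheme, _, rest = entry.partition("://")
    rest = rest.rstrip("/")
    e_host, sep, e_port = rest.rpartition(":")
    if not sep:  # no explicit port in the entry
        e_host, e_port = rest, ""
    if scheme != e_scheme:
        return False

    if e_host.startswith("*."):
        host_ok = host.endswith(e_host[1:])  # any subdomain; no eTLD+1 check
    else:
        host_ok = host == e_host

    if e_port == "*":
        port_ok = True  # wildcard port
    elif e_port:
        port_ok = port == int(e_port)
    else:
        port_ok = port == DEFAULT_PORTS.get(e_scheme)  # implicit default port
    return host_ok and port_ok
```

For example, under these assumptions `allowlist_entry_matches("https://example.com:*", "https://example.com:8443")` holds, while the same origin is not matched by “https://example.com”.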

Further details:

We'd prefer the TAG provide feedback as (please delete all but the desired option):

🐛 open issues in our GitHub repo for each point of feedback

Security and Privacy questionnaire for TAG

  1. What information might this feature expose to Web sites or other parties, and for what purposes is that exposure necessary?
    • N/A, this feature exposes no new information to websites or other parties.
  2. Do features in your specification expose the minimum amount of information necessary to enable their intended uses?
    • Yes, no new information is exposed.
  3. How do the features in your specification deal with personal information, personally-identifiable information (PII), or information derived from them?
    • It does not deal directly in PII.
  4. How do the features in your specification deal with sensitive information?
    • It does not handle sensitive information.
  5. Do the features in your specification introduce new state for an origin that persists across browsing sessions?
    • No, permissions must be delegated on each page load and do not persist.
  6. Do the features in your specification expose information about the underlying platform to origins?
    • Yes, but no more than the existing permissions delegation can.
  7. Does this specification allow an origin to send data to the underlying platform?
    • Yes, but no more than the existing permissions delegation can.
  8. Do features in this specification enable access to device sensors?
    • Yes, but no more than the existing permissions delegation can.
  9. What data do the features in this specification expose to an origin? Please also document what data is identical to data exposed by other features, in the same or different contexts.
    • Only data that can already be delegated by permissions is exposed; no new data is exposed.
  10. Do features in this specification enable new script execution/loading mechanisms?
    • No
  11. Do features in this specification allow an origin to access other devices?
    • Yes, but no more than the existing permissions delegation can.
  12. Do features in this specification allow an origin some measure of control over a user agent’s native UI?
    • No
  13. What temporary identifiers do the features in this specification create or expose to the web?
    • Nothing beyond what's currently possible with permissions delegation.
  14. How does this specification distinguish between behavior in first-party and third-party contexts?
    • The first-party context is in charge of which permissions are delegated to third-party contexts, and third-parties cannot increase their scope of delegated permissions.
  15. How do the features in this specification work in the context of a browser’s Private Browsing or Incognito mode?
    • It will work the same in such contexts.
  16. Does this specification have both "Security Considerations" and "Privacy Considerations" sections?
  17. Do features in your specification enable origins to downgrade default security protections?
    • No
  18. How does your feature handle non-"fully active" documents?
    • N/A
  19. What should this questionnaire have asked?
    • N/A

npdoty commented 1 year ago

Does this provide an easy, dangerous mistake for web developers? Currently, the developer has to delegate the origins specifically: I might decide I want this particular feature to be accessible at feature.example.com and I have to make a case-by-case decision whether I want the feature to also be available at other-service.example.com. With this change, a developer could be encouraged to just provide the feature to *.example.com, which will surely all be subdomains that the developer controls and wants the feature on. Later, when a third-party provider requests a CNAME for a subdomain (analytics.example.com), that service automatically gets access to all those features, inadvertently.

That is, in your questionnaire answers, you repeatedly note that this isn't creating any new capabilities. But is it encouraging accidental expansion of a capability to potentially many different origins?

Does this change introduce a new dependency on the PSL? What happens if the PSL is out of date or a site is accidentally included/not included?

arichiv commented 1 year ago

It's true that a developer might be encouraged by this new feature to allow *.example.com, whereas they would previously have had to add subdomains manually. That said, they may currently be pushed to use just * because their CDN needs 100+ subdomains whitelisted. This seems a safer corner to be backed into.

I was planning to depend on the PSL by referencing this language. As the permissions policy directives aren't cached beyond the lifetime of the page load, an out-of-date list could result in sites incorrectly being delegated (or not being delegated) permissions until the list is updated. If an invalid target is detected in the list, that target would be ignored without throwing away the rest of the targets (i.e., https://*.example.com/ https://*.org would parse the same as just https://*.example.com/).
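A minimal Python sketch of that "ignore the invalid target, keep the rest" behavior follows. Everything here is illustrative: `PUBLIC_SUFFIXES` is a tiny stand-in for the real Public Suffix List, and the function names are hypothetical rather than taken from any specification or implementation.

```python
# Tiny stand-in for the Public Suffix List; a real implementation would
# consult the full, current list instead of this hard-coded set.
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def is_registrable_or_deeper(host):
    """True if host has at least one label in front of a known public suffix."""
    labels = host.split(".")
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return True
    return False


def keep_valid_targets(targets):
    """Drop wildcard targets whose base host is not at least eTLD+1."""
    kept = []
    for target in targets:
        _, _, rest = target.partition("://")
        host = rest.rstrip("/").split(":")[0]
        if host.startswith("*.") and not is_registrable_or_deeper(host[2:]):
            continue  # ignore only this target; the others are still parsed
        kept.append(target)
    return kept
```

With this stand-in list, `keep_valid_targets(["https://*.example.com/", "https://*.org"])` keeps only the first entry, which mirrors the example in the comment above.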

ylafon commented 1 year ago

I have a few questions about matching:

  • Does https://www.example.com:443/ match https://*.example.com/? (implicit port equivalence)
  • Is it possible to use the wildcard as part of a token, or to replace more than one token? That is, would https://*-assets.example.com/ match https://img-assets.example.com/, and would https://*.example.org/ match https://two.levels.example.org/, or should that be https://*.*.example.org/?

Also, to avoid opening everything up, would it be possible to have negative matches? I'm thinking of something like Permissions-Policy: geolocation=(self "https://*.example.com/" "!https://notallowed.example.com/"). This would be useful if there are a few subdomains to exclude, or combined with partial matches like in the question above (e.g., !https://*-no.example.org).

arichiv commented 1 year ago

> I have a few questions about matching: Does https://www.example.com:443/ match https://*.example.com/? (implicit port equivalence)

Yes

> Is it possible to use the wildcard as part of a token or to replace more than one token, i.e., https://*-assets.example.com/ matching https://img-assets.example.com/ or https://*.example.org/ matching https://two.levels.example.org/, or should it be https://*.*.example.org/?

No, you can only ever have one wildcard and it must have a . after it and the // before it. https://*-assets.example.com/ is invalid, https://*.*.example.com/ is invalid, https://foo.*.example.com/ is invalid, https://*.example.com/ is valid.
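The placement rule just stated (a single wildcard, directly after the // and directly before a .) can be sketched as a regular expression in Python. The pattern and the name `is_valid_wildcard` are illustrative assumptions rather than the specification's grammar, and the eTLD+1 requirement is not checked here.

```python
import re

# Exactly one "*", immediately after "//" and immediately before ".".
# The host part may not contain another "*"; an optional numeric port
# and a trailing "/" are allowed.
WILDCARD_ORIGIN = re.compile(r"^[a-z][a-z0-9+.-]*://\*\.[^*/:]+(:\d+)?/?$")


def is_valid_wildcard(pattern):
    """True if pattern places its single wildcard where the reply allows."""
    return WILDCARD_ORIGIN.match(pattern) is not None
```

Checked against the four examples in the reply, this accepts only https://*.example.com/.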

> Also, to avoid only opening everything, would it be possible to have negative matches? Thinking of something like Permissions-Policy: geolocation=(self "https://*.example.com/" "!https://notallowed.example.com/") this would be useful if there are a few subdomains to exclude, or combined with partial matches like the question above !https://*-no.example.org

We want to keep this first foray into wildcards to the minimum feature set to support the use case (CDNs with hundreds of subdomains). In the future, more flexible matching (like what's supported in the Content Security Policy) might be added, and later down the line support for negation could then be considered in both spots. It's a lot easier to expand the syntax if we need more later than to shrink it if we over-extend now.

ylafon commented 1 year ago

We discussed this issue today in our call and came to the conclusion that there should be a way to mitigate the "all on or all off" effect of the wildcard as currently defined. Something as simple as possible, like allowing only one partial match (but we let you discuss what could be an acceptable solution), would be good to avoid people using it too broadly "to make it work".

arichiv commented 1 year ago

I'm not sure I understand. It doesn't look like the minutes have been published yet, but I can read the notes from the 10/10 meeting when they're available.

torgo commented 1 year ago

Hi @arichiv, the minuted discussion is here. Also cc @cynthia.

ylafon commented 1 year ago

@arichiv did you have time to read our feedback? Any plan on working on it?

arichiv commented 1 year ago

Yes! Sorry for the delay. I hope to get to the proposed expansions in Q1 2023.

torgo commented 1 year ago

OK, thanks for letting us know. Please share updates when you can.

arichiv commented 1 year ago

There has been a further expansion of this proposal. Specifically:

Subdomain wildcards in allowlists provided some valuable flexibility, but differed from existing wildcard parsers and required novel code and spec work. This intent will reduce that overhead by reusing parts of the existing Content Security Policy spec and permitting ‘scheme + wildcard domain’ and ‘wildcard port’ in the allowlist.

Specifically, this intent would adopt the definition of host-source and scheme-source instead of origin in the Allowlist definition while requiring that the path-part is empty (as Permissions Policies apply to matching origins). This would change three things from the prior wildcard implementation (all of which expand the range of allowed wildcards and none of which add new restrictions):

(1) Removing the eTLD+1 requirement for subdomain wildcards. Previously, you could not have a subdomain wildcard like “https://*.com” but could have one like “https://*.example.com”. Now, you can have subdomain wildcards both like “https://*.com” and “https://*.example.com”.

(2) Allowing scheme restrictions on wildcard domains. Previously, you could allow “*” but not restrict to a specific scheme like “https://*” or “https:”. Now, you can still allow “*” but have the option of delegating to just a specific scheme like “https://*” or “https:” (the behavior of these is identical).

(3) Allowing port wildcards. Previously you could delegate to the default https port like “https://example.com” or “https://example.com:443” (the behavior of these is identical), but there was no way to explicitly delegate to all ports like “https://example.com:*”. Now, you can still delegate to “https://example.com” or “https://example.com:443” but delegation is also permitted to a wildcard port like “https://example.com:*”.

torgo commented 1 year ago

Thank you for the update! We've reviewed the latest changes and they look good. We've also noted the positive developments on multi-stakeholder support on the Mozilla and WebKit standards positions threads. Although we would still like to see intersections in matching, the fact that it is now consistent with CSP is good. We're happy to see this move forward, and on that basis we're going to close this review.