Open LuHuangMSFT opened 4 years ago
cc @mgiuca I e-mailed you some ideas a couple of days ago, e-mail subject "Declarative Link Capturing + URL Handling". I'll replicate them here for discussion.
At a high level, we find ourselves agreeing that features which facilitate an in-app experience should first try to accomplish their goals using the app's navigation scope instead of creating a similar scoping mechanism. However, we think that link capturing could have difficulty gaining adoption if developers are unable to configure the set of URLs affected by the new feature. We would like to propose an adaptation of existing proposals.
There already seems to be a pattern in PWA development today where the scope is set to the root but various pages are excluded from the app experience using workarounds such as placing them under a different sub-domain, or using target="_blank" in links.
Examples to consider:
These examples can be addressed by allowing URLs to be excluded from link capturing.
Additionally, we would like to:
We think both examples (1. and 2.) can be addressed by link capturing exceptions. The default set of URLs included for link capturing is identical to the navigation scope, addressing (A). URLs within the default set can be excluded, and URLs outside the default set can be included, by modifying a separate association file. The use of an optional host field in url_handlers entries addresses (B). The optional capture_links defines the link capturing option for all included URLs. The link capture option for excluded URLs is none.
Below is an example manifest demonstrating usage.
{
"name": "Contoso App",
"start_url": "/?standalone",
"display": "standalone",
"icons": […],
"capture_links": "existing_client_event",
"url_handlers": [{
"association_file": "/web-app-site-association.json"
}, {
"host": "conto.so",
"association_file": "conto.so/.well-known/web-app-site-association.json"
}]
}
Example manifest format
In the manifest, url_handlers objects contain the optional host and location of the site association file. A url_handlers object without a host value can:
Structuring URL exceptions in the association file allows the site to control usage of URLs and also prevents duplication in the web app manifest. Additionally, for PWAs featured in app stores, this allows link capturing changes to be made without deploying an updated manifest to the store. An app/site handshake is not necessary for validating the association for in-scope URLs but using an association file for URL exceptions enables this deployment benefit.
It is unclear to me whether there should be only one capture_links option for the entire app or whether there is a benefit to allowing a choice for every url_handlers object. I am looking for a good example for the latter.
We also found an argument for not controlling link capturing behavior by modifying the navigation scope: if a URL that should be excluded from link capturing behavior is excluded from the navigation scope and is navigated to from within the app context without opening in a new window, it will render in a pseudo-browser frame. This behavior is correct because the URL is outside of the navigation scope, but users will find it difficult to understand why a navigation in the same context now shows a pseudo-browser frame. There might be other similar side-effects.
Finally, I reused terminology above like url_handlers for continuity but we're eager to hear good alternatives.
Hi Lu,
There already seems to be a pattern in PWA development today where the scope is set to the root but various pages are excluded from the app experience using workarounds such as placing them under a different sub-domain, or using target="_blank" in links.
Yeah, I think we're seeing a lot of pain with not being able to exclude sub-paths from an app scope. I totally agree that sites shouldn't have to restructure their URL hierarchy in order to get the right scoping.
I'm still coming at this from a mindset of: this shouldn't be fixed just for link capturing. You should be able to exclude sub-paths from the actual app scope. Introducing sub-path exclusion as part of the link capturing feature feels like solving too specific of a problem. However, on the other hand, it does seem possible that sites would want to have sub-paths that are part of their app scope, but excluded from link capturing.
I'm considering this Pinterest example: if you were building a PWA with sub-paths like "About" and "Blog", I agree that you wouldn't want those to be link captured. But would you want them to still be part of your app scope if the user navigated into them whilst using the app? I suppose it would be best to not show the CCT UI when navigating into those pages, which means it would be useful to exclude paths from link capturing but keep them as part of the app scope.
A url_handlers object without a host value can:
- Only contain sub-paths to the navigation scope and is only allowed to control in-scope URLs.
- Only exclude URLs from the navigation scope but cannot include any additional URLs.
That sounds good. Then we're starting to separate the concept of cross-origin link capturing and in-scope customization of the URL scope. What is your intended syntax for excluding URLs, and how consistent is it with @wanderview's Service Worker Scope Pattern Matching proposal? I don't think it is strictly necessary to be consistent with the service worker scoping, but it would be ideal to have one syntax for excluding sub-paths and not two.
Thanks for writing this up. I like the direction this is going.
On the original topic of integrating the two proposals: I don't think they need to be bundled together into one thing (i.e. land into the Manifest spec as a single PR). They can both exist independently.
I would prefer to keep them separate, since they are both quite complex on their own and the best way to deal with complexity is to break it into manageable components.
I think it is possible to use a syntax in the style of the manifest format stated in Service Worker Scope Pattern Matching (SWSPM) but the format does not match exactly. With a few modifications, it would not look too dissimilar from what was described in this explainer.
The manifest format for the value of "scopePattern" from SWSPM was given in this example:
{
"baseUrl": "https://foo.com/",
"path": "/app/?*"
}
Some observations:
Another format we studied was the apple-app-site-association file. It is able to match for paths, fragments, queries, and allows developers to order include and exclude paths by priority. This is a powerful declarative format but we are concerned about matching performance and being able to map it to OS URL handling formats.
Following the example above:
{
"name": "Contoso App",
"start_url": "/?standalone",
"display": "standalone",
"icons": […],
"capture_links": "existing_client_event",
"url_handlers": [{
"association_file": "/web-app-site-association.json"
}, {
"host": "conto.so",
"association_file": "conto.so/.well-known/web-app-site-association.json"
}]
}
"/web-app-site-association.json" would contain:
{
"apps": [{
"manifest": "/manifest.json",
"paths": ["/*"],
"exclude_paths": ["/blog/*"]
}]
}
"conto.so/.well-known/web-app-site-association.json" would contain:
{
"apps": [{
"manifest": "contoso.com/manifest.json",
"paths": ["/*"],
"exclude_paths": ["/blog/*"]
}]
}
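The thread does not pin down the pattern syntax for paths and exclude_paths, so the following is only a minimal sketch of how one association entry might be evaluated. It assumes that "*" is the only wildcard and matches any run of characters; the helper names pattern_matches and path_is_handled are hypothetical.

```python
import re


def pattern_matches(pattern: str, path: str) -> bool:
    # Assumption: "*" matches any run of characters (including "/");
    # every other character is matched literally.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path) is not None


def path_is_handled(path: str, entry: dict) -> bool:
    """True if path matches some "paths" pattern and no "exclude_paths" pattern."""
    included = any(pattern_matches(p, path) for p in entry.get("paths", []))
    excluded = any(pattern_matches(p, path) for p in entry.get("exclude_paths", []))
    return included and not excluded


# Entry copied from the conto.so association file example.
entry = {
    "manifest": "contoso.com/manifest.json",
    "paths": ["/*"],
    "exclude_paths": ["/blog/*"],
}

print(path_is_handled("/products/42", entry))  # True
print(path_is_handled("/blog/launch", entry))  # False
print(path_is_handled("/blog", entry))         # True: "/blog/*" needs the trailing slash
```

Under these assumed semantics, "/blog" itself is still handled, so a real syntax would need to say whether "/blog/*" also covers the bare "/blog" path.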
- If URL Handling lands first without DLC, it would state the set of paths to capture, and define a basic model for opening a standalone window when links are captured. Later, DLC can land and extend it to say "here is how you customize the browser's behaviour when link capturing is activated."
- If Declarative Link Capturing lands first without this, it would simply state that link capturing applies to all URLs within scope of the manifest. Later, URL Handling can land and extend that to say "that's the default, but if you specify the exact URL Handling paths, it applies to those instead".
That sounds good to me. URL Handling can also refer to DLC on when to capture:
This proposal defines new manifest members that control what happens when the browser is asked to navigate to a URL that is within the application’s navigation scope, from a context outside of the navigation scope. It doesn’t apply if the user is already within the navigation scope (for instance, if the user has a browser tab open that is within scope, and clicks an internal link). The user agent is also allowed to decide under what conditions this does not apply; ...
I'm still coming at this from a mindset of: this shouldn't be fixed just for link capturing. You should be able to exclude sub-paths from the actual app scope. Introducing sub-path exclusion as part of the link capturing feature feels like solving too specific of a problem. However, on the other hand, it does seem possible that sites would want to have sub-paths that are part of their app scope, but excluded from link capturing.
I think excluding sub-paths from the actual app scope is useful in its own right, which SWSPM could address. Each manifest member applies to some set of URLs (oftentimes any URL, in-scope or not), and it's easy to reason about what URLs a member should be applied to, but not easy to reason about what URLs should be within the app scope. Scope restricts navigations within the app context with manifest continuing to be applied. By this definition, the manifest's members don't necessarily have to apply to in-scope URLs, they just cannot be applied to out-of-scope URLs. As you pointed out in this PR, the latter is not strictly true either.
In my interpretation, the "set of URLs that are considered to be part of an app" definition provides a convenient default set of URLs for manifest members to apply to, but they don't necessarily have to apply to all in-scope URLs either. It also does not prevent out-of-scope URLs from being affected by a member (e.g. link captured).
it does seem possible that sites would want to have sub-paths that are part of their app scope, but excluded from link capturing.
It's difficult to reason about this without knowing what being part of their app scope means in practical terms. Some of the earlier members apply to the app context itself (display, theme_color, etc) and can continue to be applied no matter where the context navigates. Setting some in-scope URLs to link capture but not others is more like setting one URL to be "start_url".
I like the definition at web.dev.
The scope defines the set of URLs that the browser considers to be within your app, and is used to decide when the user has left the app.
It is useful to let the app developer draw the boundary of what is within the app but it may not be the right boundary for every browser and manifest feature that depends on it.
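For reference, a rough sketch of the boundary check being discussed, assuming the prefix-style "within scope" test (same origin, and the target path starts with the scope path). The function name within_scope is hypothetical, and default ports are not normalized here.

```python
from urllib.parse import urlsplit


def within_scope(target: str, scope: str) -> bool:
    # Assumption: a URL is in scope when it shares the scope's origin
    # and its path starts with the scope's path.
    t, s = urlsplit(target), urlsplit(scope)
    same_origin = (t.scheme, t.hostname, t.port) == (s.scheme, s.hostname, s.port)
    return same_origin and t.path.startswith(s.path)


print(within_scope("https://contoso.com/app/inbox", "https://contoso.com/app/"))  # True
print(within_scope("https://contoso.com/blog/post", "https://contoso.com/app/"))  # False
print(within_scope("https://conto.so/app/inbox", "https://contoso.com/app/"))     # False
```

A single prefix like this cannot express "everything under / except /blog", which is the gap the exclusion patterns in this thread are trying to fill.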
I didn't realise you were proposing that the site association file be used even for same-origin link capturing. I'm not sure why we'd do that when you could just put the data inside the manifest file itself. Or are you suggesting that either is valid? (I suppose there's no reason to specifically say you can't host a site association file on the same origin, though our documentation shouldn't encourage it.)
Otherwise, the syntax seems reasonable, but I kinda wish we would have the SWSPM syntax in the Manifest for scope before / at the same time as the above lands. Otherwise, it feels weird to have the advanced syntax only for link capturing and not for scopes. It also puts us at risk, if say the TAG review on SWSPM encourages them to change their syntax slightly, then they become inconsistent with URL handling syntax.
I was proposing above that the scope exclusion patterns be placed only inside the association file, not that either location is valid. Placing them inside the manifest file only is an option.
If scope exclusions for link capturing are placed inside the manifest only: Pros
Cons
Perhaps this syntax can be called out by SWSPM to be reviewed also. That would help prevent inconsistency. Having the advanced syntax for link capturing and not scope temporarily doesn't seem too strange to me. The manifest scopePattern from SWSPM is not significantly different from scope yet, unless it changes to allow multiple path patterns and exclude patterns.
I guess I would consider the associations file for in-scope to be part of the app's behaviour, and therefore it makes sense that it be part of the manifest itself. If listing an app in a store requires signing the manifest, and re-signing it whenever the manifest changes, it would seem strange to me that the link capturing URLs be excluded from that signing process.
What do you think of the following?
- The capture behavior will be similar to existing_client_event except the event handler does not determine whether there is a match, just how to handle a launch. The handler will run in a new app window in the start_url document. No further navigation takes place if no handler is found.
- exclude_paths.
- capture_links.
- exclude_paths, in the same object as capture_links, or
- capture_links.
- capture_links for capture behavior.
- Has a manifest member exclude_paths for exclusion patterns.
- capture_links behavior only applies to in-scope URLs not excluded.
- exclude_paths to be added to the manifests, it makes sense for it to remain there when transitioning to scenario B. Any duplication or conflict in the patterns added to the association file can be resolved by the algorithm used by the browser.
- capture_links is used in the manifest, in-scope URLs can only be controlled by the exclude patterns in the manifest and will not be matched against patterns from any association file.
- capture_links set and not none: UNION(app scope, association file inclusions) - UNION(manifest exclusions, association file exclusions).
- capture_links is none or omitted, association file patterns will be ignored and no URLs will be link captured.
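The set arithmetic above can be sketched as a single predicate. This is only an illustration for one origin: it assumes "*" is the sole wildcard, takes the in-scope check as a precomputed boolean, and uses hypothetical names (matches_any, is_link_captured).

```python
import re


def matches_any(path, patterns):
    # Assumption: "*" matches any run of characters; all else is literal.
    for pattern in patterns:
        regex = ".*".join(re.escape(part) for part in pattern.split("*"))
        if re.fullmatch(regex, path):
            return True
    return False


def is_link_captured(path, capture_links, in_app_scope,
                     assoc_includes, manifest_excludes, assoc_excludes):
    # capture_links omitted (None) or "none": nothing is captured and
    # association file patterns are ignored.
    if capture_links is None or capture_links == "none":
        return False
    # UNION(app scope, association file inclusions)
    included = in_app_scope or matches_any(path, assoc_includes)
    # UNION(manifest exclusions, association file exclusions)
    excluded = (matches_any(path, manifest_excludes)
                or matches_any(path, assoc_excludes))
    return included and not excluded


print(is_link_captured("/blog", "existing_client_event", True,
                       ["/*"], ["/about", "/blog"], []))   # False
print(is_link_captured("/public/data/x", "existing_client_event", False,
                       ["/public/data/*"], [], []))        # True
print(is_link_captured("/public/data/x", None, False,
                       ["/public/data/*"], [], []))        # False
```

Exclusions always win in this sketch, which is one way to resolve duplication or conflict between the manifest and the association file; the thread leaves the actual resolution algorithm to the browser.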
https://contoso.com/manifest.json
{
"name": "Contoso Business App",
"display": "standalone",
"icons": [
{
"src": "images/icons-144.png",
"type": "image/png",
"sizes": "144x144"
}
],
"capture_links": "existing_client_event",
"capture_links_exclude_paths": [
"/about",
"/blog"
],
"app_links": [
"contoso.com",
"conto.so",
"*.contoso.com"
]
}
https://partnerapp.com/manifest.json
{
"name": "Partner App",
"display": "standalone",
"icons": [
{
"src": "images/icons-144.png",
"type": "image/png",
"sizes": "144x144"
}
],
"capture_links": "existing_client_event",
"capture_links_exclude_paths": ["/only/for/partnerapp/*"],
"app_links": [
"contoso.com",
"partner.contoso.com"
]
}
https://contoso.com/web-app-site-association.json or https://conto.so/web-app-site-association.json
[
{
"manifest": "https://contoso.com/manifest.json",
"handle_urls": {
"paths": [
"/*"
],
"exclude_paths": [
"/blog",
"/about"
]
}
},
{
"manifest": "https://partnerapp.com/manifest.json",
"handle_urls": {
"paths": [
"/public/data/*"
]
}
}
]
https://partner.contoso.com/web-app-site-association.json
[
{
"manifest": "https://contoso.com/manifest.json",
"handle_urls": {
"paths": [
"/*"
],
"exclude_paths": [
"/only/for/partnerapp/*"
]
}
},
{
"manifest": "https://partnerapp.com/manifest.json",
"handle_urls": {
"paths": [
"/*"
]
}
}
]
@mgiuca Is the above closer to what you had in mind?
Change summary:
- capture_links_excludes moved inside manifest
- url_handlers renamed to app_links
- app_links specifies origins, not hosts
- capture_links and capture_links_exclusions should be evaluated before app_links
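A minimal sketch of the app/site handshake implied by the example files above: an origin only counts as associated when the manifest lists it in app_links and that origin's association file lists the manifest URL. The "*." subdomain rule and the function names are assumptions, not something the proposal specifies.

```python
def origin_matches(app_link: str, host: str) -> bool:
    # Assumption: a leading "*." matches any subdomain but not the bare domain.
    if app_link.startswith("*."):
        return host.endswith(app_link[1:])
    return host == app_link


def association_is_mutual(manifest_url, manifest, site_host, association_entries):
    """True only if the app claims the site AND the site claims the app."""
    app_claims_site = any(
        origin_matches(link, site_host) for link in manifest.get("app_links", [])
    )
    site_claims_app = any(
        entry.get("manifest") == manifest_url for entry in association_entries
    )
    return app_claims_site and site_claims_app


contoso_manifest = {"app_links": ["contoso.com", "conto.so", "*.contoso.com"]}
partnerapp_manifest = {"app_links": ["contoso.com", "partner.contoso.com"]}

# Both example association files list the same two manifests.
entries = [
    {"manifest": "https://contoso.com/manifest.json"},
    {"manifest": "https://partnerapp.com/manifest.json"},
]

print(association_is_mutual("https://contoso.com/manifest.json",
                            contoso_manifest, "partner.contoso.com", entries))  # True
print(association_is_mutual("https://partnerapp.com/manifest.json",
                            partnerapp_manifest, "conto.so", entries))          # False
```

The second call fails because the partner app never lists conto.so in its app_links, even though the site's file names the partner manifest.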
Hi Lu,
Apologies for the lateness of this reply.
Regarding the above three scenarios:
Scenario B sounds good to me, which is good, because that's the final place where we want to end up regardless of which lands.
I feel like Scenarios A and C don't quite capture the orthogonality of the two APIs:
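- Scenario C (DLC only): "Has a manifest member exclude_paths for exclusion patterns." I haven't got an exclude_paths in my DLC proposal, and I was considering the path exclusion to be more a part of the URL handler proposal ("what is captured") as opposed to the DLC ("how it is treated, once captured").
- Scenario A (URL handling only): "The capture behavior will be similar to existing_client_event except the event handler does not determine whether there is a match, just how to handle a launch. The handler will run in a new app window in the start_url document. No further navigation takes place if no handler is found." I'm not quite sure what this is saying. Does it always open a new app window (hence it's similar to new_client), or do you actually intend to fire the launch event? Even though that event is eventually designed to be part of the DLC proposal, I am hesitant to bundle it up with URL handling, since then that could constrain the design of DLC later on. Again, URL handling should be "what is captured" as opposed to DLC, "how it is treated, once captured". Do you have a specific need to reuse an existing window, and that's why this is being bundled in URL handling? If so, perhaps that's just a good reason to expedite DLC. (Which we are planning to tackle in Q4, for what it's worth.)

The last thing that is concerning is this paragraph: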
"This association file is a recommended format for validation but browsers are also free to use other non-web-standard formats like ones used by Android, iOS, and Windows."
That feels like a recipe for creating sites that work on one OS but not others. In terms of the standard, I would like us to only have the standard format. (There's nothing we can do to prevent implementations from also recognising non-standard formats, but the standard shouldn't explicitly allow it.) In terms of the Chromium implementation, I would like us to only recognise the standard format. (Of course, we can use non-standard formats to capture URLs into native apps, but when we're capturing URLs into web apps, we should force sites to present that in the standard format.)
Scenario C (DLC only): "Has a manifest member exclude_paths for exclusion patterns." I haven't got an exclude_paths in my DLC proposal, and I was considering the path exclusion to be more a part of the URL handler proposal ("what is captured") as opposed to the DLC ("how it is treated, once captured").
I was concerned that DLC wouldn't have a way to selectively exclude URLs from link capturing and that would prevent DLC from being adopted, but I am starting to understand what you mean by orthogonality: it keeps each from limiting the other. I'll modify this to keep the exclusion of in-scope URLs to the URL Handling spec alone. I.e. it'll only be available in scenario A and B.
Scenario A (URL handling only): "The capture behavior will be similar to existing_client_event except the event handler does not determine whether there is a match, just how to handle a launch. The handler will run in a new app window in the start_url document. No further navigation takes place if no handler is found." I'm not quite sure what this is saying. Does it always open a new app window (hence it's similar to new_client), or do you actually intend to fire the launch event? Even though that event is eventually designed to be part of the DLC proposal, I am hesitant to bundle it up with URL handling, since then that could constrain the design of DLC later on. Again, URL handling should be "what is captured" as opposed to DLC, "how it is treated, once captured". Do you have a specific need to reuse an existing window, and that's why this is being bundled in URL handling? If so, perhaps that's just a good reason to expedite DLC. (Which we are planning to tackle in Q4, for what it's worth.)
I think what I meant was URL Handling option 1 below:
| existing_client_event behavior | DLC | URL Handling option 1 | URL Handling option 2 |
|---|---|---|---|
| No existing window open | opens a new window, navigates to given URL, does not fire launch event in new window | opens a new window, loads start_url document, fires launch event | opens a new window, loads start_url document, fires launch event |
| Existing window available | fires launch event in one existing window | opens a new window, loads start_url document, fires launch event | fires launch event in one existing window |
| Event handler not present | Only matters if an existing window is present. Since there is no handler, there will be no observable change to the user. | opens a new window to start_url. | no existing window: opens a new window to start_url. existing window: no observable change. |
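The table can be restated as a small decision function, which makes the difference between the options easier to compare. This is a sketch only: the model labels ("dlc", "url_handling_1", "url_handling_2") and the action strings are hypothetical names chosen for illustration.

```python
def launch_actions(model: str, existing_window: bool, handler_present: bool) -> list:
    """Return the user-visible steps for a captured link, per the table."""
    if model == "dlc":
        if not existing_window:
            # New window goes straight to the URL; no launch event fires there.
            return ["open new window", "navigate to captured URL"]
        return ["fire launch event in existing window"] if handler_present else []
    if model == "url_handling_1":
        # Option 1 never reuses a window.
        actions = ["open new window", "load start_url"]
        if handler_present:
            actions.append("fire launch event in new window")
        return actions
    if model == "url_handling_2":
        if existing_window:
            return ["fire launch event in existing window"] if handler_present else []
        actions = ["open new window", "load start_url"]
        if handler_present:
            actions.append("fire launch event in new window")
        return actions
    raise ValueError(f"unknown model: {model}")


# Missing handler with an app window already open:
print(launch_actions("url_handling_1", True, False))  # ['open new window', 'load start_url']
print(launch_actions("url_handling_2", True, False))  # []
```

The empty list for option 2 is the "no observable change" case, the one where an app bug would be invisible to the user.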
For URL handling, if the URL is out of app scope, there is no good default behavior. The event handler needs to run regardless of whether there was an existing window in order to load content relevant to that URL. This is the difference I was trying to express.
We have no requirement to reuse an existing window. Additionally, I would rather that a new app window opens even if there is an app bug where URL handling is enabled but not handled. It's more obvious that there is an app bug in this case than if the user clicks on a link, expects URL handling behavior, then sees no observable change. This is why I prefer URL Handling option 1 to option 2.
I really don't want URL handling to touch "how it is treated, once captured" either, but at the same time it needs a defined behavior in a universe without DLC/sw-launch. Maybe that is just an implementation concern, not a specification concern. Is it possible to not spec the behavior for URL handling, or spec as little as possible?
"This association file is a recommended format for validation but browsers are also free to use other non-web-standard formats like ones used by Android, iOS, and Windows."
That feels like a recipe for creating sites that work on one OS but not others. In terms of the standard, I would like us to only have the standard format. (There's nothing we can do to prevent implementations from also recognising non-standard formats, but the standard shouldn't explicitly allow it.) In terms of the Chromium implementation, I would like us to only recognise the standard format. (Of course, we can use non-standard formats to capture URLs into native apps, but when we're capturing URLs into web apps, we should force sites to present that in the standard format.)
Makes sense to me. I like that.
Closed by mistake.
Starting an issue to discuss integrating with the Declarative Link Capturing proposal. I will add more thoughts and references below.