w3c / manifest

Manifest for web apps
https://www.w3.org/TR/appmanifest/

Add a unique identifier for a PWA #586

Closed adewale closed 2 years ago

adewale commented 7 years ago

If one is building a PWA directory, app store, or search engine that detects PWAs, one needs a way to uniquely identify a PWA from just the manifest.

Currently the spec doesn't explicitly say what that identifier or tuple of identifiers should be, which leads to issues like: https://github.com/GoogleChrome/gulliver/issues/323

marcoscaceres commented 7 years ago

@adewale, thanks! we will see how the gulliver project handles that and if a solution emerges.

mgiuca commented 6 years ago

The spec does specify a tuple that uniquely identifies the app, but unfortunately it's got a huge problem (I thought there was a bug on it but I can't find one, so I just filed #668).

The steps for processing a manifest are given by the following algorithm. The algorithm takes a string text as an argument, which represents a manifest, and a URL manifest URL, which represents the location of the manifest, and a URL document URL.

This means that the entire identity of the app is uniquely determined by the tuple (text, manifest URL, document URL). Though practically, since text can be derived from manifest URL, it means just the pair (manifest URL, document URL).

I think the fact that it is a function of document URL is a problem, as outlined in #668. If we fix that, then the manifest URL becomes the sole unique identifier of an app.

(Note: The Service Worker URL does not need to be included as part of the identifier. The SW is an implementation detail of the app --- a detail that we require, but we do not need to know where it lives, what its scope is, etc.)

ithinkihaveacat commented 6 years ago

https://pwa-directory.appspot.com/ has a collection of 1366 manifests; of these around 71 (5%) look "versioned":

$ curl -sSL 'https://pwa-directory.appspot.com/api/pwa/?limit=4000' | jq -r '.[] | .manifestUrl' | sort | perl -ne 'print if /[0-9a-fA-F]{7}/ || /v[0-9]+/ || /v=/'
https://ademola.adegbuyi.me/_nuxt/manifest.1c4bdc21.json
https://app.mangahigh.com/fea_201803191329/misc/mobile-manifest.json
https://assets.production.spokeo.com/assets/v9/manifest-25a702bcac88b536992cff4cc78d9e75d7d40dc36f746ed69604a2c40d0aba5d.json
https://beta.mic.com/manifest.json?b=1478894131181397
https://betcruncher.com/manifest.3183cc2d8ff6fa85748fc8c6a4f796cd2a95d2e9.json
https://big-andy.co.uk/content/themes/v5/manifest.json
https://blackjack.io/manifest.9f463e8a23e16b31f7219dce967e1df6.json
https://blendle.com/manifest-5a96b3b4ec.json
https://boardom.io/manifest.json?v=3
https://bookourplane.com/manifest.json?v=LbbRAnjJQL
https://browsersync.io/manifest.json?v=qAqkxQaJm0
https://cdn.bloodhorse.com/current/favicons/manifest.json?v=KmbG9gpjz7
https://cdn.getyourguide.com/static/c6754d394589/customer/desktop/static/manifest.json
https://cdn.lyft.com/webclient/icons-463e5ce/manifest.json
https://cdn.shopify.com/s/files/1/0014/1962/t/21/assets/manifest.json?17982843544509738478
https://choualbox.com/manifest.json?v=1282
https://clay.io/manifest.json?data=eyJpY29ucyI6W3sic3JjIjoiaHR0cHM6Ly9jZG4ud3RmL2QvaW1hZ2VzL3N1cGVybm92YS9pY29uLnBuZyIsInNpemVzIjoiMjU2eDI1NiIsInR5cGUiOiJpbWFnZS9wbmcifV0sInNob3J0X25hbWUiOiJDbGF5IEdhbWVzIiwibmFtZSI6IkNsYXkgR2FtZXMiLCJzdGFydF91cmwiOiIuLz91dG1fc291cmNlPXdlYl9hcHBfbWFuaWZlc3QiLCJiYWNrZ3JvdW5kX2NvbG9yIjoiI2ZhZmFmYSIsInRoZW1lX2NvbG9yIjoiI2ZmOGEwMCIsImRpc3BsYXkiOiJzdGFuZGFsb25lIn0=
https://cs1.wettercomassets.com/wcomv5/images/icons/favicon/manifest.json?201708031719
https://d1c42d2bmccy49.cloudfront.net/manifest.json
https://dev-quests.appspot.com/static/manifest.b9d743cdb670650edbb180662a9443e56add2d7fcbc9e7c5d7f73c7bfd20ded5.json
https://developer.chrome.com/devsummit/static/manifest.32a1e88bd98d232c73fbf2f2c5ff552b4c9782f991d6114a3ffa17c5f9390528.json
https://devpractic.es/notifmanifest.php?v=635
https://direct.asda.com/on/demandware.static/-/Sites-ASDA-Library/default/dwb2a11ac9/Manifest/manifest.json
https://ephemeral.now.sh/manifest.91ccc2dacd83c8815c8286043c23a9ae.json
https://erwinandres.github.io/tudu/manifest.json?v=2
https://facerepo.com/app/images/favicons/manifest.json?v=a701bd98
https://feeddeck.glitch.me/manifest.json
https://flat.io/manifest.json?v1
https://grocery.walmart.com/js/icons-4b00caed44fcb95f57dd4efc82d1a2c2/manifest.json
https://hn.nuxtjs.org/_nuxt/manifest.d7491a08.json
https://hpbn.co/7a58c37113db4464699ec4f4646b5566.json
https://jimdo-dolphin-static-assets-prod.freetls.fastly.net/cms/static/manifest.c4bb9662.json
https://kuranz.com/manifest.0252de652255e03775ee2f57d96ec003.json
https://m-travel.jumia.com/manifest.9cd19691.json
https://m.apkpure.com/manifest_v10.json
https://m.avito.ru/s/mobile/web-app-manifest.json?5e1ff91
https://m.badoo.com/badoo/manifest-en.json?v101
https://m.gala.de/r1519124462501/manifest.json
https://magento-imagine-2018.firebaseapp.com/_nuxt/manifest.555d3617.json
https://magnetis.com.br/assets/magnetis_app/manifest-638829635f8669ddb668e944e37aee4241964bd32d7f2dd857d1d7c8e16e8bfd.json
https://memoui.com/static/20180412001537/manifest.json
https://motog3.com/wp-content/plugins/onesignal-free-web-push-notifications/sdk_files/manifest.json.php?gcm_sender_id=995691934152
https://preact-pwa-yfxiijbzit.now.sh/manifest-a57e627c89.json
https://prpl-dot-captain-codeman.appspot.com/20170806/es6-unbundled/manifest.json
https://quillie.net/manifest.38100eca.webmanifest
https://reittiopas.foli.fi/icons-turku-6aa88e8a010a06d1d30d24205371f8d3//manifest.json
https://rofr.in/manifest.json?v6=bOO8oaa856
https://schsrch.xyz/resources/0350094f9232803bcc0fd86c3cbd31f1.json
https://sp-web.search.auone.jp/manifest_v2.json
https://ssl.tzoo-img.com/res/favicon/manifest.json?v=2kq2msw2
https://static1-ssl.dmcdn.net/images/neon/favicons/manifest.json.vb58fcfa7628c92052
https://static3.1tv.ru/assets/web/favicon/manifest-1d3e08042839f3a7499da28ea190f0d5.json
https://theomg.github.io/Lifelike/manifest.ee9a11377982a365a8aeae5b9095fe11.json
https://townwork.net/js/manifest?v=20160302001
https://travel.jumia.com/manifest.9fa818c2.json
https://unacademy.com/dist/manifest.json?1487235791853
https://unacademy.com/dist/manifest.json?1505821603929
https://weather.com/weather/assets/manifest.507fcb498f4e29acfeed7596fe002857.json
https://webamp.org/manifest.60fc98cc18ea0b3ab073cda74610efa1.json
https://www.amarujala.com/manifest.json?v=85b484467f
https://www.boldsky.com/browser.json?v=1.0.1
https://www.buzzfeed.com/static-assets/data/manifest.0edfa72a42a9e70e5bf211f64eae9384.json
https://www.colorblindsim.com/manifest.8b7a3d31.webmanifest
https://www.cookscountry.com/_search_assets/cco-manifest-707681872ff6b432492f3fe509aaae89.json
https://www.elo7.com.br/v3/manifest/webapp.json
https://www.freecharge.in/mobile/manifest.json?v=1
https://www.ft.com/assets/manifest/manifest-v6.json
https://www.gp.se/polopoly_fs/3.200.1523348202!/sites/se.gp/images/manifest.json
https://www.iheart.com/manifest.6a2f10c7f194b2a76747f18937e42951.json?rev=7.44.0
https://www.imperialcarsupermarkets.co.uk/manifest.json?v=gAEgYPxJpw
https://www.istitlaa.me/_nuxt/manifest.57352a3d.json
https://www.johnlewis.com/assets/fc539d9/favicons/manifest.json
https://www.koolsol.com/manifest-20170311-01.json
https://www.koolsol.com/manifest-20170526-01.json.php
https://www.liverpoolecho.co.uk/manifest.json?v=548e74556b39b6b25a2b7a4828f7783e
https://www.nouvelobs.com/manifest.json?1510150956
https://www.onthemarket.com/assets/52bbb4af/gzip/js/manifest.json
https://www.onthemarket.com/assets/80f6edfa/gzip/js/manifest.json
https://www.openrent.co.uk/manifest.json?v=9BaGKJ78xe
https://www.otto.de/static/all/img/global-resources/fc44d9d421d3577b/favicons/manifest.json
https://www.otto.nl/3ce8d08884c912ec9b98774bab49a8eff3604010/assets/ottonl/resources/manifest.json
https://www.padpiper.com/manifest.ab5a95547c7ae8833813533907eb0631.json
https://www.pigiame.co.ke/assets/pi-site/favicon/site-ad611bc177.webmanifest
https://www.pitchup.com/manifest.json?v=4
https://www.pricehipster.com/manifest.json?v=1
https://www.reittiopas.fi/icons-hsl-18da13427c6e362f148f4a5b783ee98c//manifest.json
https://www.selcobw.com/skin/frontend/selco/default/assets/manifest.json?6335544
https://www.sho-yamane.me/_nuxt/manifest.7e00d6b4.json
https://www.stylewe.com/manifest.json?v=9255619
https://www.thekitchn.com/assets/tk/favicons/manifest-8afd9804080ba4ee9351cb5adc20383f47f40fe276d62bd25467bdadf5d5c0d6.json
https://www.viz.com/favicon/manifest.json?v=oLLRlE8ljO
https://www.walmart.ca/assets/9d1a7c78e21cc1c3c71ae9f8a8918b0d-home-screen-manifest-en.json
https://www.yiv.com/manifest.json?2017022101
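The perl filter in the command above can be restated in JavaScript. This is a minimal sketch that reuses the same three patterns; the helper name `looksVersioned` and the `example.com` URL are illustrative assumptions, not part of any tool mentioned in this thread.

```javascript
// Same three heuristics as the perl filter: a run of 7 or more hex characters,
// "v" followed by digits, or a "v=" query parameter.
const VERSION_PATTERNS = [/[0-9a-fA-F]{7}/, /v[0-9]+/, /v=/];

// Hypothetical helper: true if the manifest URL matches any of the patterns.
function looksVersioned(manifestUrl) {
  return VERSION_PATTERNS.some((pattern) => pattern.test(manifestUrl));
}

const sample = [
  'https://blendle.com/manifest-5a96b3b4ec.json',        // content hash in the file name
  'https://boardom.io/manifest.json?v=3',                // "v=" query parameter
  'https://www.ft.com/assets/manifest/manifest-v6.json', // "v" followed by digits
  'https://example.com/manifest.json',                   // illustrative unversioned URL
];
const versioned = sample.filter(looksVersioned);
```

The heuristic over-matches in places: `https://feeddeck.glitch.me/manifest.json` is caught only because "feeddec" happens to be seven hex characters, so the 5% figure is best read as approximate.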


So if Chrome and others switch to using manifest URL to uniquely identify PWAs (and this data is representative of PWAs in general), then around 5% of sites will generate a new A2HS prompt when the manifest URL changes (perhaps only when the content of the manifest changes, but potentially on every deployment).

(Is Chrome using manifest URL right now? I tried changing the manifest URL on a test site and didn't get the A2HS prompt. So I suspect Chrome is currently applying a different heuristic to identify new/updated PWAs.)

mgiuca commented 6 years ago

Interesting that so many of them are versioned. I wonder where this advice comes from? Could it be that "best practice" with service workers is to version all assets, and the manifest is just being versioned along with that?

So if Chrome and others switch to using manifest URL to uniquely identify PWAs ... then around 5% of sites will generate a new A2HS prompt when the manifest URL changes

I think there's still some confusion here. Chrome already uses the manifest URL to uniquely identify an app. If the manifest URL changes, it's a different app. There are no changes to Chrome that need to be made along these lines (this bug is to document this in the spec, which I think is reasonable).

Is Chrome using manifest URL right now? I tried changing the manifest URL on a test site and didn't get the A2HS prompt. So I suspect Chrome is currently applying a different heuristic to identify new/updated PWAs.

I think it is. If you change the manifest URL you should get a new app. Theories for why you aren't:

marcoscaceres commented 6 years ago

@mgiuca wrote:

Interesting that so many of them are versioned. I wonder where this advice comes from? Could it be that "best practice" with service workers is to version all assets, and the manifest is just being versioned along with that?

Yeah, @jakearchibald and friends were promoting this a while back as part of SW development (that's not to point fingers - caching is hard, and that approach works well). However, it's still a hack... and like all hacks, it has pros/cons. Additionally, the file hashing may be baked into some developer/command line tools, like webpack - but Jake probably knows more.

jakearchibald commented 6 years ago

Pretty sure I never recommended versioning manifests specifically. But it's good practice to version assets and treat their URLs as immutable generally. This isn't anything to do with service worker, it's just good caching practice https://jakearchibald.com/2016/caching-best-practices/.

If manifest is an exception to the rule, we need to do some dev rel'ing so folks understand why. The service worker script url is one of these exceptions, and I documented it here https://developers.google.com/web/fundamentals/primers/service-workers/lifecycle#avoid_changing_the_url_of_your_service_worker_script.

mgiuca commented 6 years ago

@jakearchibald Yeah the manifest URL should have the same policy applied as the SW URL --- it's probably more important since you can migrate to a new SW URL (just takes some fiddling) but it's not generally possible to move to a new manifest URL (without segmenting your installed base).

(Aside: I think we should support updating manifest URL using HTTP 301 Moved Permanently; not sure if this needs to be specced or if we can just implement this.)

alancutter commented 5 years ago

This issue has come up again in the context of updating installed PWA manifest data.

I don't think making an app identified by its manifest is reasonable long term. Sites should be able to re-architect their directory structure/web framework necessitating a change in manifest URL during the lifetime of a user install.

I think we should add an optional "id" field to the manifest that defaults to the manifest URL but can be overridden with whatever the site likes. This ID will be scoped to the start_url's origin and cannot collide with IDs from other origins. This would enable sites to update any aspect of their manifest except their origin and the id.

adewale commented 5 years ago

Please give some examples of these proposed IDs.

On Fri, 12 Jul 2019 at 05:04, alancutter notifications@github.com wrote:

This issue has come up again in the context of updating installed PWA manifest data.

  • When a site has changed its name/scope/theme_color/start_url/manifest URL how do we know we're looking at the same app and not a different one that shares the same scope?
  • When a PWA installation is synced across devices how do we know the sync has been satisfied when sites may have arbitrary device specific differences in their metadata?

I don't think making an app identified by its manifest is reasonable long term. Sites should be able to re-architect their directory structure/web framework necessitating a change in manifest URL during the lifetime of a user install.

I think we should add an optional "id" field to the manifest that defaults to the manifest URL but can be overridden with whatever the site likes. This ID will be scoped to the start_url's origin and cannot collide with IDs from other origins. This would enable sites to update any aspect of their manifest except their origin and the id.


alancutter commented 5 years ago

The id field will default to the manifest URL e.g. "https://app.com/manifest.webmanifest" but can be any string e.g. "jdklklfpinionkgpmghaghehojplfjio". The actual app ID will be a tuple of (start_url origin, manifest id) e.g. ("https://app.com/", "jdklklfpinionkgpmghaghehojplfjio").
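A minimal sketch of how that tuple could be computed, assuming the manifest always carries a `start_url` and that a missing `id` falls back to the manifest URL as described. The function name `appIdentity` and the second manifest URL are hypothetical; none of this is specified behaviour.

```javascript
// Hypothetical sketch of the proposed identity: (start_url origin, manifest id).
// Assumes manifest.start_url is present; "id" falls back to the manifest URL.
function appIdentity(manifest, manifestUrl) {
  // Resolve start_url against the manifest URL, then keep only its origin.
  // Note that URL.origin has no trailing slash ("https://app.com").
  const origin = new URL(manifest.start_url, manifestUrl).origin;
  const id = manifest.id ?? manifestUrl;
  return [origin, id];
}

// With an explicit id, moving the manifest does not change the identity.
const before = appIdentity(
  { start_url: '/', id: 'jdklklfpinionkgpmghaghehojplfjio' },
  'https://app.com/manifest.webmanifest'
);
const after = appIdentity(
  { start_url: '/', id: 'jdklklfpinionkgpmghaghehojplfjio' },
  'https://app.com/static/v2/manifest.webmanifest'
);

// Without an id, the identity is tied to wherever the manifest lives.
const fallback = appIdentity({ start_url: '/' }, 'https://app.com/manifest.webmanifest');
```

Here `before` and `after` hold the same pair, while `fallback` carries the manifest URL as its second element.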

mgiuca commented 5 years ago

Note that the ID itself will be a totally meaningless (to the web platform) string; it's just an opaque token that uniquely identifies the app within the origin's namespace (so there are no naming conflicts between origins, but you must be careful to uniquely identify your app within your own origin).

We would probably recommend that the ID be a URL relative to the origin, since that would guarantee uniqueness, but we wouldn't derive any meaning from it.

The default of it being the manifest URL would be to preserve the historical fact that the manifest has uniquely identified the app.

marcoscaceres commented 4 years ago

I like @mgiuca's idea (https://github.com/w3c/manifest/issues/586#issuecomment-510761731) of the id just being a meaningless URL resolved against the manifest URL.

mgiuca commented 4 years ago

the id just being a meaningless URL resolved against the manifest URL

That's not quite what I was suggesting. I was saying it's a meaningless string (doesn't have to be a URL at all). It's an arbitrary character string, that isn't resolved against the manifest URL; it forms part of a unique key, in a pair with the origin (so that two origins with the same id won't collide).

marcoscaceres commented 4 years ago

ah, sorry, I misread. I still like the idea :)

benfrancis commented 4 years ago

I remember this topic being debated at some length in the sysapps working group in about 2013. My personal opinion has always been that the manifest URI alone should be treated as the identifier of a web application and a different manifest URI should be assumed to be a different application.

Some of the reasons being:

  1. It provides a simple URI as an identifier to use as an index in a database of apps, which also happens to resolve to the metadata describing the app
  2. The manifest URI can be resolved periodically by the user agent to check for updates
  3. No ambiguity over whether two applications within the same origin/sharing the same start_url/sharing the same scope/claiming the same internal ID are the same app or different apps

In implementing the manifest specification recently I found it a real pain trying to use some kind of combination of the origin/start URL/manifest URL/content hash as an identifier and in the end gave up and just used the manifest URL anyway.

I understand why people might want to version the manifest URL and caching is indeed hard, but I would argue there are other solutions to that problem. Cool URIs don't change.

mgiuca commented 4 years ago

I appreciate the sentiment that Cool URIs don't change (especially since that page seems to have existed for 22 years at the same URL). But the reality is, developers do want to change their URLs, including the manifest, not just for versioning but to keep their site organised.

The problem as I see it is that we've never specified what makes a unique identifier for an app. So implementations can use the manifest URL, but that's essentially creating a de facto standard that developers have to divine based on the (conflicting) implementations. This isn't just some user-agent-specific logic, it actually affects how developers are allowed to run their sites (i.e., am I allowed to change my manifest URL? The spec doesn't say, I just have to try it and see if it breaks browsers.) So whatever the answer is, it should be specified and consistent across browsers.

I do like the idea of manifest URL being the key, for the reasons you said 1 and 2 (you can just point a store listing or admin install config at a manifest URL and it tells you everything you need to index and install the app).

But it has the significant drawback that developers can never change their manifest URL once the site is launched. We can possibly solve around that by adding an explicit ability to migrate users to a new manifest URL (which could be as simple as stating that a HTTP 301 redirect on the manifest URL says to update to the new location). But it would be simpler if we didn't tie the key to the manifest URL in the first place.

3: No ambiguity over whether two applications within the same origin/sharing the same start_url/sharing the same scope/claiming the same internal ID are the same app or different apps

That is true of any standardized solution. The ambiguity comes from the current reality of it not being specified.

alancutter commented 4 years ago

I think we should expect to need to add an ID migration mechanism anyway to cover changing origins. Being able to ping the manifest directly is extremely attractive and perhaps having to perform a migration to change your manifest URL is worth it.

mgiuca commented 4 years ago

Yeah, that's true. Did we have any other reasons (@alancutter) to propose the explicit id scheme, besides being able to migrate your manifest?

I suppose we should consider two separate use cases here:

  1. Once-in-a-while developer wanting to migrate their manifest URL.
  2. Manifest URL is versioned so it changes every time the manifest changes.

Doing an explicit migration is suitable for 1. But I don't think you'd want to do this for 2, otherwise you'd have to make your old manifests 301 to the new one every time. So this would probably preclude being able to version your manifests. Which as @jakearchibald said in 2018, is actually best practice (or would be, if it worked; at present it's best practice for everything but the manifest because of this problem).
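To make the cost of case 2 concrete, here is a rough sketch of redirect-based migration. It models HTTP 301 responses as a plain lookup table rather than doing real network requests; the `app.example` URLs, the table, and the function name `resolveManifestUrl` are all hypothetical, and no browser is known to implement this.

```javascript
// Hypothetical stand-in for the network: old manifest URL -> Location of its 301 response.
const permanentRedirects = new Map([
  ['https://app.example/manifest.v1.json', 'https://app.example/manifest.v2.json'],
  ['https://app.example/manifest.v2.json', 'https://app.example/manifest.v3.json'],
]);

// Follow permanent redirects from the manifest URL recorded at install time
// to the current location. maxHops guards against redirect loops.
function resolveManifestUrl(installedUrl, redirects, maxHops = 10) {
  let url = installedUrl;
  let hops = 0;
  while (redirects.has(url) && hops < maxHops) {
    url = redirects.get(url);
    hops += 1;
  }
  return { url, hops };
}

// A user who installed at v1 needs two hops to reach the current manifest.
const migrated = resolveManifestUrl('https://app.example/manifest.v1.json', permanentRedirects);
```

Each deployment of a versioned manifest adds another entry that has to be kept alive for as long as any user is still installed at an older URL.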

benfrancis commented 4 years ago

@mgiuca wrote:

whatever the answer is, it should be specified and consistent across browsers.

I agree.

See also: https://github.com/w3c/manifest/issues/446 and https://github.com/w3c/manifest/issues/384.

The "Updating the manifest" section of the specification has been empty since 2016 when the same-origin constraint was dropped for manifest URLs and default scope was defined as "unbounded" (later changed) which made things more complicated.

Whatever solution is eventually specified for updates will obviously be influenced by what is used as the unique identifier for an app. Having a relatively stable manifest URI that can be fetched periodically seems like the obvious solution to me. When apps can have overlapping navigation scopes and start URLs can change, something needs to be stable.

Migrating manifest URIs via redirects could work for occasional changes to app structure, but as you suggest it could get unwieldy if the developer tries to change the manifest URL every time the manifest's content is updated for caching purposes. In practice it might be simpler for a developer to just treat a significantly restructured version of the app as a new app, and use other strategies for caching/versioning.

wanderview commented 4 years ago

Note, I expect we will need an identifier mechanism for service workers as well for similar use cases; e.g. migrating from one scope to another. It would be difficult for sites to manage the teardown of one service worker and migration to another without something like this.

Do you plan to make your proposal work for service workers as well?

The strawman I had been thinking of was something like:

navigator.serviceWorker.register('sw.js', {
  scope: '/some/scope',
  token: 'my-origin-unique-token',
});

So if you call register again with the same token, but a different scope we would migrate the current service worker registration to the new scope.
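The migration semantics of that strawman can be modelled in a few lines. This is only an illustrative in-memory model: the `token` option is not part of `navigator.serviceWorker.register()`, and the class `TokenRegistry` is a made-up name.

```javascript
// Hypothetical model of the strawman: registrations keyed by an origin-unique token.
class TokenRegistry {
  constructor() {
    this.byToken = new Map();
  }

  register(scriptUrl, { scope, token }) {
    const existing = this.byToken.get(token);
    if (existing) {
      // Same token: move the existing registration to the new scope
      // instead of creating a second, independent registration.
      existing.scriptUrl = scriptUrl;
      existing.scope = scope;
      return existing;
    }
    const registration = { scriptUrl, scope, token };
    this.byToken.set(token, registration);
    return registration;
  }
}

const registry = new TokenRegistry();
const first = registry.register('sw.js', { scope: '/some/scope', token: 'my-origin-unique-token' });
const second = registry.register('sw.js', { scope: '/new/scope', token: 'my-origin-unique-token' });
```

In this model `first` and `second` are the same registration object, now scoped to `/new/scope`, and the registry still holds a single entry.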

Edit: Sorry if this was already discussed in this thread. I've only been lightly following until recently.

benfrancis commented 4 years ago

The strawman I had been thinking of was something like:

navigator.serviceWorker.register('sw.js', {
  scope: '/some/scope',
  token: 'my-origin-unique-token',
});

A couple of thoughts:

  1. I would find it odd if a web application was identified by anything other than a Uniform Resource Identifier. Apart from being what makes the web the web, URIs make great origin-unique tokens! I would hope there's no need to invent another type of ID namespace like Play Store/App Store style app IDs.
  2. What's the latest thinking on the mapping between a service worker and an app?

It would be really neat if there was a 1:1 mapping between the two and app scope == service worker scope, then you could use the manifest URI as the unique identifier for both and update both navigation scope and service worker scope together in a single update. (There used to be a similar kind of mapping in the manifest, but the other way around.)

But my understanding is that isn't the case and it's currently possible to have multiple service workers per app or multiple apps per service worker, or have one and not the other. If the two technologies are entirely de-coupled then maybe they have to have their own mechanisms for identifying an installed application vs. an installed service worker.

mgiuca commented 4 years ago

I'd rather not tie any of this to service workers; it increases the complexity by an order of magnitude. (That's why we ended up removing service worker from the manifest; they are unrelated, and that's by design, as with all the pieces of the web platform, they are separate and composable.)

The way I view it, the service worker is an implementation detail of the application (something the user can't see or interact with at all), while the manifest is the user-facing concept of an application. While generally websites will want to have them at the same scope, you can for instance have a top-level SW scope but with lots of smaller-scoped app manifests. I remember discussions early on in the desktop PWA project on Chrome to have links open in the app if they were within the service worker scope, which I shot down because I don't think service worker scope should have any bearing on the way the user experiences the app. (In much the same way as the user shouldn't care about whether there's a proxy server in between the client and the real backend.)

Under that philosophy, I don't see there being a particular need for a SW-to-SW migration. If you just tear down the old SW and spin up a new one, you can just re-cache everything (or find some mechanism to transfer the cache so it doesn't have to be redownloaded). That's a very different problem to manifest migration, which can't be done in user-space because it involves changing the URL that installed OS-level "apps" are pointing to, and potentially informing the user that the application is changing.

I would find it odd if a web application was identified by anything other than a Uniform Resource Identifier. Apart from being what makes the web the web, URIs make great origin-unique tokens! I would hope there's no need to invent another type of ID namespace like Play Store/App Store style app IDs.

True. I like identifying things with URLs*. I would be OK with saying the "id" is a URL. But when we thought about it, the URL never actually gets resolved, so it would effectively just be an opaque string. If we did want to use URL syntax to express the ID, we shouldn't use the "https" scheme (since that implies it's an actual resolvable resource). We'd have to come up with our own scheme, like "webapp://example.com/user-specified-id". But in the id field, the origin would be implicit (since we can't let you specify the origin of your app's ID, it has to be the origin of your start_url/scope). So we may as well just make the id an opaque string, which if you like, can be formed into a URL like the above, but in practice, you'd never see the URL, and it would be easier to just state that "the unique identity of an application is the pair of (app origin, user-specified-id)", rather than inventing a whole new URL scheme.

*By the way, I'm avoiding use of the term "URI" simply because the URL Standard says that the term "URI" is deprecated in favour of "URL". I personally think it's useful having a distinction, but I fought this years ago, and gave up.

benfrancis commented 4 years ago

We'd have to come up with our own scheme, like "webapp://example.com/user-specified-id".

This is similar to what we did with Firefox OS, where we created URLs like webapp://1dd47458-abac-4637-b7e6-12c6e0ef9846. With hindsight I think creating a protocol scheme and namespace separate from the web was the biggest single technical mistake we made on the project, because over time it allowed applications to evolve into something which was missing many of the key benefits of the web, especially linkability. This is one of the key principles of "Progressive Web Apps".

*By the way, I'm avoiding use of the term "URI" simply because the URL Standard says that the term "URI" is deprecated in favour of "URL". I personally think it's useful having a distinction, but I fought this years ago, and gave up.

Yeah I was just using the terms to distinguish between a URI which identifies an app and a URL which can be resolved to locate its metadata, but what I'm advocating for is a manifest URL which serves both functions.

My understanding of the proposed use cases for an ID in this thread are:

  1. Uniquely identify a web app in a directory or app store
  2. Uniquely identify a web app when the metadata in its manifest is updated
  3. Sync installed web apps across devices, even if the server serves slightly different manifest metadata to different user agents

With the additional requirements:

  1. An ID which is guaranteed to be unique within its origin
  2. Ideally have backwards compatibility with user agents which have used manifest URL as an ID in the past
  3. Allow developers to change the URL of a web app manifest when needed and migrate to that new URL
  4. Enable caching a manifest and invalidating the cache of a manifest

Using the manifest URL to identify the app seems to me to fulfill all of these requirements. It can be used as a globally unique identifier (which is therefore also unique within its origin), can identify an app even when the contents of the manifest are updated or differ between user agents, allows cache control using cache headers and can be migrated to a new URL if necessary using HTTP redirects.

It also has the benefit that it doesn't require inventing a new URL scheme for a new non-web namespace for installed web applications and doesn't require an algorithm to derive the identity of an app from multiple inputs. And finally, it has the benefit of providing a potential simple update mechanism which the manifest specification still doesn't have a solution for.

alancutter commented 4 years ago

So if you call register again with the same token, but a different scope we would migrate the current service worker registration to the new scope.

What state is persistent that needs migrating for service workers? I was under the impression caches were origin scoped. I'm not super familiar with service workers so genuinely asking.

wanderview commented 4 years ago

What state is persistent that needs migrating for service workers? I was under the impression caches were origin scoped. I'm not super familiar with service workers so genuinely asking.

The version of the service worker that is active is part of the state of the app. The service worker lifecycle is designed to support keeping other storages in sync with your script state. But if you have separately identified service workers you lose the ability to keep them in sync. You now have to deal with possibly two service workers being in flight in various states at once. It's possible to deal with, but complex.

wanderview commented 4 years ago

Anyway, I'll file a separate issue for the service worker issue.

marcoscaceres commented 4 years ago

We are going to defer on the ID for now. We will pick this up again after CR.

dmurph commented 3 years ago

I see the following issues with using manifest url:

  1. This locks developers into a CDN / the manifest url host. I can see a world where someone scraping manifests would see a ton of duplicate webapps here because they find links to lots of different manifest urls for the same app (versioned, or on different CDNs). These would all be separate apps.
  2. This prevents versioning of the manifest in the name.
  3. This makes it difficult to serve different manifests based on client hints like language, etc.
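The duplicate problem is easy to reproduce with two URLs from the pwa-directory list earlier in this thread. The sketch below assumes a directory that keys entries on manifest URL alone; `indexByManifestUrl` is a hypothetical helper, not code from any real crawler.

```javascript
// Hypothetical directory index keyed on manifest URL alone.
function indexByManifestUrl(manifestUrls) {
  const index = new Map();
  for (const url of manifestUrls) {
    if (!index.has(url)) {
      index.set(url, { manifestUrl: url, origin: new URL(url).origin });
    }
  }
  return index;
}

// Two manifest URLs from the same site that differ only in the version query string.
const crawled = [
  'https://unacademy.com/dist/manifest.json?1487235791853',
  'https://unacademy.com/dist/manifest.json?1505821603929',
];

const index = indexByManifestUrl(crawled);
const distinctOrigins = new Set([...index.values()].map((entry) => entry.origin));
```

The index ends up with two entries for a single origin, so a directory built this way would list what is presumably one app twice.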

I think not having a developer-facing / obvious ID system also has some problems:

  1. Sites accidentally break webapp updating by changing start_url or manifest_url without knowing that none of their old users will get updated. This segments their users and there is no way to fix it.
  2. Without a standard here, different user agents do different things (chrome uses start_url, Android uses manifest_url, Firefox uses manifest_url?).
  3. It is hard to have two different webapps on the same origin that work across multiple user agents.

We have already seen major sites break their webapp by changing the start_url or manifest_url without knowing that this breaks things.

So in general, I think this is really important to fix. I think manifest_url as the key is "ok", but I'd much rather allow the manifest_url to change for a webapp to avoid the issues up top (do you see any other issues?)

I'm thinking maybe we can use a unique-per-origin ID that a manifest can set. It can be a string of anything, and the default value could match either start_url or manifest url to prevent breaking on one system or another (someone is going to break).

WDYT?

mgiuca commented 3 years ago

Yes that's what we've been thinking of.

unique-per-origin ID that a manifest can set

I think the best way (i.e., most consistent with how other manifest keys like this work) to express this is that the ID is a URL that must be same-origin as scope*. This URL is resolved against the manifest URL, like every other URL in the spec. This URL is never requested, it's just used as an identifier.

Making it a URL like this and requiring same origin gives it a natural uniqueness per origin.

This means if your manifest is on a different origin to your start/scope, you must specify the ID as an absolute URL (same as all the other URLs). And it means if your id is path-relative, moving your manifest would change the ID, so you have to be careful to keep the ID stable when moving the manifest. We would recommend always specifying the ID as path-absolute (e.g., "/my-id").
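The resolution rule described above can be sketched as follows. This is only an illustration of the proposal as discussed in this thread, not spec text: the function names `origin_of` and `resolve_id` and the example URLs are made up, and the origin comparison is simplified (it does not normalise default ports).

```python
from urllib.parse import urljoin, urlsplit


def origin_of(url):
    """Return a simplified (scheme, host, port) origin tuple for a URL."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def resolve_id(raw_id, manifest_url, scope_url):
    """Resolve a manifest id against the manifest URL (hypothetical helper).

    The result is never fetched; it is only used as an identifier.
    It must be same-origin as the scope, but need not be within scope.
    """
    resolved = urljoin(manifest_url, raw_id)
    if origin_of(resolved) != origin_of(scope_url):
        raise ValueError("id must be same-origin as scope: " + resolved)
    return resolved


SCOPE = "https://example.com/"

# A path-relative id changes when the manifest moves.
old_relative = resolve_id("my-id", "https://example.com/static/v1/app.webmanifest", SCOPE)
new_relative = resolve_id("my-id", "https://example.com/static/v2/app.webmanifest", SCOPE)

# A path-absolute id stays stable across the same move.
old_absolute = resolve_id("/my-id", "https://example.com/static/v1/app.webmanifest", SCOPE)
new_absolute = resolve_id("/my-id", "https://example.com/static/v2/app.webmanifest", SCOPE)
```

With a manifest hosted on a different origin (say a CDN), a path-absolute id would resolve against the CDN origin and fail the same-origin check, which is why an absolute URL would be needed in that case.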

Another approach is to make it relative to the origin of the scope, so you can just specify the ID as a string and it will do as expected. That would create less confusion for developers, at a cost of being different to how all the other URLs resolve. That's a trade-off we could make.

* Note: I said "same-origin as scope" here, not "within scope" which is more typical. That's to prevent sites from accidentally being locked in and unable to change their scope. For example, if ID was "/foo" and scope was "/", they would be unable to change scope to "/bar" without breaking the ID. Since there is no technical reason for the ID to be within scope (since we never actually navigate to that URL), it can just be same origin.

could match either start_url or manifest url to prevent breaking on one system or another (someone is going to break).

Ideally, the default would be scope. Scope is the best identifier of an app at the moment; it's the least likely to change over time and it's usually a containing URL for both start and manifest. That would break both desktop and mobile Chrome's current representation, but since we're going to cause breakage, maybe we should just do it globally?

If not scope, I don't have a strong opinion about whether start_url or manifest URL is the best default.

dominickng commented 3 years ago

I suspect manifest URL is less likely to change over time than start_url, should we need to fall back to one of those two.

alancutter commented 3 years ago

If there is no manifest should that be some kind of "null" ID within the origin?

benfrancis commented 3 years ago

If the ID for a web app is a URL inside a manifest and the manifest URL itself can change:

  1. How would updates be handled? (Especially if in future web apps can be installed independently of a document.)
  2. What happens if two web app manifests provide the same ID?

alancutter commented 3 years ago

  1. The user agent can fetch the start URL to retrieve the latest manifest data.
  2. If two web apps provide the same ID (on the same origin) then they are the same web app.

dmurph commented 3 years ago

I do think that there is one big downside for using manifest_url as the ID - this means that a manifest wouldn't be inherently 'packaged' by itself. Like - you couldn't install an app just from a manifest without that manifest url (or the id being specified).

If everyone specifies an ID, this I guess isn't that big of a deal - but it is difficult to fake for the 'webapp'ing that current browsers do. Right now, you can create a fake manifest for a site and just set the start_url to the url that is being shown, and bam, webapp. But if manifest_url becomes the unique ID, and systems are designed around that, then that becomes more complicated.

for the questions above:

  1. I'm thinking updates would happen when the browser encounters a manifest whose ID matches the ID of an existing manifest for that origin. The traditional way browsers encounter the manifest is to see it when a page is loaded - but I could imagine that it could be provided another way.
  2. As Alan said, then they would be the same webapp. I think Chrome's behavior is to apply the last-seen manifest as the updated manifest.

benfrancis commented 3 years ago

@dmurph wrote:

I do think that there is one big downside for using manifest_url as the ID - this means that a manifest wouldn't be inherently 'packaged' by itself. Like - you couldn't install an app just from a manifest without that manifest url (or the id being specified).

I personally think that's a good thing because trust in the origin a manifest was retrieved from is surely an important factor in the implicit permissions a user grants by installing a web application? It also makes the app more linkable and discoverable if the ID actually dereferences to something.

it is difficult to fake for the 'webapp'ing that current browsers do. Right now, you can create a fake manifest for a site and just set the start_url to the url that is being shown, and bam, webapp. But if manifest_url becomes the unique ID, and systems are designed around that, then that becomes more complicated.

For a hack like a fake manifest, which won't be following the specification anyway, could browsers generate a special cased local URL like chrome://apps/myfakeapp.webmanifest ?

they [two manifests providing the same ID] would be the same webapp

That would presumably make https://foo.github.io/repo1/app1.webmanifest and https://foo.github.io/repo2/app2.webmanifest (or https://google.com/calendar/app.webmanifest and https://google.com/mail/app.webmanifest) the same app, if they provided the same ID.

That arguably isn't a huge issue as the origin is ultimately the trust boundary, but it could be a bit of a footgun.

dmurph commented 3 years ago

@dmurph wrote:

I do think that there is one big downside for using manifest_url as the ID - this means that a manifest wouldn't be inherently 'packaged' by itself. Like - you couldn't install an app just from a manifest without that manifest url (or the id being specified).

I personally think that's a good thing because trust in the origin a manifest was retrieved from is surely an important factor in the implicit permissions a user grants by installing a web application? It also makes the app more linkable and discoverable if the ID actually dereferences to something.

I guess I don't see the host of the manifest being the "trusted" origin, I see it being the origin of the start_url / the implied scope. Technically someone right now can host a manifest (B.com) that lists A.com as the start url, and that just works. Arguably maybe not a great thing to be valid, but that's a separate discussion I think.

it is difficult to fake for the 'webapp'ing that current browsers do. Right now, you can create a fake manifest for a site and just set the start_url to the url that is being shown, and bam, webapp. But if manifest_url becomes the unique ID, and systems are designed around that, then that becomes more complicated.

For a hack like a fake manifest, which won't be following the specification anyway, could browsers generate a special cased local URL like chrome://apps/myfakeapp.webmanifest ?

yeah that might work

they [two manifests providing the same ID] would be the same webapp

That would presumably make https://foo.github.io/repo1/app1.webmanifest and https://foo.github.io/repo2/app2.webmanifest (or https://google.com/calendar/app.webmanifest and https://google.com/mail/app.webmanifest) the same app, if they provided the same ID.

That arguably isn't a huge issue as the origin is ultimately the trust boundary, but it could be a bit of a footgun.

Yeah, but fixable by the developers at least. I think that is a much easier problem to avoid than the current situation, where they can unknowingly segment their users w/o obvious problems initially

glennhartmann commented 3 years ago

I'm pretty late to the party, but I have a few thoughts.

  1. Overall, iiuc, the main motivation for having an explicit ID instead of using Manifest URL is to allow the Manifest URL to change (including for versioning). It's been mentioned that the migration path for the app to find a new Manifest URL will be to load the Start URL and find the Manifest URL from the <link> tag. So my question is, what if a developer changes both URLs? For example, if they migrate their whole app to a new domain or new subdirectory? Haven't we effectively just changed the problem to "yes, we can now move the Manifest URL, but only if it's guaranteed that the original Start URL remains up-to-date and accessible forever"? And to emphasize, this means the original Start URL. It's possible that existing apps that have already changed their Start URLs still have straggler users who haven't updated to the new Start URL yet. If they now move their manifest, will the straggler users be able to find the new manifest?

  2. Speaking of moving an app to a new domain, is that handled by this proposal? There's a lot of talk of unique-per-origin IDs. Does that mean moving to a new origin still qualifies as "making a whole new separate app"?

  3. I'm wondering about the migration strategy from the current world. If we say that an empty ID will default to the Manifest URL, then all current apps already have a default ID assigned. How do they migrate from this default ID to their new desired explicit ID without losing all their current users?

  4. It seems to me that there's almost zero benefit of an app specifying a relative ID. Possibly even a negative benefit, since some devs might think specifying an ID is protecting them and allowing them to move their Manifest URL, and they will end up accidentally breaking their app when the ID changes after a manifest move. What's the point of allowing a relative ID? Why not make absolute IDs mandatory?

@dmurph wrote:

I see the following issues with using manifest url:

  1. This locks developers into a CDN / the manifest url host. I can see a world where someone scraping manifests would see a ton of duplicate webapps here because they find links to lots of different manifest urls for the same app (versioned, or on different CDNs). These would all be separate apps
  2. This prevents versioning of the manifest in the name
  3. This makes it difficult to serve different manifests based on client hints like language, etc.

Could you elaborate on (3)? Do you mean that it would be better to have a set of manifests, one per language, for example, and they would all have the same ID (making them all parts of the same app), and then the start_url page would dynamically choose which to point to in the <link> tag based on client hints? Is that much more difficult than serving dynamic manifest contents from the same URL based on client hints?

Also, it seems potentially weird to have multiple manifests all be part of the same app. While there are legitimate use-cases (like translation), it opens up a potentially very confusing situation where completely different things will all be considered the "same" app (they could customize anything in the manifest - start url, name, color schemes, even scope, and we'd still consider it "the same app"). Yes, technically this is already possible by hosting a dynamic manifest, but this makes it a more explicitly "allowed" strategy, and one that could even be done accidentally.

dmurph commented 3 years ago

@glennhartmann wrote:

I'm pretty late to the party, but I have a few thoughts.

  1. Overall, iiuc, the main motivation for having an explicit ID instead of using Manifest URL is to allow the Manifest URL to change (including for versioning). It's been mentioned that the migration path for the app to find a new Manifest URL will be to load the Start URL and find the Manifest URL from the <link> tag. So my question is, what if a developer changes both URLs? For example, if they migrate their whole app to a new domain or new subdirectory? Haven't we effectively just changed the problem to "yes, we can now move the Manifest URL, but only if it's guaranteed that the original Start URL remains up-to-date and accessible forever"? And to emphasize, this means the original Start URL. It's possible that existing apps that have already changed their Start URLs still have straggler users who haven't updated to the new Start URL yet. If they now move their manifest, will the straggler users be able to find the new manifest?

I think the desire is to allow developers to change the start url & the manifest url. They would have to deal with the case that an old start url is still registered for various users forever, so they would have to handle that somehow. I'm not sure what the best route would be here - I'm guessing they have to somehow serve the new manifest on the old start url, OR they can redirect & serve the new manifest on the new start_url, and since the IDs will match up, then it can update.

  2. Speaking of moving an app to a new domain, is that handled by this proposal? There's a lot of talk of unique-per-origin IDs. Does that mean moving to a new origin still qualifies as "making a whole new separate app"?

Cross-origin migration is a non-goal here. But if we make the IDs unique (instead of semi-unique), then that might be easier? I'm not trying to tackle this problem right now, we don't have requests for it, we only have people right now trying to create multiple PWAs in the same domain & struggling, or people updating PWAs & having new start_urls, which accidentally segments their userbase.

  3. I'm wondering about the migration strategy from the current world. If we say that an empty ID will default to the Manifest URL, then all current apps already have a default ID assigned. How do they migrate from this default ID to their new desired explicit ID without losing all their current users?

They would have to use their old manifest url as their ID, forever. Or whatever we have as default. I guess we could have some custom spec language here around "if you didn't have an ID and you set one, then that is the ID, as long as the manifest url matches"?, but that might be complicated. Open to thinking about that though.

  4. It seems to me that there's almost zero benefit of an app specifying a relative ID. Possibly even a negative benefit, since some devs might think specifying an ID is protecting them and allowing them to move their Manifest URL, and they will end up accidentally breaking their app when the ID changes after a manifest move. What's the point of allowing a relative ID? Why not make absolute IDs mandatory?

Sure, I don't mind them being absolute / globally unique. It seems weird though as basically prepending the origin to the id would basically make it unique, so we could just do that for them, and say it only has to be unique for the origin. Question - how would the ID change after the manifest move? The ID must stay the same to move the manifest & not break people. Not sure why absolute is necessary here.

@dmurph wrote:

I see the following issues with using manifest url:

  1. This locks developers into a CDN / the manifest url host. I can see a world where someone scraping manifests would see a ton of duplicate webapps here because they find links to lots of different manifest urls for the same app (versioned, or on different CDNs). These would all be separate apps
  2. This prevents versioning of the manifest in the name
  3. This makes it difficult to serve different manifests based on client hints like language, etc.

Could you elaborate on (3)? Do you mean that it would be better to have a set of manifests, one per language, for example, and they would all have the same ID (making them all parts of the same app), and then the start_url page would dynamically choose which to point to in the <link> tag based on client hints? Is that much more difficult than serving dynamic manifest contents from the same URL based on client hints?

Also, it seems potentially weird to have multiple manifests all be part of the same app. While there are legitimate use-cases (like translation), it opens up a potentially very confusing situation where completely different things will all be considered the "same" app (they could customize anything in the manifest - start url, name, color schemes, even scope, and we'd still consider it "the same app"). Yes, technically this is already possible by hosting a dynamic manifest, but this makes it a more explicitly "allowed" strategy, and one that could even be done accidentally.

Regarding my 3) - I think you're right, and I don't like that use case anymore. I'm a bigger fan of making the manifest multi-lingual, I think there are proposals here. I don't think there should be multiple manifests, just one per app.

mgiuca commented 3 years ago

They would have to use their old manifest url as their ID, forever. Or whatever we have as default. I guess we could have some custom spec language here around "if you didn't have an ID and you set one, then that is the ID, as long as the manifest url matches"?, but that might be complicated. Open to thinking about that though.

Yes. The point of having a hard-specified default is that site authors know exactly what their site used to default to, so if they want to change the thing that the default is based off (e.g., the manifest URL), they can set the ID to exactly the string that used to be the default, to avoid their ID changing.
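A minimal sketch of that pinning idea, assuming for illustration that the default ID is the manifest URL (the thread has not settled on a default, and `effective_id` is a hypothetical helper, not an API from any browser or from the spec):

```python
def effective_id(manifest, manifest_url):
    """Return the explicit id if set, else the assumed default (manifest URL)."""
    explicit = manifest.get("id")
    return explicit if explicit else manifest_url


OLD_URL = "https://example.com/app.webmanifest"
NEW_URL = "https://example.com/v2/app.webmanifest"

# Before the move: no explicit id, so the default applies.
before = effective_id({"name": "Example"}, OLD_URL)

# After the move without pinning: the identity silently changes.
moved_unpinned = effective_id({"name": "Example"}, NEW_URL)

# After the move with the id pinned to the old default: identity is preserved.
moved_pinned = effective_id({"name": "Example", "id": OLD_URL}, NEW_URL)
```

The design point is that the author only needs to know exactly what the default used to be; no extra machinery is required to track how a manifest changed over time.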

We can put a non-normative note about this, but we don't need any normative text around the temporal changes to a manifest file.

Also, it seems potentially weird to have multiple manifests all be part of the same app.

Well. "Multiple manifests" is a bit hard to define (if a manifest changes its content or its URL, is that "multiple manifests"?). Essentially the entire point of this ID is to formally identify when a manifest change represents a new app, versus a mutation of an existing app.

Using a different manifest URL is the currently recommended and only viable way to provide localized manifest metadata. So we have to support that, unless we want to block on #676 (properly supporting localization). I think this works fine: you would make all of your different-locale manifests have the same ID, so they all represent the same app. Whichever manifest was served at install time determines what language you see. That way, if the user changes their language, and the start URL starts pointing at another language's manifest URL, the browser's updater will go "aha, I'll update to a new version of the manifest" as opposed to "that app is not installed". (This is exactly the point of having an ID, so we can distinguish those cases.)

Going back to what @benfrancis said, the same answer applies:

What happens if two web app manifests provide the same ID?

The whole point of the ID is so that we know when "two web app manifests" represent two different apps versus two different versions of the same app.
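As a rough sketch of that distinction, an updater could key installed apps on the pair (origin, id) and treat any manifest with a matching key as a new version of the same app. The names `install` and `classify` and the data layout are assumptions made for illustration; no browser is claimed to work exactly this way.

```python
def install(installed, origin, manifest):
    """Record a manifest as installed, keyed by (origin, id)."""
    installed[(origin, manifest["id"])] = manifest


def classify(installed, origin, manifest):
    """Decide whether a newly seen manifest updates an installed app.

    Returns "update" (and stores the last-seen manifest) when the key
    matches an installed app, otherwise "not installed".
    """
    key = (origin, manifest["id"])
    if key in installed:
        installed[key] = manifest
        return "update"
    return "not installed"


ORIGIN = "https://example.com"
installed_apps = {}

# The English manifest was served at install time.
install(installed_apps, ORIGIN, {"id": "/mail", "name": "Mail", "lang": "en"})

# Later the start URL points at the French manifest, which shares the id.
french_result = classify(installed_apps, ORIGIN, {"id": "/mail", "name": "Courrier", "lang": "fr"})

# A manifest with a different id on the same origin is a different app.
calendar_result = classify(installed_apps, ORIGIN, {"id": "/calendar", "name": "Calendar", "lang": "en"})
```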

benfrancis commented 3 years ago

@mgiuca wrote:

Using a different manifest URL is the currently recommended and only viable way to provide localized manifest metadata. So we have to support that, unless we want to block on #676 (properly supporting localization).

That's not strictly correct. The specification also mentions that servers can use the "Accept-Language" header to provide the user with a manifest in their preferred language. It's perfectly valid for the same resource at the same URL to have different representations as a result of content negotiation like this, it doesn't make it a separate resource. An HTTP URL identifies the resource, not its representation.
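To illustrate the content negotiation case, here is a simplified sketch of a server picking which representation of a single manifest URL to serve from the Accept-Language header. The function `pick_language` is hypothetical, and the parsing is deliberately minimal: it handles q-values and primary-subtag fallback but ignores wildcards and other details of real header matching.

```python
def pick_language(accept_language, available, default):
    """Pick a manifest language from an Accept-Language header value."""
    ranked = []
    for position, item in enumerate(accept_language.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        ranked.append((-quality, position, tag))
    # Highest quality first; header order breaks ties.
    for _, _, tag in sorted(ranked):
        for candidate in (tag, tag.split("-")[0]):
            if candidate in available:
                return candidate
    return default


AVAILABLE = ["en", "fr"]
swiss_french = pick_language("fr-CH, fr;q=0.9, en;q=0.8", AVAILABLE, "en")
prefers_english = pick_language("fr;q=0.5, en;q=0.9", AVAILABLE, "en")
unsupported = pick_language("ja", AVAILABLE, "en")
```

In this model the manifest URL, and therefore the identity of the app, is the same for every language; only the representation differs.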

Either way, I don't see this as a problem. If the user installs the French version of an app, they presumably want to continue using the French version of the app when the manifest is updated, either based on the default language preference set in the user agent or by manual selection via a query string. (I would argue this is "properly supporting localization" and as I understand it was an intentional design decision, but I will read and comment on the other issue about that.)

Edit: I see you wrote a whole document on this topic, so maybe you just forgot ;)

Well. "Multiple manifests" is a bit hard to define (if a manifest changes its content or its URL, is that "multiple manifests"?).

As above, if the URL changes then yes. If the content changes but the URL remains the same then no. This works in either localisation case because either you're explicitly requesting a resource which is a manifest for a French app, or you're requesting a multi-lingual resource and asking for the French representation of it.

The whole point of the ID is so that we know when "two web app manifests" represent two different apps versus two different versions of the same app.

I would argue the manifest URL can already provide this. The only compelling reason I've heard so far for adding an additional identifier URL is where people are using version URLs for manifests in a CDN. I have to be honest that this isn't a problem I've ever come across, but I know this is an approach that Google recommends so maybe it's more common and a harder problem than I realise.

Personally as a web developer I'm used to the idea that a URI identifies a resource and it has always made sense to me that a web app be identified by its manifest URL. If a resource is superseded by a new resource (as opposed to a new version of the old one) I would have thought the normal practice is just to redirect to it. As a user agent implementer I would far rather just fetch the manifest URL to check for updates than have to look for the start URL, fetch the start URL, parse the HTML, look for the manifest link and then fetch the manifest. Especially if the start URL can change too.

@dmurph wrote:

we only have people right now trying to create multiple PWAs in the same domain & struggling, or people updating PWAs & having new start_urls, which accidentally segments their userbase.

IIUC, if we use the manifest URL alone as an identifier, rather than some combination of manifest URL, start URL, scope and content, then the problem of accidentally segmenting users by changing the start_url would go away? Multiple PWAs per domain shouldn't be a problem either. Or am I missing something?

glennhartmann commented 3 years ago

@dmurph wrote:

I think the desire is to allow developers to change the start url & the manifest url. They would have to deal with the case that an old start url is still registered for various users forever, so they would have to handle that somehow. I'm not sure what the best route would be here - I'm guessing they have to somehow serve the new manifest on the old start url, OR they can redirect & serve the new manifest on the new start_url, and since the IDs will match up, then it can update.

Right, I guess mainly what I'm wondering is whether this is that much of an improvement over using Manifest URL. If we say that manifest url is the ID, then start url is already trivially updatable, like any other manifest attribute. Moving the manifest is also doable via HTTP 301 redirect. The main drawback afaict is that it requires an explicit action by the developer, and continued control or maintenance over the original manifest url.

It seems to me that the new proposal (unless we come up with a better migration process) has similar explicit action and maintenance required. We're just changing the problem for developers from "we need to redirect the old manifest URL" to "we need to make sure all previous start URLs continue serving content and point to the current manifest URL". Either way moving the start URL and manifest is doable, but requires explicit thought and work to get it right.

To be clear, I'm not against the idea of an explicit ID, I just want to make sure it's buying us as much of a benefit as we think it is, and enough to justify the cost of implementation.

There are a few benefits I can think of, but I'm not sure how big they are:

  1. changing static-hosted contents may be easier than issuing an HTTP 301 redirect in some cases
  2. getting the migration wrong with an explicit ID is possibly more fixable than getting it wrong with a manifest URL


@dmurph wrote:

  2. cross-origin migration is a non-goal here. But if we make the IDs unique (instead of semi-unique), then that might be easier? I'm not trying to tackle this problem right now, we don't have requests for it, we only have people right now trying to create multiple PWAs in the same domain & struggling, or people updating PWAs & having new start_urls, which accidentally segments their userbase.

Ok, gotcha.


@dmurph wrote:

  3. I'm wondering about the migration strategy from the current world. If we say that an empty ID will default to the Manifest URL, then all current apps already have a default ID assigned. How do they migrate from this default ID to their new desired explicit ID without losing all their current users?

They would have to use their old manifest url as their ID, forever. Or whatever we have as default. I guess we could have some custom spec language here around "if you didn't have an ID and you set one, then that is the ID, as long as the manifest url matches"?, but that might be complicated. Open to thinking about that though.

Makes sense, thanks.


@dmurph wrote:

  4. It seems to me that there's almost zero benefit of an app specifying a relative ID. Possibly even a negative benefit, since some devs might think specifying an ID is protecting them and allowing them to move their Manifest URL, and they will end up accidentally breaking their app when the ID changes after a manifest move. What's the point of allowing a relative ID? Why not make absolute IDs mandatory?

Sure, I don't mind them being absolute / globally unique. It seems weird though as basically prepending the origin to the id would basically make it unique, so we could just do that for them, and say it only has to be unique for the origin. Question - how would the ID change after the manifest move? The ID must stay the same to move the manifest & not break people. Not sure why absolute is necessary here.

Sorry, I think I was responding specifically to this:

@mgiuca wrote:

I think the best way (i.e., most consistent with how other manifest keys like this work) to express this is that the ID is a URL that must be same-origin as scope*. This URL is resolved against the manifest URL, like every other URL in the spec. This URL is never requested, it's just used as an identifier.

Making it a URL like this and requiring same origin gives it a natural uniqueness per origin.

This means if your manifest is on a different origin to your start/scope, you must specify the ID as an absolute URL (same as all the other URLs). And it means if your id is path-relative, moving your manifest would change the ID, so you have to be careful to keep the ID stable when moving the manifest. We would recommend always specifying the ID as path-absolute (e.g., "/my-id").

Which I guess we haven't actually landed on.


@mgiuca wrote:

Also, it seems potentially weird to have multiple manifests all be part of the same app.

Well. "Multiple manifests" is a bit hard to define (if a manifest changes its content or its URL, is that "multiple manifests"?). Essentially the entire point of this ID is to formally identify when a manifest change represents a new app, versus a mutation of an existing app.

Using a different manifest URL is the currently recommended and only viable way to provide localized manifest metadata. So we have to support that, unless we want to block on #676 (properly supporting localization). I think this works fine: you would make all of your different-locale manifests have the same ID, so they all represent the same app. Whichever manifest was served at install time determines what language you see. That way, if the user changes their language, and the start URL starts pointing at another language's manifest URL, the browser's updater will go "aha, I'll update to a new version of the manifest" as opposed to "that app is not installed". (This is exactly the point of having an ID, so we can distinguish those cases.)

Going back to what @benfrancis said, the same answer applies:

What happens if two web app manifests provide the same ID?

The whole point of the ID is so that we know when "two web app manifests" represent two different apps versus two different versions of the same app.

Right, this all seems reasonable. My main concern was developers accidentally (via copy/paste error, probably) specifying the same ID for multiple different apps, which would result in a pretty weird state. I'm probably overthinking this point, though, as it seems likely to be an infrequent case.


@benfrancis wrote:

Using a different manifest URL is the currently recommended and only viable way to provide localized manifest metadata. So we have to support that, unless we want to block on #676 (properly supporting localization).

That's not strictly correct. The specification also mentions that servers can use the "Accept-Language" header to provide the user with a manifest in their preferred language. It's perfectly valid for the same resource at the same URL to have different representations as a result of content negotiation like this, it doesn't make it a separate resource. An HTTP URL identifies the resource, not its representation.

Either way, I don't see this as a problem. If the user installs the French version of an app, they presumably want to continue using the French version of the app when the manifest is updated, either based on the default language preference set in the user agent or by manual selection via a query string. (I would argue this is "properly supporting localization" and as I understand it was an intentional design decision, but I will read and comment on the other issue about that.)

I tend to agree with "If the user installs the French version of an app, they presumably want to continue using the French version of the app when the manifest is updated". But it's also true that if the manifest being served depends on the user's IP address, for example, then while travelling, they could end up getting install prompts for a different language of an app they already have installed, because the user agent has no way of knowing that these are just two different localizations of "the same app".

dmurph commented 3 years ago

@glennhartmann wrote:

@dmurph wrote:

I think the desire is to allow developers to change the start url & the manifest url. They would have to deal with the case that an old start url is still registered for various users forever, so they would have to handle that somehow. I'm not sure what the best route would be here - I'm guessing they have to somehow serve the new manifest on the old start url, OR they can redirect & serve the new manifest on the new start_url, and since the IDs will match up, then it can update.

Right, I guess mainly what I'm wondering is whether this is that much of an improvement over using Manifest URL. If we say that manifest url is the ID, then start url is already trivially updatable, like any other manifest attribute. Moving the manifest is also doable via HTTP 301 redirect. The main drawback afaict is that it requires an explicit action by the developer, and continued control or maintenance over the original manifest url.

It seems to me that the new proposal (unless we come up with a better migration process) has similar explicit action and maintenance required. We're just changing the problem for developers from "we need to redirect the old manifest URL" to "we need to make sure all previous start URLs continue serving content and point to the current manifest URL". Either way moving the start URL and manifest is doable, but requires explicit thought and work to get it right.

To be clear, I'm not against the idea of an explicit ID, I just want to make sure it's buying us as much of a benefit as we think it is, and enough to justify the cost of implementation.

There are a few benefits I can think of, but I'm not sure how big they are:

  1. changing static-hosted contents may be easier than issuing an HTTP 301 redirect in some cases
  2. getting the migration wrong with an explicit ID is possibly more fixable than getting it wrong with a manifest URL

I guess the case where a website mistakenly assumes it can just reference a different manifest link for an update, and that doesn't obviously break anything at first, is still an issue. We have already had partners make this mistake with start_url on desktop, and they can't fix it: half of their population installed before the change and half after, so they now have two separate apps to maintain, and people get confused because they can install both.

Maybe we position the id as a way to fix this type of situation. original_manifest_url, or manifest_url_override, or maybe just id is fine. But it's basically a way for a website to fix this situation if it happens, instead of encouraging all people to use an id? Then your 'copy-paste' mistake case won't be as much of an issue?

My preference would be to:

BUT I'm also fine with framing it as a "fixing" field, and not encouraging developers to use it unless they have broken their population by moving their manifest hosting location.

WDYT?

benfrancis commented 3 years ago

@dmurph wrote:

I guess the case where a website mistakenly assumes it can just reference a different manifest link for an update, and that doesn't obviously break anything at first, is still an issue. We have already had partners make this mistake with start_url on desktop, and they can't fix it: half of their population installed before the change and half after, so they now have two separate apps to maintain, and people get confused because they can install both.

I can see why the segmentation problem exists when the identifier of an app is a tuple of manifest URL + start_url + something else (this is not something a developer would intuitively expect). But if the identifier was the manifest URL alone then is there any evidence to suggest that developers do assume different manifest URLs will be treated as the same app by user agents?

The reason that I ask is that the cost of fixing this (potentially hypothetical) problem by adding an additional identifier is that updates could become a lot more complex. It's much simpler to fetch a fixed manifest URL to look for updates than to have to figure out what document belongs to the app (bearing in mind start_url and scope can change), parse the manifest link relation from that document's HTML and then fetch the manifest.

I suggest the underlying problem here is just that the specification currently doesn't specify what the identifier of an app is or how apps are updated so developers are left to guess what user agents use to identify their app and each user agent may do something different.

dmurph commented 3 years ago

@benfrancis wrote

The reason that I ask is that the cost of fixing this (potentially hypothetical) problem by adding an additional identifier is that updates could become a lot more complex. It's much simpler to fetch a fixed manifest URL to look for updates than to have to figure out what document belongs to the app (bearing in mind start_url and scope can change), parse the manifest link relation from that document's HTML and then fetch the manifest

The algorithm that we use is basically to visit any manifest links we see and then update if the identifier matches (for Chrome desktop this is start_url). This code path is also used for install detection (and for setting theme color, etc.; manifests are always supposed to be visited), so it's not much added complexity for us to check all manifests we see, because we do that anyway. Not sure what Android does.
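
A minimal sketch of the update check described above, assuming the identifier is start_url as stated for Chrome desktop. The `Manifest` and `InstalledApp` shapes and the `identifierOf` and `onManifestSeen` names are hypothetical and are not taken from the spec or from Chrome's code.

```typescript
// Hypothetical shapes for illustration only.
interface Manifest {
  start_url: string;
  name?: string;
  theme_color?: string;
}

interface InstalledApp {
  identifier: string;
  manifest: Manifest;
}

// Assumption: the identifier is the start_url, as described for Chrome desktop.
function identifierOf(manifest: Manifest): string {
  return manifest.start_url;
}

// Called for every manifest link the user agent visits.
// Returns true if an installed app was updated, false if nothing matched.
function onManifestSeen(
  installed: Map<string, InstalledApp>,
  seen: Manifest
): boolean {
  const app = installed.get(identifierOf(seen));
  if (app === undefined) {
    return false; // No installed app has this identifier.
  }
  app.manifest = seen; // Identifier matches: treat as an update.
  return true;
}
```

Under this sketch, a manifest whose start_url has changed no longer matches the installed app, which is the segmentation problem mentioned earlier in the thread.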

@benfrancis wrote

I suggest the underlying problem here is just that the specification currently doesn't specify what the identifier of an app is or how apps are updated so developers are left to guess what user agents use to identify their app and each user agent may do something different.

I think this is exactly right. Declaring the manifest_url to be the authoritative 'ID' would also work, as long as all user agents behave as such (i.e. we can get that into the spec). Would you be OK with the conclusion here to use 'manifest_url' as the unique identifier for a web app?

benfrancis commented 3 years ago

@dmurph wrote:

The algorithm that we use is basically to visit any manifest links we see and then update if the identifier matches (for Chrome desktop this is start_url). This code path is also used for install detection (and for setting theme color, etc.; manifests are always supposed to be visited), so it's not much added complexity for us to check all manifests we see, because we do that anyway. Not sure what Android does.

Ah I see. Yes unfortunately I don't think start_url makes the best identifier because it's the most likely to change. I tried using scope (+ origin) in the past for Firefox OS, which is a bit better but still has problems.

I'm not sure what Fenix currently does? Do manifests on Android currently ever get updated once they are installed?

I currently have a slightly different use case where I'd ideally like to be able to install a single web app to a kiosk runtime remotely by its manifest URL, without it being installed from browser chrome on the device itself. I'm hoping that https://github.com/w3c/manifest/pull/834 might make that possible by removing the dependency on a document URL. In this case manifest link relations may never actually be followed, which makes the approach you describe above for updates a bit tricky. I recognise this is not the core intended use case for a manifest, but it might also be relevant to installing a web app from an app store.

Would you be OK with the conclusion here to use 'manifest_url' as the unique identifier for a web app?

Yes, that's what I'm suggesting.

andreban commented 3 years ago

We've been using the manifest URL as a unique identifier for Gulliver for a few years now. We had more issues with developers versioning the manifest URL early on, but this has significantly improved since then. There are still issues with duplication in some cases, but those mostly come from Gulliver allowing non-installable apps to be added (e.g. manifests hosted via http).

alancutter commented 3 years ago

I'm okay with standardising on origin + manifest_url being the identifier because we can always transition later to an explicit id field that defaults to manifest_url should we find that devs need to move their manifest around.
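
As a rough sketch of that transition path: an explicit id, if present, wins, and otherwise the manifest URL is used. The `id` member and the `appIdentifier` function are assumptions for illustration; the origin part of the identifier is left out to keep the example short.

```typescript
// Hypothetical manifest shape; `id` is the proposed, not yet specified, field.
interface ManifestWithId {
  id?: string;
  start_url?: string;
}

// Returns the explicit id when the developer supplied one,
// and falls back to the manifest URL otherwise.
function appIdentifier(manifestUrl: string, manifest: ManifestWithId): string {
  return manifest.id !== undefined ? manifest.id : manifestUrl;
}
```

With this fallback, existing apps that never set an id keep the identifier they already have, while a developer who needs to move or version the manifest can pin the identity explicitly.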

mgiuca commented 3 years ago

I think Chrome desktop is going to be the main pain point here, since that's the one that doesn't use manifest URL. If @dmurph and @alancutter are happy with migrating Chrome desktop over to manifest, I think we'll all be in a much better place (and yes, we can live with that and add id field later if there are still issues).

FWIW, the issue with just having manifest as the ID and not an explicit id field is that it restricts what people are able to do with their site layout. Even if developers are no longer having *this* issue due to versioning their manifest URL, there is a reason developers like to version assets (for caching), so forcing them to not version their manifest URL could be causing other problems that we don't have much visibility into. That is why I originally proposed the id field.

dmurph commented 3 years ago

Does anyone here have experience with customers who want to experiment with manifest attributes (for example, a site that wants 5% of its population to have 'minimal-ui' instead of 'standalone')? This is something where I could see the manifest_url constraint being a hindrance, as the website now has to dynamically change its manifest file server-side, probably based on cookies in the request header. Not impossible, but very constraining. I could imagine this being a primary use case for introducing the id-style field.
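
To illustrate the server-side workaround, here is a minimal sketch that keeps one fixed manifest URL and varies the `display` member by cookie. The cookie name `display_experiment`, the function names, and the manifest contents are all made up for the example.

```typescript
type Display = "standalone" | "minimal-ui";

// Picks the display mode from a raw Cookie header value.
// Assumption: the experiment arm is stored in a cookie named "display_experiment".
function displayForRequest(cookieHeader: string | undefined): Display {
  if (cookieHeader === undefined) {
    return "standalone";
  }
  for (const part of cookieHeader.split(";")) {
    const [name, value] = part.trim().split("=");
    if (name === "display_experiment" && value === "minimal-ui") {
      return "minimal-ui";
    }
  }
  return "standalone"; // Default arm.
}

// Builds the manifest body served from the single, unchanging manifest URL.
function buildManifest(cookieHeader: string | undefined): string {
  return JSON.stringify({
    name: "Example App",
    start_url: "/",
    display: displayForRequest(cookieHeader),
  });
}
```

A real server would also need to make sure caches do not share one variant across users, which is part of why this approach is constraining.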

alancutter commented 3 years ago

I know of Construct 3 from an old bug. Their app URLs appear to be versioned. https://editor.construct.net/ has manifest URL https://editor.construct.net/r210-2/appmanifest.json. Back when that bug was active their manifest URL was https://editor.construct.net/r131/appmanifest.json.

Edit: Google Keep is another example of a versioned manifest: https://ssl.gstatic.com/keep/manifest/v2.webmanifest
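
For reference, a sketch that ports the "looks versioned" heuristic from the perl one-liner earlier in this thread and applies it to the two URLs above. The function name is hypothetical, and the three patterns are copied from that one-liner rather than from any spec.

```typescript
// Same three patterns as the earlier one-liner: a run of 7 hex characters,
// "v" followed by digits, or a "v=" query parameter.
function looksVersioned(manifestUrl: string): boolean {
  return (
    /[0-9a-fA-F]{7}/.test(manifestUrl) ||
    /v[0-9]+/.test(manifestUrl) ||
    /v=/.test(manifestUrl)
  );
}

const keep = "https://ssl.gstatic.com/keep/manifest/v2.webmanifest";
const construct = "https://editor.construct.net/r210-2/appmanifest.json";

console.log(looksVersioned(keep));      // true ("v2" matches)
console.log(looksVersioned(construct)); // false ("r210-2" matches none of the patterns)
```

So the heuristic catches the Keep manifest but not Construct's release-number scheme, which suggests the count of versioned manifests it produces is a lower bound.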