w3c / ServiceWorker

Service Workers
https://w3c.github.io/ServiceWorker/
Other
3.63k stars 316 forks source link

Consider producing distinct objects for the same entity to enable garbage collection #416

Closed dominiccooney closed 8 years ago

dominiccooney commented 10 years ago

ServiceWorker APIs vend various objects... ServiceWorker, ServiceWorkerRegistration, Cache. It is possible to retrieve the same thing multiple times.

This creates an issue for implementations, which must be conservative when reclaiming these objects, because they may be re-retrieved later. Whether it is safe to reclaim an object is not decidable in general because of the halting problem, it's an interactive system, etc.

For example:

self.caches.create('seldom-used').then(function(cache) {
    cache.kiss = 'death';
  });

The implementation cannot reclaim 'cache' as long as the page is open, because it can't decide if the page will come along later and do self.caches.get('seldom-used') expecting to find the kiss of death on the object.

(I'm not talking about the cache entity here, but the page's JavaScript wrapper, the thing that the 'cache' variable is referring to.)

The same issue happens with registrations (because of getRegistration) and ServiceWorkers (through getRegistration and the active, etc. properties.) Other APIs avoid this problem by returning a distinct object instead of modelling the entity directly. For example, IndexedDB returns new connection objects to the same database entity.

annevk commented 10 years ago

Yeah, I raised this several times in different issues. We need to be clear about the lifetime of objects. I pointed to the Notifications API as an example of vending new objects (clones) all the time.

jakearchibald commented 10 years ago

Things that return registrations:

Things that return serviceworker instances:

Things that return clients:

Things that return requests:

Things that return responses:

Things that return caches:

annevk commented 10 years ago

I think for all of those a many-to-one representation makes sense. Many JavaScript objects for one conceptual underlying object.

annevk commented 10 years ago

Well, and if you put things in the cache I would still expect that to copy. Since if you modify the Request somehow you put in, I would not expect the request in the cache to be modified.

jakearchibald commented 10 years ago

You can mostly compare the equivalence of a registration using scope, with this exception:

navigator.serviceworker.ready.then(function(reg) {
  return reg.unregister().then(function() {
    return nagivator.serviceWorker.register(reg.scriptURL, {
      scope: reg.scope
    });
  }).then(function(newReg) {
    newReg.scope == reg.scope;
  });
});

In the above the scopes are the same but the registrations are different. We used == between registrations to explain https://github.com/slightlyoff/ServiceWorker/issues/396#issuecomment-50725155, but maybe there isn't a real-world usecase.

If I felt like I was getting an object representing the state of registration, I probably wouldn't expect == to work, but since registrations are live it makes it more of a grey area.

navigator.serviceWorker.controller not equalling registration.active makes me a little uncomfortable.

getClones explicitly returns clones. I'm happy with fetchEvent.client being a different object per request. We can reintroduce client.id later if we need it.

Different objects each time.

Different objects each time.

Different objects each time works for me.

inexorabletash commented 10 years ago

Other APIs avoid this problem by returning a distinct object instead of modelling the entity directly. For example, IndexedDB returns new connection objects to the same database entity.

Aside: IDB is actually specified to return the same object in other places, e.g. IDBTransaction.prototype.objectStore(name) is specified to return the same IDBObjectStore instance each time. This presumably suffers from the o.kiss='death' problem as well. I dimly recall raising this issue with the Blink binding gurus a couple of years back and everyone shrugged. (Are DOM nodes immune somehow?)

Anyway... yes, real issue; GC should not be detectable and all that.

annevk commented 10 years ago

DOM nodes are not immune but do not suffer from this either. They cannot be dropped and then revived.

jakearchibald commented 10 years ago

Let's avoid object equivalence as much as possible.

I think this is an easy sell for objects vended by a method: caches.get, clients.getAll, serviceWorker.getRegistrations etc, all return new instances every time.

Properties should return the same instance per get unless their value explicitly changes. So navigator.serviceWorker.controller === navigator.serviceWorker.controller.

Promise-properties should behave like properties and vend the same instance unless the value changes. So (await navigator.serviceWorker.ready) === (await navigator.serviceWorker.ready) unless the value changes in between, but (await navigator.serviceWorker.ready) != (await navigator.serviceWorker.getRegistration()), as the latter is vended from a method.

Let's be brave and say object-properties on objects vended from methods don't need to be equivalent. So (await serviceWorker.register('/hi')).installing != (await serviceWorker.register('/hi')).installing, even if they have the same postMessage destination.

This leaves us with (await navigator.serviceWorker.ready).active != navigator.serviceWorker.controller, which is the bit I'm most uncomfortable about, but I guess we can change that if it trips people up.

@coonsta: does that sound healthy?

annevk commented 10 years ago

For the message port system that sounds rather trippy by the way.

jakearchibald commented 10 years ago

More trippy than two connections to the same database being non-equivalent?

annevk commented 10 years ago

It's just hard. But then the setup is already hard. This is just an extra layer of iteration before posting the messages I guess.

ErikCorryGoogle commented 10 years ago

The current (august 22) draft at https://slightlyoff.github.io/ServiceWorker/spec/service_worker/#document-context has a lot of text like:

  1. Set scope to the result of running the URL serializer with scope.
  2. Return scope. Each Document object must have a separate object for its ServiceWorkerRegistration's scope attribute.

Since the URL serializer returns a string the "separate object" clause looks unnecessary. Strings are primitive objects that don't have object identity, and you can't attach properties to them.

annevk commented 10 years ago

Agreed.

ehsan commented 10 years ago

This creates an issue for implementations, which must be conservative when reclaiming these objects, because they may be re-retrieved later. Whether it is safe to reclaim an object is not decidable in general because of the halting problem, it's an interactive system, etc.

For example:

self.caches.create('seldom-used').then(function(cache) { cache.kiss = 'death'; });

The implementation cannot reclaim 'cache' as long as the page is open, because it can't decide if the page will come along later and do self.caches.get('seldom-used') expecting to find the kiss of death on the object.

I don't think that this is true in general. In Gecko for example, we cache these expando properties on the native object, and we let the wrapper be GCed. If and when we need to recreate the wrapper, we reinsert the cached dynamic properties on it.

Since this is something that you need for APIs such as Document.getElementById, do we really want to try to address this in the service worker API?

mfalken commented 9 years ago

Are distinct objects for the same entity expected to be in sync? E.g., must two ServiceWorker objects representing the same entity always have equal .state? In Chrome, distinct objects get individually updated, so this happens:

register().then(function(r) {
  r.installing.addEventListener('statechange', function(e) {
    e.source.state == 'installed';
    r.waiting.state == 'installing';
 });
});

If it's required, I think the most straightforward implementation for Chrome would be to try to reuse the JS objects after all (although not requiring JS equality is still nice for gc). More details on #622

jakearchibald commented 9 years ago

Agreed, being able to === is more useful than I first thought too, especially when comparing say a message event's source to the SW controlling the document.

annevk commented 9 years ago

I agree as well, which is why I'm having a hard time with treating the equivalent scenario in service workers totally opposite.

mfalken commented 9 years ago

To avoid the unsynced states problem, Blink now has object equality between JS objects representing the same Service Worker entity: crbug.com/459457. I believe this is spec conforming because the current spec neither requires nor disallows object equality, but it does require synced state. Accordingly, our tests only assert that state is equal between the objects.

annevk commented 9 years ago

We should put that in the specification though. Not defining an object's lifecycle in detail is a bug.

jungkees commented 9 years ago

I added [NewObject] / [SameObject] extended IDL attribute to attributes and methods: https://github.com/slightlyoff/ServiceWorker/commit/655ededa26925c898721a741d81a1b483a1d8a07. I think Gecko implementation still returns distinct objects in many places according to the earlier discussion in this thread whereas Blink implementation came with the object equality. Please review the above update. /cc @ehsan @wanderview @mattto @nikhilm

jungkees commented 9 years ago

I'm not entirely clear about the promise resolution values of the cache methods (not the promise objects themselves which are fresh objects):

CacheStorage.match().then(response => {});
CacheStorage.open().then(cache => {});
Cache.match().then(response => {});
Cache.matchAll().then(responses => {});
Cache.keys().then(requests => {});

IMO, the resolved values of the promise returned from the above methods should be the same object as well. WDYT? /cc @wanderview

To make it clear, here's the behavior for other objects gotten from the returned promises or attribute getters in the current spec:

navigator.serviceWorker.register().then(registration => {}); // same object
navigator.serviceWorker.getRegistration().then(registration => {}); // same object
navigator.serviceWorker.getRegistrations().then(registrations => {}); // same object
clients.matchAll().then(clients => {}); // same object
clients.openWindow().then(windowClient => {}); // distinct object
WindowClient.focus().then(windowClient => {}); // distinct object
(Extendable/ServiceWorker)MessageEvent.source; // same object
(Extendable/ServiceWorker)MessageEvent.ports; // same object
annevk commented 9 years ago

No, they should be fresh promise objects. And also mostly fresh Request/Response objects. Not sure about Cache objects.

jungkees commented 9 years ago

All the promise objects except the one returned from .ready are fresh promise objects. I'm talking about the objects that are resolved from those promises.

jungkees commented 9 years ago

Regarding the Request/Response objects, most of them are related with the cache API.

How about FetchEvent.request and FetchEvent.client? Within the same event object, shouldn't they refer to the same object?

annevk commented 9 years ago

FetchEvent members should be [SameObject], yes.

wanderview commented 9 years ago

IMO, the resolved values of the promise returned from the above methods should be the same object as well. WDYT? /cc @wanderview

All Request, Response, and Cache objects returned from Cache and CacheStorage should be new objects. This can't be represented in webidl as far as I know, though.

Reasoning:

I am ok with the self.caches CacheStorage object being [SameObject]. That one makes sense because its an attribute.

jungkees commented 9 years ago

Understood. Fair enough.

jungkees commented 9 years ago

Oh I didn't catch the following implementation issues: https://code.google.com/p/chromium/issues/detail?id=523904 https://bugzilla.mozilla.org/show_bug.cgi?id=1181039

So I think the following behavior seems right (the same object and the distinct object is for the promise resolution value):

navigator.serviceWorker.register().then(registration => {}); // same object
navigator.serviceWorker.getRegistration().then(registration => {}); // same object
navigator.serviceWorker.getRegistrations().then(registrations => {}); // same object
clients.matchAll().then(clients => {}); // ~~same object~~ distinct object ?
clients.openWindow().then(windowClient => {}); // distinct object
WindowClient.focus().then(windowClient => {}); // distinct object
(Extendable/ServiceWorker)MessageEvent.source; // same object
(Extendable/ServiceWorker)MessageEvent.ports; // same object

CacheStorage.match().then(response => {}); // distinct object
CacheStorage.open().then(cache => {}); // distinct object
Cache.match().then(response => {}); // distinct object
Cache.matchAll().then(responses => {}); // distinct object
Cache.keys().then(requests => {}); // distinct object

In the spec, when a distinct object is required, I used "new" explicitly. For the same objects, I did it with "an object that represents the target object." It seems better wording is necessary here to make it clearer? (any suggestion?)

/cc @nhiroki @mattto @wanderview @nikhilm @slightlyoff @jakearchibald @annevk

mfalken commented 9 years ago

Thanks for pinging this bug. Yes, better wording is needed; "an object" is ambiguous about whether it's the same or distinct. Saying "the object" instead may be sufficient.

As for the actual behavior, moving from same to distinct for the registration object made Blink/Chromium code a bit simpler, but I'm ok with requiring it to be same. It's actually probably needed to keep multiple registration objects in sync when listening for onupdatefound events (like statechange events for ServiceWorker: https://github.com/slightlyoff/ServiceWorker/issues/416#issuecomment-74889353)

jungkees commented 9 years ago

Changed it to "the object that represents the target object" expression for the same object case: 6643b11cdd52bf6732fe71e35ee1761569c50b82. Sorry for the confusion.

wanderview commented 9 years ago

Gecko bug to return same registration object: https://bugzilla.mozilla.org/show_bug.cgi?id=1201127

jakearchibald commented 8 years ago
clients.matchAll().then(clients => {}); // ~~same object~~ distinct object ?

Not sure what this means, but I think these should be distinct objects as they're snapshots. Agree with all the other distinct/same decisions.

jungkees commented 8 years ago

Just tried to put a strikeout on the "same object" and clarified it to the "distinct object".

clients.matchAll().then(clients => {}); // same object distinct object ?

The current spec resolves with distinct objects.

Thanks for the review! Closing.