w3c / webextensions

Charter and administrivia for the WebExtensions Community Group (WECG)
Other
595 stars 56 forks source link

Proposal: StorageArea.onChanged filters #475

Open polywock opened 11 months ago

polywock commented 11 months ago

Problem

When monitoring storage changes via a service worker, all changes activate the worker, even if only a specific value matters.

Proposal

Allow StorageArea.onChanged.addListener to accept an optional StorageChangeFilter as an argument.

browser.storage.local.onChanged.addListener(callback, filter?: StorageChangeFilter)

// Usage example 
browser.storage.local.onChanged.addListener(handleChanges, {startsWith: ['flags:', 'global:']})

interface StorageChangeFilter {
    // positive parameters 
    startsWith?: string[], 
    contains?: string[], 
    equals?: string[], 

    // negative parameters
    doesNotStartWith?: string[], 
    doesNotContain?: string[], 
    doesNotEqual?: string[]
}
hanguokai commented 11 months ago

No matter what the API looks like, it is useful to support multiple namespaces in storage.StorageArea. Similar APIs like CacheStorage. But it is hard to say whether browsers are willing to support it, since it's only designed for simple use, rather than a full-featured database.

StorageChangeFilter is also useful for performance.

carlosjeurissen commented 11 months ago

@polywock When you change the actual storage, instead of adding a storage changed listener, can't you just send a message to the background script when a relevant storage has changed?

polywock commented 11 months ago

@carlosjeurissen That's a possibility I considered, but it will be inefficient and hacky for my use cases. It's almost like implementing your own StorageArea.onChanged system.

  1. My extension uses browser.storage frequently due to a lack of a persistent background page. My service worker cares about half the changes. That's a lot of messages being sent.
  2. The service worker won't know which keys changed just by the message. StorageArea.onChanged provides the changes with the old and new values.
  3. To be able to provide the changes in the message, you would need to StorageArea.get the old value, wait until you receive the old value and then StorageArea.set the new value. To avoid race issues, you might also need to await the set operation, and then send a message informing the service worker of the relevant changes. For my use case, this will have to be done very frequently.

The StorageChangeFilter proposal will be significantly more efficient and developer-friendly.

function handleChanges(changes) {
    // handle changes
}

browser.storage.local.onChanged.addListener(handleChanges, {startsWith: ['flags:', 'global:']})

I currently use the session StorageArea for keys I want the service worker to ignore, and the local StorageArea for keys that are relevant to the background service worker. This is an okay workaround, but some of the service worker relevant keys are more suited towards a session storage.

xeenon commented 11 months ago

Another option is to have StorageArea.onChanged.addListener have a filter parameter that allows watching only a specific set of keys for changes. That would wouldn't require adding the concept of spaces at all.

polywock commented 11 months ago

@xeenon I mentioned that in the very bottom of my original post. That's my backup proposal. (edit - I added header for it)

The filters should also have startsWith/contains. I have dynamic storage keys that include the tabId, which can't be predicted ahead of time.

xeenon commented 11 months ago

@polywock Awesome! Sorry I missed that. I think that is the approach I would prefer. It is backward compatible, at the risk of firing the event all the time in browsers that don't support the filter, but it won't break.

polywock commented 11 months ago

Based on the feedback from @hanguokai and @xeenon, the onChanged filters appear to be the more practical proposal. I've updated the title and original post to reflect this.

tophf commented 11 months ago

There's a discrepancy in the names of properties: either every matcher should be a verb (i.e. keys should be equals) or everything should be prefixed by key (i.e. keys, keyPrefixes, etc.).

Instead of prefixes and suffixes it'd be more helpful to use a subset of declarativeNetRequest's syntax for keyFilters: [] where | denotes the edge (start or end) and * is a glob e.g. |prefix, suffix|, anywhere, |exact|, |prefix*suffix|, etc. This would allow specifying a variety of conditions using just keys and keyFilters.

P.S. Filtering based on prefix would help Violentmonkey. We had to implement our own onChanged because our content scripts need to listen to specific changes in storage on demand, potentially in hundreds of tabs, so we couldn't rely on the browser's implementation.

erosman commented 11 months ago

Is there a performance benefit?

As mentioned by tophf, if filters are introduced, it would require a lot of different filters to cover all eventualities. It seems like a lot of extra processing, in comparison to the extension doing its own preferred filtering. Out of curiosity, what is the difference between these?

browser.storage.StorageArea.onChanged.addListener(handleChanges, {startsWith: ['flags:', 'global:']});
// storage change event
// check if filter exists
// apply filter
// send to extension
browser.storage.StorageArea.onChanged.addListener(handleChanges);
// storage change event
// send to extension

function handleChanges(changes) {
  // apply filter
  if (!changes.hasOwnProperty('xys')) { return; }
  // process changes
}
tophf commented 11 months ago

Without filters, the browser process will have to send a serialized message to every listener in all tabs, so if there are 100 tabs listening to onChange, then there will be 100 inter-process messages, which is expensive, especially if the stored object was big.

erosman commented 11 months ago

I see.... that makes sense in case of listeners registered in content scripts. The initial post seemed to focus on the background script though, where there would only be one.

I am not familiar with the process. Does the event get broadcasted globally, or does it get sent into each tab separately?

If it is a global event, then the number of tabs wouldn't matter. If it is sent to each tab, then having filters would mean keeping track of tab IDs and their filters.

tophf commented 11 months ago

Since the storage is handled in a separate browser process, even one inter-process message is still wasteful, especially if the object is big. Currently, AFAIK, there's no mechanism to broadcast one message to many processes for different sites, although technically there might be such a possibilty in the OS API, but I'm not sure all platforms have such a mechanism and it may require the use of shared memory, which would make it a possible vector for privilege escalation exploits.

polywock commented 11 months ago

@erosman:

The initial post seemed to focus on the background script though, where there would only be one....

I was more concerned about the lifecycle of the background service worker. The onChanged filters will provide developers with a method to optimize the service worker lifecycle, ensuring that it doesn’t initialize when it’s not necessary.

browser.storage.StorageArea.onChanged.addListener(handleChanges, {startsWith: ['flags:', 'global:']});
// Service worker only loaded (or lifespan extended) for relevant storage changes.
browser.storage.StorageArea.onChanged.addListener(handleChanges);
// Service worker loaded (or lifespan extended) for all storage changes.
polywock commented 9 months ago

@oliverdunk Since the filter types were a potential concern from the Chrome team, how about this revision?

interface StorageFilter {
    matches?: Glob[]
    excludeMatches?: Glob[]
} 

browser.storage.local.onChanged.addListener(handleChanges, {matches: ['flags:*', 'global:*']})
oliverdunk commented 9 months ago

@oliverdunk Since the filter types were a potential concern from the Chrome team, how about this revision?

interface StorageFilter {
    matches?: Glob[]
    excludeMatches?: Glob[]
} 

browser.storage.local.onChanged.addListener(handleChanges, {matches: ['flags:*', 'global:*']})

Thanks for the edit. I'll need to sync with the team (I'm still not sure how keen we are to add new filters in general short term) but the shorter list definitely looks better.