The current Firefox/Gecko thinking here is:
So most directly addressing the issue, in such a hypothetical future:
Here is some common ground we found at TPAC 2019.
A compelling use case is the scenario where the user enables offline, the application starts syncing gigabytes of content to the user's device and, during the sync, the browser notices that the disk is running low on free space.
On devices that run multiple applications at the same time, the browser cannot easily reserve space for the application at the beginning of the sync. What it can do is tell the application to stop syncing, and give that signal early enough that the application can stop gracefully. The alternative is that writes simply start to fail with quota exceeded errors.
The signal will be a `quotachange` event on the `StorageManager` interface. This works today, and will work in a world where storage buckets inherit from `StorageManager`.
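A minimal sketch of how an application could consume that signal, assuming `StorageManager` becomes an `EventTarget` that fires the proposed `quotachange` event; the headroom threshold and the `stopSyncGracefully` hook below are made-up illustrations:

```ts
// Hedged sketch: `quotachange` is only proposed, and StorageManager is not an
// EventTarget today, hence the cast. Thresholds and hooks are illustrative.
declare function stopSyncGracefully(): Promise<void>; // hypothetical app hook

const LOW_WATER_MARK = 50 * 1024 * 1024; // assumed 50 MB of remaining headroom

(navigator.storage as unknown as EventTarget).addEventListener('quotachange', async () => {
  const { usage = 0, quota = 0 } = await navigator.storage.estimate();
  if (quota - usage < LOW_WATER_MARK) {
    // Finish the current chunk and persist a resume point instead of letting
    // later writes fail with QuotaExceededError.
    await stopSyncGracefully();
  }
});
```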
Still TBD is how available quota would change to reflect the fact that free disk space is running low. We can't expose the exact amount of free disk space to Web applications, because that would allow malicious apps to learn the size of cross-origin resources by writing them to Cache Storage.
@asutherland Please correct/flag anything I'm misremembering.
That's consistent with my understanding.
I think we also discussed that it might make sense to generate courtesy events to notify origins when their usage crosses certain thresholds, so they don't need to poll. These would want to be de-bounced space-wise, so that oscillating around a threshold doesn't generate a large number of events, and time-wise, so that timing side-channels aren't accidentally created (for example via a random delay, or by only dispatching the event on an idle timeout). The goal would be to avoid accidentally exposing implementation details, such as GC becoming observable when the last in-memory handle to a disk-backed Blob/File is collected.
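To make that de-bouncing concrete, here is an illustrative sketch (not from any spec or implementation) of gating threshold-crossing events with a hysteresis band plus a randomized dispatch delay; all constants are arbitrary:

```ts
// Illustrative only: thresholds, hysteresis band, and jitter are assumptions.
const THRESHOLDS = [0.5, 0.75, 0.9]; // fractions of quota at which to notify
const HYSTERESIS = 0.02;             // 2% band so oscillation doesn't re-fire
const MAX_JITTER_MS = 5_000;         // random delay to blur timing channels

let lastBand = -1; // highest threshold band we have already notified for

function maybeDispatch(usage: number, quota: number, dispatch: (t: number) => void): void {
  const ratio = usage / quota;

  // Highest threshold we are clearly above (outside the hysteresis band), else -1.
  let band = -1;
  for (let i = 0; i < THRESHOLDS.length; i++) {
    if (ratio > THRESHOLDS[i] + HYSTERESIS) band = i;
  }

  if (band > lastBand) {
    // Crossed upward into a new band: notify once, after a random delay.
    lastBand = band;
    setTimeout(() => dispatch(THRESHOLDS[band]), Math.random() * MAX_JITTER_MS);
  } else if (lastBand >= 0 && ratio < THRESHOLDS[lastBand] - HYSTERESIS) {
    // Dropped clearly below the last notified threshold: re-arm without firing.
    lastBand = band;
  }
}
```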
To explicitly state the primary problematic scenario I remember from the discussion:
Just for my own future reference, my general proposals for this area had been:
FWIW, I'm not 100% on making quota per-bucket rather than keeping it per-origin. The primary use case I see for buckets is aiding eviction, be it through priorities/importance or by making things more granular for end users. It's not clear to me how useful it is for a bucket to have a particular limit, although I suppose there are some use cases where this might help.
FWIW, I'm not 100% on making quota per-bucket rather than keeping it per-origin.
One use case I'm aware of is sites that manage an opportunistic cache of resources in cache_storage. They implement their own eviction algorithm to keep the cache under a desired max size. Having some kind of bucket with a quota might help with this use case.
However, these sites also generally don't want writes to just fail when they hit the limit, or to have the entire bucket evicted. They want LRU eviction of some resources to reduce the size. Unless we add that kind of policy to buckets (which it seems we're unsure of), maybe this use case is not really helped.
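For context, the self-managed eviction pattern being described looks roughly like the sketch below. This is not production code: Cache Storage has no per-cache size API, so the app has to track last-access times itself, and `navigator.storage.estimate()` only reports origin-wide (and padded) numbers; the cache name and byte budget are made up.

```ts
// Rough sketch of app-managed LRU eviction over Cache Storage (assumptions noted above).
const CACHE_NAME = 'opportunistic-cache';
const MAX_BYTES = 200 * 1024 * 1024;

// In a real app this would be persisted (e.g. in IndexedDB); a Map keeps the sketch short.
const lastAccess = new Map<string, number>();

async function evictIfNeeded(): Promise<void> {
  const { usage = 0 } = await navigator.storage.estimate(); // origin-wide, not per-cache
  if (usage <= MAX_BYTES) return;

  const cache = await caches.open(CACHE_NAME);
  const requests = await cache.keys();

  // Delete least recently used entries first until we appear to be back under budget.
  const oldestFirst = [...requests].sort(
    (a, b) => (lastAccess.get(a.url) ?? 0) - (lastAccess.get(b.url) ?? 0),
  );
  for (const request of oldestFirst) {
    await cache.delete(request);
    lastAccess.delete(request.url);
    const { usage: remaining = 0 } = await navigator.storage.estimate();
    if (remaining <= MAX_BYTES) break;
  }
}
```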
@annevk A recurring use-case in the Service Workers/Storage space is that a single origin may be divided up into sub-sites each handled by largely independent teams. If each team defines their own buckets and those buckets have their own quota, this helps the team reason about and control their storage usage. It also helps the browser apportion quota increases based on the sub-sites the user actually uses.
For example, imagine a site with a news feed that aggressively (pre)caches content the user barely uses, plus a photo album that caches opportunistically and which the user uses all the time. They each use their own bucket. If they share the same quota, the news feed might see there is spare quota and use it all up for purposes the user doesn't care about, while the photo caching bucket stays effectively the same size. Should the two sub-sites have to coordinate between themselves on how to allocate their quota, or should the browser be doing it for them via bucket quotas?
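Purely as a thought experiment (whether buckets should carry a quota at all is the open question here), per-team bucket quotas might be expressed along these lines; the `storageBuckets.open()` shape and the `quota` option are assumptions, not an agreed API:

```ts
// Hypothetical sketch only: the option shape below is an assumption for
// discussion, not a specified Storage Buckets API surface.
interface HypotheticalBucketOptions {
  quota?: number; // per-bucket cap, in bytes
}
interface HypotheticalStorageBuckets {
  open(name: string, options?: HypotheticalBucketOptions): Promise<unknown>;
}

const storageBuckets =
  (navigator as unknown as { storageBuckets: HypotheticalStorageBuckets }).storageBuckets;

// The news-feed team gets a small, capped bucket it reasons about on its own...
const newsBucket = await storageBuckets.open('news-feed', { quota: 50 * 1024 * 1024 });

// ...while the photo-album bucket, which the user actually relies on, can be
// granted more, with the browser apportioning increases per bucket over time.
const photoBucket = await storageBuckets.open('photo-album', { quota: 2 * 1024 ** 3 });
```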
I don't think the browser has a good track record with managing quota, so I guess I'd rather not expand our scope on that front. And also, the site teams will have to play by some rules anyway, as otherwise the news feed folks would just relay that their bucket is vital infrastructure and cannot be wiped without wiping everything.
Has there been any progress on this? With AccessHandles (https://web.dev/file-system-access/#accessing-files-optimized-for-performance-from-the-origin-private-file-system) integration coming in Chrome 99, and people using storage more heavily for larger and larger files... we really need some good indications of how much data we can store on the device, not just quota.
For instance: when replicating a Notes database locally to the client, it would be nice to show the user how much free space they have so they know whether it will fit. If we need an installed PWA for that particular use case, then fine... but the behavior when it is not installed should not leave the user experience broken.
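For completeness, the closest thing available today is a pre-flight check against `navigator.storage.estimate()`, which reflects origin quota rather than actual free disk space (which is exactly the gap being raised here); the replica size below is a stand-in value:

```ts
// Sketch: pre-flight check before replicating a large database locally.
// estimate() reports origin quota/usage, not the device's real free space.
const EXPECTED_REPLICA_BYTES = 3 * 1024 ** 3; // assumed size of the Notes replica

async function canLikelyFitReplica(): Promise<boolean> {
  const { usage = 0, quota = 0 } = await navigator.storage.estimate();
  return quota - usage >= EXPECTED_REPLICA_BYTES;
}

if (!(await canLikelyFitReplica())) {
  // Surface this to the user up front instead of failing partway through the sync.
  console.warn('The replica may not fit in the remaining origin quota.');
}
```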
we really need some good indications of how much data we can store on the device
We're actually moving in the opposite direction, making it harder to tell how much space is left on the device, because revealing the amount of disk space remaining is a privacy/fingerprinting risk.
As for the Storage Pressure API, it never made it past the prototype phase and we have recently removed the prototype code from Chromium, so we should probably close this issue.
We have been exploring the behavior around quota while under storage pressure: what value should we return for `quota` when the user has less available disk space than the quota we'd ordinarily return? The obvious approach is to return a shrunken value that is less than or equal to the available disk space. The issue with allowing script to identify the remaining disk space is that a bad actor could query the value before and after caching an opaque response and determine its size (see https://github.com/whatwg/storage/issues/31). Since we don't want to do this, the spec should have a recommendation.
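To spell out that attack (see the linked whatwg/storage issue), the probe would look roughly like this if `quota` tracked remaining disk space exactly; today browsers blunt it by padding opaque responses and keeping quota coarse:

```ts
// Illustration of the cross-origin size leak: if quota shrank to track free
// disk space exactly, the change around caching an opaque response would
// reveal the resource's true size to the attacker.
async function probeOpaqueSize(crossOriginUrl: string): Promise<number> {
  const quotaBefore = (await navigator.storage.estimate()).quota ?? 0;

  // An opaque (no-cors) response: script cannot read its body or size directly.
  const response = await fetch(crossOriginUrl, { mode: 'no-cors' });
  const cache = await caches.open('probe');
  await cache.put(crossOriginUrl, response);

  const quotaAfter = (await navigator.storage.estimate()).quota ?? 0;
  return quotaBefore - quotaAfter; // would equal the resource size under exact accounting
}
```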
One solution is to return the same quota, regardless of storage pressure. The tradeoff is that apps will lose insight into their own remaining space. This can be addressed by providing a storage pressure API that would let apps know whether there is more/less than X MB/GB.
Thoughts?