w3c / FileAPI

File API
https://w3c.github.io/FileAPI/

Clarify Lifetime of BlobStore #89

Closed bmeck closed 6 years ago

bmeck commented 6 years ago

I wrote up an example trying to figure out how long browsers allow you to load from a given Realm's BlobStore.

It creates a new URL inside of an <iframe> and then removes the <iframe> from the DOM and tries to use that URL to load a 2nd <iframe>.
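The original test code is not included in this thread. As a rough illustration only, the steps described above might look like the following browser-side sketch. The function name `probeBlobUrlLifetime`, the `"timeout"` outcome, and the 2-second timeout are assumptions made for this sketch, not details from the report.

```typescript
// Hypothetical reconstruction of the described test; NOT the reporter's code.
// Must run in a browser page: it relies on the DOM and on Blob URLs.
type ProbeResult = "load" | "error" | "timeout";

function probeBlobUrlLifetime(doc: Document): Promise<ProbeResult> {
  return new Promise((resolve) => {
    // 1. Create the first iframe and mint a blob URL using *its* global,
    //    so the URL is registered by the inner realm.
    const first = doc.createElement("iframe");
    doc.body.appendChild(first);
    const inner = first.contentWindow as (Window & typeof globalThis) | null;
    if (!inner) {
      resolve("error");
      return;
    }
    const blob = new inner.Blob(["<p>hello</p>"], { type: "text/html" });
    const url = inner.URL.createObjectURL(blob);

    // 2. Remove the first iframe from the DOM.
    first.remove();

    // 3. Synchronously try to load the URL in a second iframe.
    const second = doc.createElement("iframe");
    second.onload = () => resolve("load");
    second.onerror = () => resolve("error");
    second.src = url;
    doc.body.appendChild(second);

    // Assumed fallback for the case where neither event fires.
    setTimeout(() => resolve("timeout"), 2000);
  });
}
```

Calling `probeBlobUrlLifetime(document).then(console.log)` from a page would report which of the two events, if either, fired for the second frame.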

The behavior seems to be inconsistent:

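| Browser | Version | Fires 2nd <iframe>'s onload | Fires 2nd <iframe>'s onerror |
| ------- | ------- | --------------------------- | ---------------------------- |
| Safari  | 11      | No                          | No                           |
| Firefox | 52      | No                          | Yes                          |
| Chrome  | 61      | Yes                         | No                           |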

I think the spec is a bit iffy on the timeline for when URLs become unavailable for loading.

mkruisselbrink commented 6 years ago

Yeah, unfortunately the spec is a bit of a mess around the whole Blob URL Store concept (as you point out, it is sort of per global, but then there are examples that show two different same-origin windows "sharing" the same Blob URL Store).

That said, there is more subtlety going on here, and I'm not sure your example is entirely correct. First of all, I'm not sure how you're detecting "post GC". I don't think we really spec when a document is garbage collected, just what should happen when it is. Another issue with your code is that you seem to (synchronously) create and insert the new iframe after removing the old iframe. This insertion should then (also synchronously) resolve the blob URL as part of parsing the src attribute, so loading should generally succeed even if removing the iframe would eventually revoke the URL (since that involves GC and thus almost certainly isn't done synchronously).
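To make the ordering argument concrete, here is a minimal toy model, assuming that parsing a blob URL captures a handle to the blob at parse time. `ToyBlobStore`, `parseBlobUrl`, and `load` are hypothetical names for this sketch, and a plain string stands in for a real Blob.

```typescript
// Toy model only; these names are illustrative and not taken from any spec.
class ToyBlobStore {
  private blobs = new Map<string, string>();

  register(url: string, body: string): void {
    this.blobs.set(url, body);
  }

  revoke(url: string): void {
    this.blobs.delete(url);
  }

  lookup(url: string): string | undefined {
    return this.blobs.get(url);
  }
}

interface ParsedBlobUrl {
  href: string;
  // Handle captured when the URL was parsed; undefined if already revoked.
  blob: string | undefined;
}

// "Parsing" resolves the URL against the store immediately.
function parseBlobUrl(store: ToyBlobStore, href: string): ParsedBlobUrl {
  return { href, blob: store.lookup(href) };
}

// Loading uses only the handle captured at parse time, never the store.
function load(parsed: ParsedBlobUrl): string {
  if (parsed.blob === undefined) {
    throw new Error(`network error: ${parsed.href} was not resolvable`);
  }
  return parsed.blob;
}
```

In this model, a URL parsed before `revoke` is called still loads, while one parsed afterwards fails, which mirrors why a synchronously inserted second iframe would be expected to load.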

So yes, lots of room for improvement with how the Blob URL Store is specified, and this is definitely something I hope to work on this quarter (since I'm also working on refactoring parts of the chrome implementation, and running into all kinds of edge cases that aren't clearly specified in the current spec).

annevk commented 6 years ago

So given that browsers allow users to open blob URLs in a new top-level tab, it seems to me that the store is effectively per browser and you can't have any narrower scope. We can place some restrictions here and there on usage, but as long as top-level tabs are game they all seem artificial to me (other than security restrictions of course).

mkruisselbrink commented 6 years ago

Yeah, there will definitely be some kind of per-user-agent component to the blob URL store, but also a per-window/worker part. Or at least each entry in the global store should know which global created it; for the spec that might be easier than having separate per-global stores in addition to the global per-browser store. So indeed I'll probably just end up with a single per-browser blob URL store, with each entry consisting of the URL, the Blob, and the global that created the URL (which can also be used to infer the origin of the blob URL). On "destruction" of a global I'd then just iterate over all entries, revoking everything for that global. I might have to be a bit careful to make sure that navigating from the window that created a blob URL to that blob URL still works, but that's probably okay as the blob URL gets resolved before the old global goes away.
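A rough sketch of the single-store design described in this comment, under stated assumptions: `GlobalScope`, `BlobUrlEntry`, `BlobUrlStore`, and all method names are hypothetical, a string stands in for a Blob, and a counter replaces the UUID a real user agent would generate.

```typescript
// Toy model only; none of these names come from the File API spec.
interface GlobalScope {
  id: string;     // e.g. "window-1" or "worker-7"
  origin: string; // e.g. "https://example.com"
}

interface BlobUrlEntry {
  url: string;
  blob: string;         // stand-in for a real Blob
  creator: GlobalScope; // the global that minted the URL
}

class BlobUrlStore {
  private entries = new Map<string, BlobUrlEntry>();
  private nextId = 0;

  create(creator: GlobalScope, blob: string): string {
    // A counter keeps the sketch deterministic; real URLs embed a UUID.
    const url = `blob:${creator.origin}/${this.nextId++}`;
    this.entries.set(url, { url, blob, creator });
    return url;
  }

  resolve(url: string): BlobUrlEntry | undefined {
    return this.entries.get(url);
  }

  // The origin is inferred from the creating global, as the comment suggests.
  originOf(url: string): string | undefined {
    const entry = this.entries.get(url);
    return entry ? entry.creator.origin : undefined;
  }

  // Called on "destruction" of a global: revoke everything it created.
  revokeAllFor(creator: GlobalScope): number {
    let revoked = 0;
    this.entries.forEach((entry, url) => {
      if (entry.creator.id === creator.id) {
        this.entries.delete(url);
        revoked++;
      }
    });
    return revoked;
  }
}
```

With two same-origin globals registering URLs in one store, calling `revokeAllFor` on the first removes only its entries and leaves the other's resolvable.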

annevk commented 6 years ago

The way you describe that sounds almost like it would make GC observable. If you want to do that kind of cleanup you'd have to define exactly when it happens.

mkruisselbrink commented 6 years ago

Not sure that quite counts as making GC observable. Currently it is done, somewhat hand-wavily, as part of the unloading document cleanup steps. We're just missing the equivalent for workers.

annevk commented 6 years ago

Okay, that's a well-defined point and would not make GC observable. If it worked that way in practice, I think navigation would work in theory, as it has already grabbed a handle to the underlying object at URL parse time (again, specification theory).