w3c / FileAPI

File API
https://w3c.github.io/FileAPI/

Async createObjectURL #84

Closed bmeck closed 6 years ago

bmeck commented 7 years ago

createObjectURL cannot be used to build ECMAScript Modules with circular dependencies, because a blob URL requires its content to be supplied at the time of creation. This becomes a problem when trying to recreate the following file using createObjectURL:

// a.mjs
import './a.mjs';

There seems to be no way to write the URL into the import statement, since the URL is only returned by createObjectURL after the Blob containing that statement already exists.
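The chicken-and-egg problem described here can be sketched as follows. This is only an illustration, assuming an environment with global `Blob` and `URL.createObjectURL`; `makeModuleURL` and `PLACEHOLDER` are made-up names, not part of any API. Patching the source after the fact does not help, because the patched source needs a new Blob and therefore gets a different URL.

```javascript
// Mint a blob: URL for a module source. The URL is only known once the Blob exists.
function makeModuleURL(source) {
  const blob = new Blob([source], { type: 'text/javascript' });
  return URL.createObjectURL(blob);
}

// PLACEHOLDER is a made-up stand-in for "my own URL", which is not known yet.
const PLACEHOLDER = 'blob:placeholder';
const source = `import ${JSON.stringify(PLACEHOLDER)};`;

// First attempt: the module is created before its own URL can be written into it.
const firstURL = makeModuleURL(source);

// Second attempt: patch in the first URL. The patched text needs a new Blob,
// so it is served from a different URL than the one it imports.
const patched = source.replace(PLACEHOLDER, () => firstURL);
const secondURL = makeModuleURL(patched);

console.log(patched.includes(firstURL));
console.log(patched.includes(secondURL));
```

The patched module ends up importing the first (unpatched) Blob rather than itself, so the self-reference is never closed.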

I would propose there be an async form or controller for this. I am not tied to any given API but can imagine something like:

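// proposed API, does not exist today; bikeshed method name, doesn't matter to me
const url = URL.createAsyncURL((async () => {
  await null; // make sure `url` exists by waiting a tick
  // quote the URL so the generated source is a valid import statement
  return new Blob([`import ${JSON.stringify(url)};`], {type: 'text/javascript'});
})());

mkruisselbrink commented 7 years ago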

I'm not sure I understand why it would be useful to be able to create a URL that references content that contains that same URL?

bmeck commented 7 years ago

@mkruisselbrink for situations where modules have circular dependencies. This comes up, for example, when you are instrumenting source text for code coverage.

bmeck commented 7 years ago

I should clarify:

// ./a.mjs
import './b.mjs'
// ./b.mjs
import './a.mjs'

This pair has the same problem: the URLs cannot be generated in a circular manner.
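To make the two-module case concrete, here is a minimal sketch, assuming global `Blob` and `URL.createObjectURL`. The helper `mintBlobURLs` and the `{ deps, source }` module shape are hypothetical. A dependency's URL has to exist before its importer's source is final, so an acyclic graph can be minted in dependency order, while a cycle has no valid order.

```javascript
// Hypothetical helper: rewrite each module's specifiers to blob: URLs.
// A dependency must be minted before its importer, so cycles cannot be resolved.
function mintBlobURLs(modules) {
  const urls = new Map();       // specifier -> blob: URL
  const inProgress = new Set(); // specifiers whose source is not final yet

  function mint(specifier) {
    if (urls.has(specifier)) return urls.get(specifier);
    if (inProgress.has(specifier)) {
      throw new Error(`circular dependency through ${specifier}`);
    }
    inProgress.add(specifier);
    let source = modules[specifier].source;
    for (const dep of modules[specifier].deps) {
      // The dependency's URL must already exist before this source is final.
      source = source.split(dep).join(mint(dep));
    }
    inProgress.delete(specifier);
    const blob = new Blob([source], { type: 'text/javascript' });
    const url = URL.createObjectURL(blob);
    urls.set(specifier, url);
    return url;
  }

  for (const specifier of Object.keys(modules)) mint(specifier);
  return urls;
}

const acyclic = {
  './a.mjs': { deps: ['./b.mjs'], source: "import './b.mjs';" },
  './b.mjs': { deps: [], source: 'export default 1;' },
};
const cyclic = {
  './a.mjs': { deps: ['./b.mjs'], source: "import './b.mjs';" },
  './b.mjs': { deps: ['./a.mjs'], source: "import './a.mjs';" },
};

const acyclicURLs = mintBlobURLs(acyclic); // works: b is minted before a
let cycleError = null;
try {
  mintBlobURLs(cyclic); // fails: a needs b's URL, and b needs a's URL
} catch (error) {
  cycleError = error;
}
```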

annevk commented 7 years ago

The problem isn't clear enough to me to be able to recommend any particular solution here.

bmeck commented 7 years ago

@annevk I have a working example of the problem at https://github.com/bmeck/browser-hooking/blob/adc8e6767fd6147b662258c765f7923b28105aed/entry.mjs#L3 . I have a Web Worker rewriting ES Modules, which may have circular dependencies. I am not using a service worker because of problems integrating with the first page load. In particular, the worker cannot resolve circular dependencies, since the URL generated by URL.createObjectURL cannot be reserved before its body is populated.

annevk commented 7 years ago

It seems service workers would be the natural solution here. Creating blob URL strings which are not immediately backed by a Blob object has all sorts of problems. E.g., the URL parser cannot immediately serialize the associated Blob in that case.
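For comparison, a service worker approach might look roughly like the sketch below. The paths under `/instrumented/` and the `rewritten` map are assumptions made up for illustration. Because the URLs are ordinary paths chosen ahead of time, each module can name the other before either body exists. It does not address the first-load concern raised in this thread.

```javascript
// Hypothetical store of rewritten module sources, keyed by URL pathname.
const rewritten = new Map([
  ['/instrumented/a.mjs', "import './b.mjs';"],
  ['/instrumented/b.mjs', "import './a.mjs';"],
]);

// Return a Response for a known module path, or null to fall through to the network.
function respondToModule(requestURL) {
  const { pathname } = new URL(requestURL);
  const source = rewritten.get(pathname);
  if (source === undefined) return null;
  return new Response(source, {
    headers: { 'Content-Type': 'text/javascript' },
  });
}

// Only register the fetch handler when actually running inside a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', (event) => {
    const response = respondToModule(event.request.url);
    if (response) event.respondWith(response);
  });
}
```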

bmeck commented 7 years ago

@annevk 2 points:

  1. ServiceWorkers do not work on initial loading of pages, making them appear to be a poor choice for hooking data.
  2. Regarding "the URL parser cannot immediately serialize the associated Blob in that case": When does the Blob content access happen? I don't see it in https://w3c.github.io/FileAPI/#unicodeBlobURL

annevk commented 7 years ago

See https://url.spec.whatwg.org/#url-parsing.

(Given that we might be able to get origin policies before loading something from a domain I suspect that if that's successful we might also pursue loading a service worker before the resource too.)

bmeck commented 7 years ago

@annevk it seems that just needs a flag for a pending blob body, rather than assuming that the existence of a URL implies a body.

annevk commented 7 years ago

How would that work with synchronous XMLHttpRequest? I don't think it would be as simple a change as you assert.

bmeck commented 7 years ago

@annevk

  1. does it need to / isn't that deprecated?
  2. Why not an InvalidAccessError like the conditions in https://xhr.spec.whatwg.org/#the-open()-method ?

annevk commented 7 years ago

What I'm saying is that you need to go through all the cases that expect URL's object to carry a Blob object and update them with language that they now need to wait for it or throw. And also do so while avoiding race conditions, ensure it's all tested, etc. It's a rather non-trivial addition.
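The "wait for it or throw" requirement can be modeled with a toy store. This is purely a sketch of the idea under discussion, not how any browser or specification works: `PendingBlobStore`, its methods, and the `blob:pending-sketch/` URL scheme are all invented here. Asynchronous consumers wait for the body, and synchronous ones (such as synchronous XMLHttpRequest) have to throw.

```javascript
// Toy model of a blob URL store in which a URL can be reserved before its body exists.
class PendingBlobStore {
  #entries = new Map(); // url -> { blob: Blob | null, waiters: Function[] }
  #nextId = 0;

  // Reserve a URL with no body yet (the "pending" flag is blob === null).
  reserve() {
    const url = `blob:pending-sketch/${this.#nextId++}`;
    this.#entries.set(url, { blob: null, waiters: [] });
    return url;
  }

  // Supply the body later and wake up every consumer that was waiting.
  fulfill(url, blob) {
    const entry = this.#entries.get(url);
    if (!entry) throw new Error(`unknown URL: ${url}`);
    if (entry.blob) throw new Error(`already fulfilled: ${url}`);
    entry.blob = blob;
    for (const wake of entry.waiters) wake(blob);
    entry.waiters.length = 0;
  }

  // Asynchronous consumers can wait for the body.
  read(url) {
    const entry = this.#entries.get(url);
    if (!entry) return Promise.reject(new Error(`unknown URL: ${url}`));
    if (entry.blob) return Promise.resolve(entry.blob);
    return new Promise((resolve) => entry.waiters.push(resolve));
  }

  // Synchronous consumers cannot wait, so a pending body has to be an error.
  readSync(url) {
    const entry = this.#entries.get(url);
    if (!entry) throw new Error(`unknown URL: ${url}`);
    if (!entry.blob) {
      const error = new Error(`body still pending: ${url}`);
      error.name = 'InvalidAccessError';
      throw error;
    }
    return entry.blob;
  }
}

const store = new PendingBlobStore();
const url = store.reserve();

let syncError = null;
try {
  store.readSync(url); // body not supplied yet
} catch (error) {
  syncError = error;
}

// The body can now contain its own URL, because the URL was reserved first.
store.fulfill(url, new Blob([`import ${JSON.stringify(url)};`], { type: 'text/javascript' }));
const blob = store.readSync(url);
```

Even in this toy, every read path needs its own pending-state branch, which gives a feel for the audit of consumers described in the comment above.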

bmeck commented 7 years ago

@annevk do you think races could be avoided with a service worker that does work on initial page load?

annevk commented 7 years ago

I don't really see similar issues there and it's also not further enshrining this extremely awkward setup where a string keeps an object alive.

bmeck commented 7 years ago

@annevk can't we already make a string keep objects alive in simpler ways?

// run as a module, so that top-level await is available
const ns = await import('data:text/javascript,export default {}');
console.assert(typeof ns.default === 'object');

I don't find the argument compelling when we have another way to do exactly that.

annevk commented 7 years ago

I don't think I'm familiar enough with modules to see the similarity.

bmeck commented 7 years ago

@annevk The Module Map in HTML / ECMA262 is idempotent. If you import a Data URL like the one above, it lives for the life of the JS Realm. In the example above it is a simple {} that gets allocated and can never be GC'd. Unlike with URL.createObjectURL, there is no way to revoke an entry from the Module Map. A user could allocate more complex things, or even just create variable storage that can never be GC'd, keyed to a string in a similar way.
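A small sketch of the behaviour described here, assuming an environment that supports dynamic `import()` of `data:` URLs. `importTwice` is a made-up name. Importing the same specifier twice should yield the same module namespace, so the exported object stays reachable through the string alone.

```javascript
// The same specifier string is used for both imports.
const SPECIFIER = 'data:text/javascript,export default {}';

// Import the specifier twice and return both namespaces for comparison.
async function importTwice() {
  const first = await import(SPECIFIER);
  const second = await import(SPECIFIER);
  return [first, second];
}

importTwice()
  .then(([first, second]) => {
    // The module map caches by specifier, so no second {} should be allocated.
    console.log(first.default === second.default);
  })
  .catch((error) => {
    // Environments without data: URL module support end up here.
    console.log(`import failed: ${error.message}`);
  });
```

There is no counterpart to `URL.revokeObjectURL` for this entry, which is the contrast the comment draws.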

annevk commented 7 years ago

The lifetime of the realm is shorter than that of blob URLs, I'm pretty sure (although that's still not defined in detail, unfortunately).

bmeck commented 7 years ago

@annevk correct, blob stores appear to be per tab in browsers rather than per realm. They even cross workers.

bmeck commented 7 years ago

Ah, I should note that I am using the feature of crossing realms on purpose in my example above.

bmeck commented 6 years ago

@annevk made https://github.com/w3c/FileAPI/issues/89 while looking into lifetime of URLs

bmeck commented 6 years ago

Closing in favor of https://github.com/w3c/FileAPI/issues/97