Open domenic opened 6 years ago
Yeah, I suspect what you want to do is whitelist the list of built-in globals that are expected to be present across all the contexts blöcks will be passable to.
(This would imply, however, that either (a) fetch
can't be on that list, or (b) you can't ever use a blöck in a worklet...)
Doing that, however, might be an issue with globals that have been overwritten/deleted/polyfilled/added (babelHelpers, shims, etc)
It's an "issue" in that I assume the blöck would automatically reference the primordial version, and pay no attention to mutations.
But yeah, then you can't polyfill in new functionality.
explicitly overriding globals could be done if you want to polyfill and i imagine with the current staticness of the syntax babel could pretty easily understand what needs to be changed to inject its own stuff in.
@devsnek it looks like it'd also require the wrapper function to be changed, and that might live in another file, so i think it'd be outside of babel's scope.
There should be some way to check if block has global references to manually resolve them before transfer. Each block should be packed on sender side and unpacked on a receiver side. It will make block behavior predictable, controllable and testable. And it will be possible to send blocks by network to execute in distributed environments without redistributing common code.
You could require fetch to be passed in as an argument, but that would get messy fast.
Alternatively, there could be some kind of import syntax for the global:
const result = await worker({|
import { fetch } from 'global';
const res = await fetch("people.json");
// …
|});
reify
could throw if the imports fail for that particular global.
I think something that requires explicitly passing in the globals loses ergonomics compared to the stringify-a-function-body versions. So I lean toward either removing the parsing errors or providing a way for the function (worker()
in this example) to state what bindings are always available.
if there was a way to combine the block with the function such that it’d be an error to create it using bindings that weren’t always available inside the worker, that might be a nice balance?
Not sure what you mean by "combine the block with the function", but let me be a bit more concrete with what I was thinking with regard to stating what bindings are available.
Notably, it would no longer have parse-time errors in any sense; they would be runtime errors. But, they would be runtime errors at the time of the call to worker()
, and not at the time of the call to reify()
, which is a bit nicer...
function worker(...) { /* as in readme */ }
worker.allowedBindings = ["fetch"];
// (1)
worker{| return fetch("foo.json"); |}
// (2)
worker{| return document.createElement("div"); |};
After seeing this block of code the parser has assembled information saying that the block at (1) has unbound fetch
and the block at (2) has unbound document
.
When line (1) executes, the first thing the runtime semantics for tagged blocks does is compare its saved list of unbound bindings with worker.allowedBindings
. It finds that the set is a subset, so it's good to go.
When line (2) executes, the same thing happens. But this time document
is not found in allowedBindings
so the call immediately throws without ever creating a blöck.
I'm not sure building this check in to the language is quite worth it. Especially since it's not a static check that tooling can detect in a safe way. (Tooling could use heuristics like "I know worker{||}
usually means the browser's built-in function, which has these allowed bindings". But tooling could also do that without us building this check into the language.)
I do think it's worthwhile to catch these sorts of errors as early as possible, and finding them at time of "calling worker()" is better than at time of reify(), or worse, at time of "undefined is not a function".
Another option: be strict about the "no captured variables" thing, including globals, but define this
in the context of the blöck to be the global object of the worker. So you'd write this.fetch()
instead of fetch
.
It's still less ergonomic, but you'd get static errors telling you if you did it wrong.
I think that's my favorite version yet. Probably not using "this" since that's too easy to shadow and I have a long-standing dislike for using it in non-method contexts, but just asking people to qualify their globals seems really solid.
The nice things about this
are that 1) it is already the global object in scripts, so there's precedent, and 2) it's already a binding which changes value depending on context, so there's less risk of it looking like it means something else (which would be a problem if we did something like let global = { fetch: ()=>{} }; await worker({| return global.fetch('foo'); |});
).
But other names could be made to work too, I expect.
You'll probably end up destructuring the global every time, but I guess that's ok:
await worker({|
let { fetch, Uint8Array, Map, Set, ReadableStream } = self;
// …
|});
However, doesn't that mean every imported library will have to do the same? I guess you'd need a build step to rewrite every global reference to prepend self.
or add the appropriate destructuring.
I quite like the destructuring though because it is clearer about what parts of the worker's API are being accessed. I don't necessarily think that could be tedious. It really is up to the developer how complex a blöck can/should get and how many parts of the worker API a certain blöck uses anyway.
I'm not really worried about it being tedious, I'm more worried about the ease of using existing libraries.
Eg, if a library I import uses new Uint8Array(500)
, it'll fail. I'll have to modify the library to use new self.Uint8Array(500)
, or let { Uint8Array } = self
at the start of its module.
Imported libraries don't have their bindings checked; only the code syntactically inside the blöck is checked. So imported libraries would not need to change.
I guess that's kinda obvious now you say it 🙃
Is it essential to statically catch these errors as early errors? Maybe if the Worker is run in a mode that will forward the ReferenceError back up to the spawn-er in a visible way, it would be enough. Also, type systems and maybe advanced linters could catch these errors.
I think removing the early errors is an option, but it decreases the utility.
ping @erights
It's 2019 now, and the obvious thing to do would be to only allow globalThis
as an implicit global variable.
await worker({|
let { fetch, Uint8Array, Map, Set, ReadableStream } = globalThis;
// …
|});
That seems fine.
In another issue on this repo I just suggested an alternate syntax:
const result = await worker({endpoint} => {
const res = await globalThis.fetch(endpoint);
const json = await res.json();
return json[2].firstName;
});
Yeah. That feels about right.
The proposal says that
But the very simplest example violates this:
The marked line references the
fetch
binding, which happens to resolve to a global variable.The intent here is really that we use the worker's global
fetch()
. But I'm not sure how to make this rigorous, while still keeping the desirable property that non-globals need to be explicitly captured, and if not will cause parse failures.The simplest fix I can think of is to abandon the parse errors on bad bindings :(. But that gets rid of one of the main advantages.
A more complicated fix would be that maybe
worker
can declare a list of allowed implicit bindings, and then we'd set it to basically all the globals present in worker global scopes? Seems rough.Not sure what to do here.