domenic / proposal-blocks

Former home of a proposal for a new syntactic construct for serializable blocks of JavaScript code

why not just a directive or new type of function? #2

Open michaelficarra opened 6 years ago

michaelficarra commented 6 years ago

If blöcks are meant to be (syntactically) async function bodies which have a static check for free variables, why not just use async functions?

const result = await worker(async function(){
  "serialisable";
  const res = await fetch("people.json");
  const json = await res.json();

  return json[2].firstName;
});

mohsen1 commented 6 years ago

I'd much rather have this! Taking over the {| |} and <> syntax is a bit too much for this specific use case.

domenic commented 6 years ago

The question is whether the committee is willing to change how scope lookup works for { ... } braces and for functions. Past discussions have indicated no.

domenic commented 6 years ago

This also doesn't have an ergonomic way to do variable capturing, and is quite verbose, as noted in the readme under "alternatives considered".

rumkin commented 6 years ago

I agree with the idea of making functions serializable and transferable. But I also agree with the issue author: a block looks excessive and unnecessary. If the problem is the verbosity of function () ..., then the proper fix would be to address the verbosity of function declarations.

domenic commented 6 years ago

An important syntactic difference between blocks and functions is that blocks do not have parameter lists. Parameter lists necessitate the verbose workerFunction(async function (arg) { ... })(arg) notation for running code with a given value in scope. Instead, the primitive is a block of code with an associated set of closed-over bindings, denoted as worker<arg>{| ... |}.

So I don't think trying to use functions to solve these use cases will go well. They are best done as separate constructs.
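
For illustration, the function-based notation mentioned above can be approximated in today's JavaScript. This is only a sketch under several assumptions: workerFunction is a hypothetical helper, it runs in the same thread rather than in a real worker, it rebuilds the function from its source text to mimic serialization, and it uses a JSON round-trip as a simplified stand-in for structured cloning.

```javascript
// Hypothetical helper: not part of the proposal or of any library.
// It mimics "serialize the function, clone the argument, run it elsewhere"
// without an actual worker, so the call shape can be tried out.
function workerFunction(fn) {
  // Rebuild the function from its source text; closures are not carried over.
  const rebuilt = new Function(`return (${fn.toString()});`)();
  return async (arg) => {
    // JSON round-trip as a simplified stand-in for structured cloning.
    const clonedArg = JSON.parse(JSON.stringify(arg));
    return rebuilt(clonedArg);
  };
}

async function main() {
  const people = [
    { firstName: "Ada" },
    { firstName: "Grace" },
    { firstName: "Edsger" },
  ];

  // The value is named at the parameter, in the body, and at the call site.
  const result = await workerFunction(async function (list) {
    return list[2].firstName;
  })(people);

  console.log(result); // "Edsger"
  return result;
}

const mainResult = main();
```

The trailing (people) call is the part the block syntax would fold into the list of closed-over bindings.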

yornaath commented 6 years ago

Aesthetically, I kinda also like the "pipes" |. Very Linux-y, in the way of thinking of piping data in and out of a separate process.

michaelficarra commented 6 years ago

I disagree about the verbosity (we have async arrows after all, which are anything but verbose) and actually find the familiarity of regular functions with regular parameter lists a big benefit. Remember that new syntax is costly for at least two reasons: we have limited syntactic space available to us, and JavaScript programmers must learn it. Syntax is generally expected to be learned/used by all developers, while APIs are "optional" until one needs them. So let's try to avoid adding new syntax unless it's a truly unique and generally useful feature.

domenic commented 6 years ago

I contend that this is a truly unique and generally useful feature.

michaelficarra commented 6 years ago

I agree about its general usefulness. I don't think it's unique enough from async arrows, which are not significantly verbose.

domenic commented 6 years ago

I think the differences in variable lookup are unique enough that using function forms is not suitable. And I think creating a function and calling it is significantly different from stating a set of bindings that are associated with a block of code.

That said, I understand your point of view, and definitely look forward to the committee discussion.

bakkot commented 6 years ago

@domenic, I think changes to scope lookup of the form "you don't get to do scope lookup, and that's enforced statically" are a lot more likely to be palatable than "you get to do scope lookup, but it works differently". I don't think we've ever discussed the first option as distinct from the second, and I think there's a real chance the committee would find it acceptable.

felixfbecker commented 6 years ago

@domenic

I think the differences in variable lookup are unique enough that using function forms is not suitable.

Both functions and blocks in JS have access to the scope of outer variables (I can infinitely nest both blocks or IIFEs if I want). Are the differences between blöcks and blocks any bigger than a proposed isolated function and a normal function? I would say they are actually bigger, because when passing a function to worker(), I expect the worker() function to call it potentially async, while when wrapping code in a block somewhere, it is always executed immediately and synchronously. Additionally, functions can receive arguments (the values passed into the worker function, cloned), while blocks cannot.

jamiebuilds commented 6 years ago

This bothers me a lot:

{
  // executes immediately
}

{|
  // this does not
|}

But this makes a lot of sense to me:

let outer = 42;
let val = () => {|
  outer; // ReferenceError
  let inner = ${outer}; // reference
|};

It would also make a lot of sense to me that it continue working like a normal function:

let fn = () => {|...|};
fn();

But the function also has a well-known symbol on it:

val[Symbol.reify]();

However, using the ${captured} syntax, I think they should work like template literals in that the reference is captured immediately (rather than passed in via reify({ name: ref })), but, similar to interpolations in tagged template literals, the [Symbol.reify]() method performs whatever serialisation is necessary.

let outer = { prop: 0 };
let fn = () => {|
  console.log(${outer}, ${outer.prop});
|};

fn(); // { prop: 0 }, 0
outer.prop = 1;
fn(); // { prop: 1 }, 0

let res = fn[Symbol.reify]();

execReified(res); // { prop: 1 }, 0
outer.prop = 2;
execReified(res); // { prop: 1 }, 0
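
The capture semantics suggested in this comment can be emulated with plain functions. The sketch below is an assumption-laden stand-in, not the proposed syntax: makeBlock, the reify property and execReified are hypothetical names, the captures are passed as an explicit array instead of ${...} markers, and a JSON round-trip replaces real serialization.

```javascript
// Hypothetical stand-in for `() => {| ... ${outer} ... ${outer.prop} ... |}`:
// captured values are evaluated once, when the block is created.
function makeBlock(captures, body) {
  const fn = () => body(captures);
  // Stand-in for the suggested fn[Symbol.reify](): snapshot the captures
  // (JSON round-trip instead of real structured cloning).
  fn.reify = () => ({ captures: JSON.parse(JSON.stringify(captures)), body });
  return fn;
}

// Hypothetical counterpart of execReified(res) from the comment above.
function execReified(reified) {
  return reified.body(reified.captures);
}

const outer = { prop: 0 };
const fn = makeBlock(
  [outer, outer.prop], // first entry keeps the reference, second copies 0
  ([obj, prop]) => `${JSON.stringify(obj)}, ${prop}`
);

const log = [];
log.push(fn());             // {"prop":0}, 0
outer.prop = 1;
log.push(fn());             // {"prop":1}, 0

const res = fn.reify();
log.push(execReified(res)); // {"prop":1}, 0
outer.prop = 2;
log.push(execReified(res)); // {"prop":1}, 0 (the snapshot is unaffected)

console.log(log.join("\n"));
```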

domenic commented 6 years ago

Just as an update, this keeps coming up as the number one conceptual issue with this proposal. And I'm really struggling with what the right answer is.

To me, the "special type of function" arguments are on sound foundations. They make good arguments, and it seems like that approach composes better with the rest of the language. For example, it lets you serialize all different types of functions, not just async ones.

The problem is, it's just so verbose. Compare the function-version:

const result = await worker(serializable async (endpoint) => {
  // Do stuff with endpoint
})(endpoint);

with the current proposal's

const result = await worker<endpoint>{|
  // Do stuff with endpoint
|};

The first contains a lot of redundancies:

Recall this proposal's motivating goal, of making it easy to do off-main-thread work in a fashion as lightweight as in shared-memory languages. It's unclear whether we can accomplish that with the first version.

(There's also the above-mentioned "is it OK to change the meaning of normal curly braces" question. This seems less important.)

michael-ciniawsky commented 6 years ago

Redundant "worker serializable async": compare to the second, where the implication is that worker blöcks are serializable and async, so we don't need to redundantly state that.

const result = await worker(async (args) => {
  const parser = await import('./parser.js')
  // No 'side-effects' (e.g globals) ✅  === serializable
  return parser.parse(args[0], args[1])
})(args)

const global = ...

const result = await worker(async (args) => {
  const parser = await import('./parser.js')
  // Has 'side-effects' (e.g globals) ❌
  return parser.parse(args[0], global)
})(args)
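
To make the difference between the two snippets concrete, here is a rough sketch in plain JavaScript. rebuildFromSource is a hypothetical helper that re-creates a function from its source text, which is roughly what sending it to another realm would entail. Under that assumption, the free variable only fails when the code runs, not when the function is created.

```javascript
// Hypothetical helper: re-create a function from its source text,
// dropping whatever scope it originally closed over.
function rebuildFromSource(fn) {
  return new Function(`return (${fn.toString()});`)();
}

function demo() {
  const outerSeparator = "-"; // local binding, invisible to rebuilt functions

  // Only uses its parameter: survives the round-trip.
  const selfContained = rebuildFromSource((args) => args.join("+"));

  // References a binding from the enclosing scope: breaks after the round-trip.
  const closesOver = rebuildFromSource((args) => args.join(outerSeparator));

  const ok = selfContained(["a", "b"]); // "a+b"

  let failure = null;
  try {
    closesOver(["a", "b"]);
  } catch (error) {
    failure = error; // ReferenceError: outerSeparator is not defined
  }
  return { ok, failure };
}

const demoResult = demo();
console.log(demoResult.ok, demoResult.failure instanceof ReferenceError);
```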

Can't the VM detect/analyse whether a given {Function} is serializable or not? What about introducing a new keyword, e.g.

// {Function} -> {Promise} not expected to be serializable (but possible (?))
async function name (args) {}
async (args) => {}

// {Function} -> {Promise} explicitly treated as serializable
thread function name (args) {}
thread (args) => {}
// or
serial[ize] function name (args) {}
serial[ize] (args) => {}

instead of introducing new syntax for arguments <args> and blocks {| |}? It's still a variation/type of {Function}, as I understand it?

michael-ciniawsky commented 6 years ago

const result = await worker<endpoint>{|
  // Do stuff with endpoint
|};

What would the implementation of worker look like in this example? It's also honestly alienating that <arg> is equal to ((param) => {})(arg) for serializable {Function}s.

domenic commented 6 years ago

@michael-ciniawsky the VM cannot analyze that. You'd introduce a new keyword. Which is what my whole post was about: comparing introduction of a new, verbose keyword, to a specialized syntax.

What would the implementation of worker look like in this example

See the readme of the repository you're commenting on. They're not equal.

michael-ciniawsky commented 6 years ago

the VM cannot analyze that.

"Code inside the blöck is parsed, and in doing so, the implementation checks that the code does not reference any bindings from outside the blöck."

What is "implementation" referring to here? That sounds like scope analysis, and an {AsyncFunction} without bindings outside of its body/block would basically be threadable? The VM could execute said {AsyncFunction} in a separate thread 'automatically' if it is threadable, or 'normally' if it's not? A thread keyword would make it explicit, to avoid analysis and throw an {Error}. Anyway, this kind of derails the discussion about the concrete proposal here...

const isThreadable = scope.analyze(fn)

isThreadable ? vm.threadPool(fn) : vm.main(fn)

See the readme of the repository you're commenting on. They're not equal.

kk, my apologies here, I believe I get it now :). Still, I personally find the examples in the README confusing. I found the example worker (in a <details></details> block?), and now it makes more sense :). I couldn't reason about what worker exactly is (e.g. a new built-in, which doesn't make sense), especially when I saw e.g. the greenlet(fn) library example in comparison. Will it still be required to write that implementation, so blöcks really are just concerned with expressing that a particular block of code is serializable? I personally thought the purpose of a blöck is to make it possible for implementors to implement a ThreadPool where these blöcks will then be executed. I likely completely misunderstood the goal of this proposal, so again my apologies 😛

const blöck = {|
   // Code
|}

greenlet(asyncFn) === blöck

Update: I just saw #22 which answers my question(s)

felixfbecker commented 6 years ago

Imo there is a benefit to having an explicit keyword, which is that a type checker or linter can give you feedback, as you write the code, that you are trying to access something that is not allowed within a worker function. If it just implicitly switched to a normal function, it couldn't do that.

If we find serializable too verbose, maybe we can think harder of shorter keywords, possibly abbreviated. What do other languages do here that have a similar feature? It could also be a special character, like * marks a generator function and not generator function() {}.

surma commented 6 years ago

I, too, feel that some sort of function is the right way to go, but I fully see the concerns voiced by @domenic, so I’m just going to dump a few thoughts in here:

michael-ciniawsky commented 6 years ago

Imo there is a benefit to having an explicit keyword, which is that a type checker or linter can give you feedback, as you write the code, that you are trying to access something that is not allowed within a worker function. If it just implicitly switched to a normal function, it couldn't do that.

True

If we find serializable too verbose, maybe we can think harder of shorter keywords, possibly abbreviated. What do other languages do here that have a similar feature? It could also be a special character, like * marks a generator function and not generator function() {}.

E.g. Lisp uses ' to quote an expression without evaluating it

(defvar *evaluated* (+ 1 2))
(defvar *unevaluated* '(+ 1 2))

(logger 
   (*evaluated*)
)
; => 3
(logger 
   (*unevaluated*)
)
; => (+ 1 2)

function' name (args) {}
async function' name (args) {}

const fn = (args) =>' {}
const afn = async (args) =>' {}
function'
=>'

But ' is already used :(

j-f1 commented 6 years ago

function&?

michaelficarra commented 6 years ago

fünction

j-f1 commented 6 years ago

IMO we should stick to ASCII for keywords, since some software is not well-equipped to handle Unicode, and it’s often more difficult to type, especially for those whose language doesn’t include “ü.”

felixfbecker commented 6 years ago

I strongly believe that <> as syntax to declare closed-over variables is confusing because a lot of languages use that syntax for generics, including TypeScript.

Here's an idea: PHP's closures always require explicit declaration of the variables you want to close over:

$message = 'hello';
$example = function () use ($message) {
    var_dump($message);
};

So what if we added the use keyword (or something similar), which would implicitly make the function an "isolated"/"serializable" function?

const message = 'hello'
const result = await worker(async () use (message) => {
  const parser = await import('./parser.js')
  return parser.parse(message)
})

The actual argument slots could be used for arguments worker() wants to provide to the worker function in the future, e.g. an AbortSignal.

ljharb commented 6 years ago

fwiw <> has an existing (albeit new) meaning in the community; it's how jsx declares fragments.

mkohlmyr commented 6 years ago

Personally, I think something like serializable () => {}, serial () => {}[1] or thread () => {} as mentioned above is much preferable to the pipe syntax. It really seems the correct model for how one would think about what it is (a special kind of function / block), and it matches existing expectations by being familiar to anyone who has seen async () => {}. In my opinion, even serializable async () => {} would be preferable to the pipe syntax, although as mentioned it may be an option for serializable to imply async.

In addition to being more consistent with existing features (async) it is also far more aesthetically pleasing (to me), and between async and whatever this keyword ends up being we'd have a consistent base for how similar features ought to be implemented in future.

[1] The only drawback here is that serial as a shorthand could be confused for meaning sequential, when in fact it should mean that the function can be executed in parallel.

jamiebuilds commented 6 years ago

Using a new block syntax {| ... |} as a function body with a marker syntax ${...} still seems like the best option:

let outer = 42;
let result = await worker(async () => {|
  let inner = ${outer};
  outer; // ReferenceError
  inner; // 42
  // ...
|});

michael-ciniawsky commented 6 years ago

const fn = serialize (args) => { /* transferable/serializable code */ }
serialize function fn (args) { /* transferable/serializable code */ }

Maybe..., but this proposal is mainly, and eventually 'only' (without the special 'args' <> and the 'tags sugar' tag{| |}), concerned with serializable blocks/bodies. For that purpose alone, the {| |} syntax is very convenient compared to a new keyword

function fn (args) {| /* transferable/serializable code */  |}
serialize function fn (args) { /* transferable/serializable code */ }

thread should be avoided at all costs, since there is actually no threading involved in solely using {| |}

benlesh commented 6 years ago

I also have concerns about <> because of TypeScript and JSX. I'm not sure that part of it is going to work out. Are there other syntaxes that could be substituted? I do like @jamiebuilds' idea.

felixfbecker commented 6 years ago

I think the {| ... |} to mean "isolated" (like walls around the function) is neat, but the ${...} for referencing variables from the outer scope seems very weird to me. Just imagine using that inside a template string. I would much rather explicitly declare the variables up front as part of the function signature, like in the idea I proposed with use, or even just as function parameters (or a single parameter that is an array which could be destructured) that are passed through by the worker() function.

Maybe we can use {| ... |} to mark the "isolated" function, but use a keyword like use to declare the closed-over variables? The keyword could be optional if you don't need to close over any variables.

const message = 'hello'
const result = await worker(async () use (message) => {|
  const parser = await import('./parser.js')
  return parser.parse(message)
|})

const result = await worker(async () => {|
  const { work } = await import('./worker.js')
  return work()
|})

Alternatives to use are copy, clone or with. with is already a keyword, so that could be confusing; copy and clone, I feel, say too much about how it works behind the scenes (what if, in the future, engines are actually so smart that they don't need to clone if the data is not mutated? E.g. an immutable ArrayBuffer or a frozen object). So I think use is the best, and it's used in other languages for that purpose.

suchipi commented 5 years ago

I notice that all the function-ish proposals are leaving the params of the "outer" function blank; I feel we could leverage those for the "passed bindings". There's a nice parallel with "functions take input and return output" and "blocks take input and return output", the only difference being that blocks are serializable and can't reference closures.

const block = async (message) => {|
  const parser = await import('./parser.js')
  return parser.parse(message)
|}

const result = await worker(block, "hello")

The user-defined worker function could take the arguments first, which might be more ergonomic:

const result = await worker(
  [1, 2, 3],
  async (...args) => {|
    return args.join(",")
  |}
)
// result is "1,2,3"
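
As a rough idea of what such a user-defined worker function could look like without any new syntax, here is a sketch for Node.js using worker_threads. It rests on assumptions that are not part of the proposal: the function is sent as source text (so it must not reference outer bindings), the arguments are structured-cloned through workerData, and a fresh thread is started for every call.

```javascript
const { Worker } = require("worker_threads");

// Hypothetical `worker(args, fn)` in the argument-first shape shown above.
// The function travels as source text; the arguments are cloned via workerData.
function worker(args, fn) {
  const script = `
    const { parentPort, workerData } = require("worker_threads");
    const fn = (${fn.toString()});
    Promise.resolve()
      .then(() => fn(...workerData))
      .then(
        (value) => parentPort.postMessage({ ok: true, value }),
        (error) => parentPort.postMessage({ ok: false, message: String(error) })
      );
  `;
  return new Promise((resolve, reject) => {
    // eval: true makes Node treat the first argument as code, not as a path.
    const thread = new Worker(script, { eval: true, workerData: args });
    thread.once("message", (reply) => {
      if (reply.ok) resolve(reply.value);
      else reject(new Error(reply.message));
    });
    thread.once("error", reject);
  });
}

const joined = worker([1, 2, 3], async (...args) => {
  return args.join(",");
});

joined.then((result) => console.log(result)); // "1,2,3"
```

A function that touches a variable from its surrounding scope would only fail inside the thread, at run time, which is the gap the static check on blocks is meant to close.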

A block could even have call and apply methods on it if we wanted (instead of reify), which would make them "feel" like functions.

This API is similar to https://github.com/featurist/karma-server-side and https://github.com/suchipi/run-on-server.

suchipi commented 5 years ago

If blocks also supported being called with () syntax directly, then they could be used interchangeably with functions that don't need closure values, and the fact that they have no closure could help engines optimize them as compared to normal function calls, potentially.

dfabulich commented 5 years ago

I didn't see anybody mention this syntax:

const result = await worker({endpoint} => {
  // Use endpoint inside here
});

This avoids the <> syntax conflict with TS/Flow/JSX; it looks and feels like an arrow function accepting an enhanced object literal, which is basically what it is.

It feels comfortable and familiar even with no arguments.

const result = await worker({} => {
  const res = await fetch("people.json");
  const json = await res.json();

  return json[2].firstName;
});

I could imagine someone insisting that async be explicit, e.g. await worker(async {} => blah) which would be fine, I guess, but not my favorite.

felixfbecker commented 5 years ago

@dfabulich imo that is too similar to destructuring parameters:

const result = await worker(({endpoint}) => {
  // Use endpoint inside here
});
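
A small sketch of why the two forms are easy to confuse, using only current JavaScript (describeRequest is a made-up name for illustration): the parenthesized version is an ordinary arrow function with a destructured parameter, while the unparenthesized version does not parse today.

```javascript
// Valid today: an arrow function whose single parameter is destructured.
const describeRequest = ({ endpoint }) => `GET ${endpoint}`;

console.log(describeRequest({ endpoint: "/people.json" })); // "GET /people.json"

// Without the parentheses, `{ endpoint } => ...` is currently a syntax error,
// so the suggested form differs from the one above only by the two parentheses.
let parseError = null;
try {
  new Function("return { endpoint } => endpoint;");
} catch (error) {
  parseError = error;
}
console.log(parseError instanceof SyntaxError); // true
```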