tc39 / proposal-do-expressions

Proposal for `do` expressions
MIT License

Use explicit `return` #20

Closed kylejlin closed 6 years ago

kylejlin commented 6 years ago

Problem

The do expression is very ambiguous.

First off, I can't find anywhere in the README what the do block actually does. It takes a series of statements and expressions, and then returns... something. What does it actually "return" (evaluate to)?

Solution

Remember good ol' IIFEs? Well, it turns out they are a good ES5 polyfill for do blocks, even if they are more verbose:

// Edgy, sexy, modern ES:
const a = do {
  const tmp = f()
  tmp * tmp + 1
}

// Old, stable, browser-supported ES5:
var a = (function() {
  var tmp = f();
  return tmp * tmp + 1;
})();

If you notice, there is absolutely no ambiguity with the IIFE! Why? Because it has an explicit return statement! Since do blocks are just syntactic sugar for IIFEs, we should require an explicit return statement, just like IIFEs do. To clarify:

// Current syntax:
const a = do {
  const tmp = f()
  tmp * tmp + 1
}

// Proposed syntax:
const a = do {
  const tmp = f()
  return tmp * tmp + 1
}

🎉🎉🎉

Hooray! All ambiguity issues are solved! Now you can finally close all those issues asking "how does do behave in if/switch/for/while/etc...".

EDIT: To clarify, the proposed concept/pseudo-spec asserts that:

do {
  <body>
}

is equivalent to

(function() {
  <body>
})()

and therefore

a = do {
  return 'foo'
}

is equivalent to:

a = (function() {
  return 'foo'
})()

The return inside a do behaves exactly like the return inside an IIFE, because do is an IIFE.

ljharb commented 6 years ago

That would create new ambiguity issues:

function foo() {
  do {
    return true;
  }
  return false;
}

What does foo() return?

kylejlin commented 6 years ago

If we presume that do is syntactic sugar for an IIFE, then:

function foo() {
  do {
    return true;
  }
  return false;
}

is syntactic sugar for

function foo() {
  (function() {
    return true;
  })();
  return false;
}

Therefore, foo() returns false.

It appears my explanation for the proposal was incomplete: Since do is just a nice-looking IIFE, the return inside the do applies to the do itself, not the outer function. That is, the return inside the do specifies the return value of the do, not the return value of the outer function.

By this, I mean:

function foo() {
  do {
    return true; // return value of the do-expression
  }
  return false; // return value of foo()
}

Apologies for the miscommunication. To clarify, my proposal is:

do {
  <body>
}

transpiles to

(function() {
  <body>
})()
ljharb commented 6 years ago

do isn't just a nice-looking IIFE tho. I understand your suggested transpilation/sugar, but that doesn't seem like what it's intended to be.

kylejlin commented 6 years ago

In that case, I'll close this.

eloytoro commented 6 years ago

do isn't just a nice-looking IIFE tho.

I could argue that it is. After all, the IIFE is a hack that is often used for the same purpose, though it has some semantic differences.

Main differences

Dropping return and instead evaluating the do expression to its last evaluated expression would introduce a large number of potential footguns into the language, as many of the examples in #21 show (statements evaluating to undefined, for instance).

This approach is extremely unnatural in the JS world and should be reconsidered.
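For readers unfamiliar with what "the last evaluated expression" means in practice: JavaScript statements already have completion values, and `eval` is one of the few places where they are observable today. The following is a minimal sketch using `eval` as a stand-in for a do block (an assumption made for illustration only; it is not how the proposal is specified or implemented). The `completions` object is a hypothetical name.

```javascript
// eval returns the completion value of the code it runs, the same notion
// that implicit-result do expressions rely on.
const completions = {
  // A trailing expression statement supplies the value.
  expression: eval("var tmp = 3; tmp * tmp + 1"),
  // An if/else yields the value of whichever branch ran.
  ifElse: eval("if (true) { 'yes' } else { 'no' }"),
  // A declaration on its own produces no value at all.
  declaration: eval("let x = 5;"),
  // A loop yields the value of its last executed body statement.
  loop: eval("for (var i = 0; i < 3; i++) { i }"),
};

console.log(completions);
```

The `declaration` and `loop` entries are the kind of case the footgun concern is about: the result depends on rules that most code never has to think about.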

ljharb commented 6 years ago

Having an explicit marker for the final value is a separate discussion, I think, from the incorrect classification of a do expression as an IIFE (consider: inner var decls, arguments/this/super, strict mode pragma, stack traces, etc).
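A few of the differences listed above can be observed in today's JavaScript by comparing a plain block with a classic IIFE. This is only a sketch: the plain block stands in for a do block on the assumption that a do block shares its surrounding context the way a block does. `obj`, `viaBlock`, `viaIife` and the variable names are hypothetical.

```javascript
const obj = {
  viaBlock(first) {
    let result;
    {
      // A plain block shares `this`, `arguments` and `var` scope
      // with the enclosing method.
      var leakedVar = "declared in block";
      result = { self: this, argCount: arguments.length };
    }
    return { ...result, leakedVar };
  },
  viaIife(first) {
    const result = (function () {
      // A classic function gets its own `this`, `arguments` and `var` scope.
      var hiddenVar = "declared in IIFE";
      return { self: this, argCount: arguments.length };
    })();
    return { ...result, hiddenVisible: typeof hiddenVar !== "undefined" };
  },
};

const block = obj.viaBlock(1, 2);
const iife = obj.viaIife(1, 2);
console.log(block.self === obj, block.argCount, block.leakedVar);
console.log(iife.self === obj, iife.argCount, iife.hiddenVisible);
```

An arrow-function IIFE would close some of this gap, since arrows inherit `this` and `arguments`, but it would still have its own `var` scope and its own stack frame.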

js-choi commented 6 years ago

A significant difference between do expressions and IIFEs is that they may contain await, yield, and yield *, which work on the outer function context transparently from within block statements but not within IIFEs. In fact, this is one large flaw in Babel’s current implementation of do expressions. See also babel/babel#3780.
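The yield case can be sketched with current syntax. Here a plain block again stands in for a do block (an assumption for illustration), and `withBlock`, `withIife` and `plainIifeParses` are hypothetical names.

```javascript
// Inside a generator, a plain block can yield straight to the consumer.
function* withBlock() {
  {
    yield 1;
    yield 2;
  }
  yield 3;
}

// An inner function is a separate context: it must itself be a generator,
// and the outer generator has to delegate to it explicitly with yield*.
function* withIife() {
  yield* (function* () {
    yield 1;
    yield 2;
  })();
  yield 3;
}

// A `yield` inside a non-generator IIFE is rejected at parse time.
let plainIifeParses = true;
try {
  new Function("return function* () { (function () { yield 1; })(); }");
} catch (err) {
  plainIifeParses = !(err instanceof SyntaxError);
}

console.log([...withBlock()], [...withIife()], plainIifeParses);
```

The same shape applies to await: an IIFE has to be declared async and then awaited by the caller, whereas a block needs nothing extra.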

eloytoro commented 6 years ago

Having an explicit marker for the final value is a separate discussion

The title of the thread reads "Use explicit return"

from the incorrect classification of a do expression as an IIFE

The IIFE works as an example, not as a complete homomorphism of the semantics. The initial post (correct me if I'm wrong) tries to show how to resolve the ambiguity of the do expression by using the return keyword inside its block to declare termination; it then uses the IIFE as an example of how this would transpile for backwards compat.

If you check popular transpilers nowadays, the const and let keywords are transpiled into var, but they're not the same as var. The same comparison goes for do and IIFEs.
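The let/var comparison is easy to check. The sketch below shows one well-known difference (per-iteration bindings in loops), which is why a naive rename from let to var does not preserve behavior and transpilers generally have to emit extra code in such cases. `collectWithLet` and `collectWithVar` are hypothetical names.

```javascript
function collectWithLet() {
  const callbacks = [];
  // `let` creates a fresh binding of i for every iteration.
  for (let i = 0; i < 3; i++) callbacks.push(() => i);
  return callbacks.map((fn) => fn());
}

function collectWithVar() {
  const callbacks = [];
  // `var` creates a single function-scoped i shared by every closure.
  for (var i = 0; i < 3; i++) callbacks.push(() => i);
  return callbacks.map((fn) => fn());
}

console.log(collectWithLet()); // [0, 1, 2]
console.log(collectWithVar()); // [3, 3, 3]
```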

ljharb commented 6 years ago

@eloytoro yes, but using “return” is different than using “something else”, because since do expressions are not at all IIFEs, using “return” imo is a nonstarter.

kylejlin commented 6 years ago

From what I'm reading so far, it seems like most of you agree that:

  1. IIFEs are not the same as do expressions (but there are many similarities and common use cases, though that's beside the point)
  2. An explicit terminator of some kind (not necessarily return) would greatly improve readability.

Let's cast aside the former for now and focus on the latter.

An explicit terminator

It seems like most of us are in favor of an explicit terminator, though exactly what the terminator should be is debatable. Let's consider our possibilities:

1. break

This initially seems okay. This would be the idea:

do {
  valueIWantTheDoExpressionToEvaluateTo;
  break;
}

The only problem that comes to mind is that it conflicts with the breaking ability of other blocks (e.g., loops). Consider this:

// Yeah, I know, Array.prototype.indexOf(). Just stop thinking about practicality.

const array = [/*blah blah blah*/];
const element = blahBlahBlah();

const indexOfItemInArray = do {
  for (var i = 0; i < array.length; i++) {
    if (array[i] === element) {
      i;
      break; // <-- Problem: this break will break the for-loop, not the do-block
    }
  }
  -1;
};

As you can see, the do break could conflict with the break of other structures. In the above example, indexOfItemInArray will always be -1.

2. return

Like the break idea, this is almost perfect, but once again, it conflicts with an existing feature: the return of functions. That is:

const foo = do {
  function someInnerFunc() {
    // Oh no, I can't terminate the do-expression, because calling 'return' will just terminate this function, which is not what I want!
  }
};

// The same applies vice-versa
function bar() {
  const x = do {
    // Oh no, I can't make bar() return, because calling 'return' will just terminate this do-block, which is not what I want!
  }
}

3. break <value>

Borrowing from the idea of return <value>;, we could use break <value>;. To solve the conflict with loop breaks, the rule could be:

If there is no value after the break (i.e., break;), it terminates the loop, and if there is a value after it (i.e., break <value>;), it terminates the do expression.

Now our DIY indexOf() will actually work:

const array = [/*blah blah blah*/];
const element = blahBlahBlah();

const indexOfItemInArray = do {
  for (var i = 0; i < array.length; i++) {
    if (array[i] === element) {
      break i; // Since there is a value, it breaks the do, not the for.
    }
  }
  break -1;
};

There is one issue I can think of off the top of my head: backwards compatibility complications. Though few people use it (in fact, I don't know if it was ever widely used), JavaScript has a feature called labeled blocks (probably inspired by Java, but that's beside the point). I don't think this is common knowledge, so here's a description:

labeledBlock: {
  // Any statements can go here
  break labeledBlock; // You can break a specific block by using its label
}

// This comes in handy for breaking outer blocks from within inner blocks
outerBlock: {
  innerBlock: {
    break outerBlock;
  }
  alert('This message will never be shown, as this code will never be reached.');
}

The benefit of this feature is that you can break labeled blocks even if you are in a deeper block, which is a problem we encountered with ideas 1 and 2 (break and return).
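Labeled blocks can already emulate the effect of break <value> in current JavaScript, at the cost of a temporary variable. A minimal sketch of the DIY indexOf() written that way (`indexOfViaLabeledBlock`, `result` and the `search` label are hypothetical names):

```javascript
function indexOfViaLabeledBlock(array, element) {
  let result;
  search: {
    for (let i = 0; i < array.length; i++) {
      if (array[i] === element) {
        result = i;
        break search; // leaves the whole labeled block, not just the for loop
      }
    }
    result = -1; // only reached when the loop finishes without a match
  }
  return result;
}

console.log(indexOfViaLabeledBlock(["a", "b", "c"], "b")); // 1
console.log(indexOfViaLabeledBlock(["a", "b", "c"], "z")); // -1
```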

The backwards compatibility issue is that the compiler may mistake break <value> for break <label>. For example, what happens here:

foo: {
  const foo = getSomeValue()
  const bar = do {
    break foo; // What's being broken? Is the label being broken, or is the do block being terminated with the constant foo as its output?
  }
}

Since few people use labeled blocks, and giving your block label and variable the same name is a bad idea anyway, this shouldn't be too much of an issue.
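For reference, this is how the clash looks in current JavaScript without any do block. Labels and variables live in separate namespaces, so the code below is valid, and an identifier after break is always read as a label. `reachedEnd` and `seen` are hypothetical names used only to observe the behavior.

```javascript
let reachedEnd = false;
let seen;

foo: {
  const foo = "a value"; // same name as the label, and that is allowed
  seen = foo;
  break foo; // today this is always `break <label>`, never "break with value"
  reachedEnd = true; // never runs
}

console.log(seen, reachedEnd);
```

Any break <value> rule would therefore have to decide what happens when an identifier is both a label in scope and a variable in scope.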

4. break do <value>

This is the most verbose, but it is also the most expressive. Writing break do clearly indicates an intent to terminate the do block, not some other block. The only drawback is the mild verbosity (all that trouble for a simple do expression? Really?), but that's not too bad in my opinion.

Conclusion(ish)

I presented the terminator options in this order for a reason: so that you could follow my thought process. I'm not certain which option is best, but so far I'm leaning toward break do simply because of its expressiveness (its semantic function is literally in the name). I'm eager to hear your take on this. We will probably disagree, but at least now you understand my reasoning.

Addendum

There is one other big possibility I forgot: completely new syntax. There are a bunch of unused characters/character combinations to choose from, but I feel like none of these are a good option, because new syntax would steepen the learning curve excessively. Somebody unfamiliar with the do block could at least sort of deduce what the other terminator options mean (especially the ultra-expressive break do). If they instead saw some new operator-looking thing in an expression that was already foreign to them, that would probably give them a headache. For example:

// Pretend the new terminator is '->'
const foo = do {
  let x = f() // let, not const, because x is incremented below
  x++
  -> x * 5 // ????? Developer is thinking: What the heck is this '->' crap?
}
pitaj commented 6 years ago

An explicit terminator of some kind (not necessarily return) would greatly improve readability.

I just want to say that I do not at all agree with this. In fact, I believe it defeats the purpose of do expressions if required to return a value.

eloytoro commented 6 years ago

I believe it defeats the purpose of do expressions if required to return a value.

I'd like to hear more about that. In my view, if anything, it's the complete opposite: do expressions evaluate to a value, so it feels natural to declare imperatively which expression they evaluate to, right?

I think that evaluating to the last expression will force developers to start thinking about what each statement evaluates to, which is something most people don't bother with today. That adds unnecessary complexity.

@kylejlin not sure what you're aiming at with those examples; they're all versions of the same thing. Also, return inside a do expression is not ambiguous. It's as if you were confused by a function inside of another function, thinking that somehow the parser would screw up and mix it up with the enclosing function.

Maybe this has to be taken to a different issue, as the discussion is more about introducing (or not) a mechanism to immediately resolve do expressions to a value.

kylejlin commented 6 years ago

@eloytoro

It's as if you were confused by a function inside of another function, thinking that somehow the parser would screw up and mix it up with the enclosing function.

You're right. As I pointed out, there should be zero ambiguity with the do-return idea, just like there's no ambiguity with this:

function myFunction() {
  const foo = (function doExprPolyfill() {
    return 'stuff' // Obviously refers to doExprPolyfill
  })()
}

Playing devil's advocate:

... a function inside of another function

Technically, a do block inside of a function is not the same as "a function inside of another function," because the do is not quite a function. Though I personally believe a do is very close to an IIFE, many others disagree.

Traditionally, returns are scoped exclusively to functions, not blocks (i.e., you can return from an enclosing function from within an if/for/while/etc. block). So if somebody believes a do block is not at all an (immediately invoked) function, then in their eyes, scoping return to do would be inconsistent with the rest of the language, because the return would not be function-scoped.
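The function-scoped behavior of return can be seen by comparing a loop with a callback. A small sketch, with `findInLoop` and `findInCallback` as hypothetical names:

```javascript
function findInLoop(items, target) {
  for (const item of items) {
    if (item === target) {
      return "found"; // exits findInLoop, passing through the if and the for
    }
  }
  return "missing";
}

function findInCallback(items, target) {
  items.forEach((item) => {
    if (item === target) {
      return "found"; // exits only the arrow function; forEach ignores the value
    }
  });
  return "missing";
}

console.log(findInLoop([1, 2, 3], 2)); // "found"
console.log(findInCallback([1, 2, 3], 2)); // "missing"
```

Whether a return inside a do block should behave like the first function or the second is exactly the point on which the two camps in this thread disagree.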

Back to my perspective:

Personally, since I think of the do block as a pseudo-IIFE (yes, I am aware there are some semantic differences), I think an exception to the function-scoped return tradition can be made, since do is so close to an (immediately invoked) function. If an exception can indeed be made, then I am all for using return. The only reason I posted some alternatives was in case some people considered return a nonstarter.

Thom1729 commented 6 years ago

I believe it defeats the purpose of do expressions if required to return a value.

I would also like to hear more about this. In my opinion, the expression reads much more clearly with an explicit return. To me, the implicit return is a net minus even before we get into the corner cases. But I can't argue against the contrary position, because I don't understand it.

What is "the purpose of do expressions"? It sounds like there's a sort of "sweet spot" use case where you feel that return would get in the way. If that's the case, then I would like to see it. Concrete examples might help me to understand where you're coming from here.

As a longtime JavaScript developer, the IIFE-like story makes intuitive sense to me. That's not to say that a do-expression really "is" an IIFE, or that there are no differences, but that mirroring the function return convention would be the least surprising thing to do. In another language, that might not be the case -- for instance, in Ruby, where implicit returns were baked into the language from the beginning.

In addition, at the expense of wandering outside the scope of this proposal, using return would generalize very nicely to generator do-expressions:

// No need for generator arrow functions!
const wrapGenerator = (start, end) => generator => do* {
  yield start;
  yield* generator;
  yield end;
};

A generator do-expression might contain a return as well. Surely we wouldn't want to mix implicit returns in with explicit yields, right?
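Since do* is hypothetical syntax, here is a sketch of the same wrapper written with a generator function expression that runs today. It also shows that a generator's return value is a separate channel from its yielded values, which is the distinction the question above turns on. `numbers`, `wrapped`, `yielded` and `returned` are hypothetical names.

```javascript
const wrapGenerator = (start, end) =>
  function* (generator) {
    yield start;
    // yield* evaluates to the value the inner generator returned.
    const inner = yield* generator;
    yield end;
    return inner;
  };

function* numbers() {
  yield 1;
  yield 2;
  return "done";
}

const wrapped = wrapGenerator("<", ">")(numbers());
const yielded = [];
let step = wrapped.next();
while (!step.done) {
  yielded.push(step.value);
  step = wrapped.next();
}
const returned = step.value; // the explicit return, delivered with done: true

console.log(yielded, returned);
```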

Jack-Works commented 3 years ago

I have added inlay hints support to show all return points in the editor (https://github.com/microsoft/TypeScript/pull/42437)