elves / elvish

Powerful scripting language & versatile interactive shell
https://elv.sh/
BSD 2-Clause "Simplified" License

Allow using dynamically built strings for external commands #764

Closed xiaq closed 4 years ago

xiaq commented 6 years ago

Background

Right now, Elvish always resolves commands at compile time, and a dynamically built string cannot be used as a command. Compile-time resolution only accepts a single string literal. For instance:

echo arg # OK: a string literal is resolved at compile-time
e'cho' arg # not OK: a compound expression, not a string literal

The rationale is that references to functions must be resolved statically for lexical scoping to work correctly. If dynamically built strings were allowed (which used to be the case), consider the following code:

fn f {
  fn g { echo 'g called' } # [1]
  fn h {
    x = g
    $x # [2]
  }
  h
}

The user would expect the call at [2] to refer to the g function previously defined at [1]. However, this is not feasible to implement; I am not going into details here, but ultimately it is because lexical scoping requires names to be resolved statically.

Problem

This rule has some unintended consequences. For instance, neither of the following is allowed:

~/bin/cmd arg # not OK: a compound expression of `~` + `/bin/cmd`
./$dir/cmd arg # not OK: a compound expression

Instead, an explicit external must be used:

(external ~/bin/cmd) arg
(external ./$dir/cmd) arg

Proposed Solution

Since building the paths of external commands is a pretty common use case, we can re-allow the use of dynamically built strings as commands, but always resolve them to external commands. However, doing this naively can have some other unintended consequences:

echo arg # static command; refers to the builtin function "echo"
e'cho' arg # dynamically built command; refers to the external command "echo"

A compromise is to only allow such dynamically built commands if they contain at least one slash or backslash. Since slashes and backslashes are not allowed in function names, this means that no ambiguity can arise. For instance:

e'cho' arg # Results in an error: dynamically built strings cannot be used as commands unless they contain / or \
~/bin/command arg # OK: contains slashes
$dir'\command' arg # OK: contains a backslash, useful on Windows
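
To make the proposed rule concrete, here is a rough sketch in Python. It is a toy model, not Elvish's actual implementation: `Head`, `resolve` and the `functions` set are hypothetical names, and the fallback of a static literal to an external command when no function matches is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Head:
    """The command position of a form, as a hypothetical compiler might see it."""
    text: str         # the literal text, or the value computed at runtime
    is_literal: bool  # True if the head was written as a single string literal


def resolve(head, functions):
    """Classify how a command head would be resolved under the proposed rule.

    `functions` is the set of function names visible in the current scope.
    """
    if head.is_literal:
        # Static case: a function in scope wins; otherwise assume an external.
        return "function" if head.text in functions else "external"
    # Dynamic case: only allowed when the value cannot be a function name.
    if "/" in head.text or "\\" in head.text:
        return "external"
    raise ValueError(
        "dynamically built strings cannot be used as commands "
        "unless they contain / or \\"
    )
```

With `functions = {"echo"}`, `resolve(Head("echo", True), functions)` gives `"function"`, `resolve(Head("~/bin/command", False), functions)` gives `"external"`, and `resolve(Head("echo", False), functions)` raises, which matches the three cases above.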

krader1961 commented 6 years ago

Perhaps I'm drunk and missing the obvious. Why isn't the example involving nested functions handled using closures?

xiaq commented 6 years ago

@krader1961 Closures are precisely why the function names have to be resolved statically.

The runtime representation of a closure does not carry all the outer lexical scopes of the closure; it only captures the variables that are referenced from the closure (upvalues). Deciding which names get captured as upvalues happens at compile-time; hence the compiler needs to know the command names statically. In the contrived example, the compiler can reason that $x always contains g, so the function g should be captured. But in the more general case, the compiler cannot.

Capturing all values in the outer lexical scopes is an option, but that is wildly inefficient and I don't think any language actually does that.
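
The capture decision described above can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions, not how Elvish's compiler is written: `captured_upvalues` and its parameters are hypothetical, and functions and variables are treated as living in one flat set of names.

```python
def captured_upvalues(static_refs, local_names, outer_names):
    """Return the outer names a closure captures, decided before it runs.

    static_refs: names the compiler can see written in the closure body.
    local_names: names defined inside the closure itself.
    outer_names: names defined in the enclosing lexical scope.
    """
    return {
        name
        for name in static_refs
        if name not in local_names and name in outer_names
    }


# Names defined in the body of f from the earlier example.
outer = {"g", "h"}

# h written as `x = g; $x`: the only name the compiler sees in command
# position is the local variable x, so g is never captured.
dynamic_h = captured_upvalues(["x"], local_names={"x"}, outer_names=outer)

# h written as a plain `g`: the literal command name is visible statically.
static_h = captured_upvalues(["g"], local_names=set(), outer_names=outer)

print(dynamic_h)  # set()
print(static_h)   # {'g'}
```

In the dynamic version the closure for h ends up with no reference to g at all, so there is nothing for the call through $x to find at runtime unless every outer name is captured unconditionally.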