elves / elvish

Powerful scripting language & versatile interactive shell
https://elv.sh/
BSD 2-Clause "Simplified" License

`except` must be in the same line as `}` #987

Closed tfga closed 2 years ago

tfga commented 4 years ago

This code is ok:

try { 

    fail bad 

} except e {            # <= ok!

    put $e 
}

But if you move except to the next line

try { 

    fail bad 
}
except e {              # <= here

    put $e 
}

you get this very mysterious error:

# compilation error: variable $e not found
# /home/tfga/sbin/exceptBug.elv, line 11:     put $e

Would it be possible to lift this restriction?

In any case, the error message is very puzzling. It took me a long time to figure out what exactly I had done wrong.

xiaq commented 4 years ago

Control flows in Elvish follow the same syntax as normal commands, so putting except on a separate line makes it a different command, no longer part of the try command.

Lifting this restriction will complicate Elvish's syntax a lot, but the error message should indeed be improved.

This can probably be done by always raising a compilation error if "except" is used as a command. The same can be done for "else" and "finally". This does prevent users from defining functions with these names, but calling a function else is almost certainly a bad idea anyway...
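The check described above can be sketched as follows. This is a minimal Python model, not Elvish's actual compiler (which is written in Go): the names `split_commands`, `check_heads`, and `CLAUSE_KEYWORDS` are hypothetical, and the splitter only understands braces and newlines, ignoring quoting, comments, and `;`.

```python
# Hypothetical model: a newline outside braces ends a command, so a leading
# "except"/"else"/"finally" becomes the head of a brand-new command.
CLAUSE_KEYWORDS = {"except", "else", "finally"}


def split_commands(source):
    """Split source into commands at newlines that are outside any {...}."""
    commands, current, depth = [], [], 0
    for ch in source:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "\n" and depth == 0:
            commands.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    commands.append("".join(current).strip())
    return [c for c in commands if c]


def check_heads(source):
    """Return one error message per command whose head is a clause keyword."""
    errors = []
    for command in split_commands(source):
        head = command.split()[0]
        if head in CLAUSE_KEYWORDS:
            errors.append(f'"{head}" must be on the same line as the preceding "}}"')
    return errors


# The two snippets from the original report.
good = "try {\n    fail bad\n} except e {\n    put $e\n}"
bad = "try {\n    fail bad\n}\nexcept e {\n    put $e\n}"

print(check_heads(good))
print(check_heads(bad))
```

In this model the first snippet is a single command headed by `try`, while the second splits into two commands, and the one headed by `except` is reported directly instead of surfacing later as a missing `$e` variable.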

krader1961 commented 4 years ago

Note that in POSIX shells a common means of shooting yourself in the foot is to create a program named test that is in your PATH (probably because you put . in the list of directories). There was a lengthy discussion about how to deal with this in fish. Sadly, I don't recall how they decided to make it harder for people to shoot themselves in the foot in this manner.

I agree that the simplest solution is to disallow functions with the same name as keywords such as except, else, finally. I'd probably take it a step further and not resolve to an external command with those names unless an explicit e: prefix is used.

tfga commented 4 years ago

Afterwards, I realized that this problem is not specific to except: it also happens with e.g. if. The other day I wrote something like this:

if ?(id = (f))  { echo 'id =' $id }
else            { fail 'No connected devices' }

which doesn't work, for the same reason. I had to move the closing } to the next line:

if ?(id = (f))  { echo 'id =' $id
} else          { fail 'No connected devices' } [1]

-- which is horrible.

Idk, maybe the design of the parser should be rethought? Maybe this one-size-fits-all, everything-is-a-command strategy is not working and the parser should be more like a regular PL parser, with hard-coded knowledge about things like try and if.

(I'm guessing a lot of stuff here. I haven't actually looked at the code. Please correct -- and forgive -- me if I'm wrong).


[1] In this case, I'd even like to be able to do away with the curlies:

if ?(id = (f))  echo 'id =' $id
else            fail 'No connected devices'

hanche commented 4 years ago

@tfga I think the better way to format your code example is as follows:

if ?(id = (f)) {
  echo 'id =' $id
} else {
  fail 'No connected devices'
}

zzamboni commented 4 years ago

@tfga the reason for this behavior is documented here: https://elv.sh/learn/effective-elvish.html#code-blocks

This is because in Elvish, control structures like if follow the same syntax as normal commands, hence newlines terminate them. To make the code block part of the if command, it must appear on the same line.

To be honest with you, this behavior also caught me off-guard a few times at the beginning, but once you understand it, it's no different than syntax rules in any other language.

tfga commented 4 years ago

> control structures like if follow the same syntax as normal commands

Exactly. What I'm saying is, IMHO, this is a bad idea. It might make sense from an implementation point of view, but it leads to bad UX.

krader1961 commented 4 years ago

I am of two minds on this issue. Like @zzamboni, I too was surprised, then annoyed, by this parsing behavior the first month I played with the language. So I can appreciate your perspective, @tfga. But a counter example is Python. Its indentation-based block definition model is quite unlike anything from the Algol family of languages (which includes C/C++). Anyone, like myself, used to Algol-style languages is likely to be annoyed by Python's syntax for the first few weeks. Yet it's one of my, and millions of others', favorite languages. Within two weeks I adjusted to Python's syntax rules. It took me only a little longer to become comfortable with the Elvish syntax rules.

As discussed earlier, it's probably better to just disallow, at compile time, functions/commands with the same name as keywords such as except, else, if or finally, thus alerting the user to the fact that their syntax is wrong. If someone really needs to run an external command with the same name as a keyword, they can use the e: namespace prefix.

Note that you can do elvish -compileonly /path/to/script to verify whether there are problems with the code. It's not clear whether a separate "lint" tool (similar to Go's go vet) is needed or warranted at this time.

tfga commented 4 years ago

> But a counter example is Python.

But even Python allows more flexibility than current Elvish.

A conditional, for instance, can be written in at least 3 different ways:

  1. Blocks have to be indented, on the next line:
    if cond:    
        return 1
    else:
        return 2
  2. Single statements can be placed on the same line:
    if cond : return 1
    else    : return 2
  3. And there's also an if expression (aka "the ternary operator"):
    return 1 if cond else 2

xiaq commented 4 years ago

Elvish is opinionated in its own way. I don't consider the fact that only one brace style is valid a shortcoming.

krader1961 commented 4 years ago

So I re-read this issue, again, and am still of the opinion that the only change that is uncontroversial is prohibiting the definition of Elvish functions with reserved keywords such as if, else, try and except. That wouldn't address the O.P.'s complaint but would make it harder for someone playing around to shoot themselves in the foot.

Addressing the O.P.'s complaint, without changing the grammar, would also require prohibiting external commands with the same name as a reserved keyword unless resolved using the external builtin or invoked with an explicit e: namespace prefix; e.g., e:except "I am an external command, not the Elvish except keyword". That solution may be worse than the problem it addresses.

krader1961 commented 4 years ago

This is a duplicate of issue #649 from 2.5 years ago. Both issues should be closed as "working as intended". I can't see a good reason to modify the syntax to allow consequent clauses to begin on a newline. Especially since the user can easily do so via line continuation if they feel strongly about that style being preferable:

> if ?(false) { put yes } ^
  else { put no }
no

That should be documented, probably as a FAQ, for the benefit of people who might not appreciate the consequences of https://elv.sh/learn/effective-elvish.html#code-blocks; assuming they have even read that text.

The unresolved questions are whether Elvish should:

1) Disallow defining functions with symbols, such as except and else, that are otherwise special when used in the expected context.

2) Require that invoking an external command named after one of those special symbols be explicit, that is, by using an e: prefix or the external builtin.

Implementing the above would, with minimal consequences, make it more obvious to new Elvish users that they have made a mistake when placing subordinate clauses on a new line. Note that other special symbols such as if and try are already special-cased by the compiler.

tfga commented 4 years ago

> if ?(false) { put yes } ^
>   else { put no }

With all due respect: don't you think we can do better than this?

This is not a solution: it's a workaround. ^ is "the parser workaround operator".

This is exactly the kind of quirk that makes people hate bash and that gives shell scripting a bad name. I thought the whole point of a project like elvish was to move away from things like that.

You went through all this trouble to implement all these high level constructs (exceptions, modules, closures) that make elvish a game-changer, but somehow keep insisting that syntax doesn't matter.

What we are asking for is not some revolutionary feature that has never been done before: it's just a basic syntactic convenience that can be found in every major PL.

@frantic1048

krader1961 commented 4 years ago

@tfga, I've been programming for a living since 1978 and have learned (and forgotten) a large number of programming languages. Yes, appeal to authority is not a good argument. 😄 My point is that every single programming language I have ever used annoyed me in multiple ways. Including some of my current favorites such as Python.

Even in the C language community, a language now 48 years old, there are significant differences of opinion regarding the "correct" way to format its source code and whether a particular change to its syntax should be made. Also, a "shell" language involves different trade-offs than a language like C, Go, or Python. I love Python but I would never use its interactive REPL mode as a command shell. Would you? If not, why not?

As a grey-beard who learned to program in C around 1985, I don't find the Elvish syntax (or "grammar") particularly troublesome. Even my somewhat ossified brain can manage to deal with this aspect of how the Elvish language differs from C and other Algol-like languages. The key detail here is the implied ; (semicolon) at the end of a line that does not end with the ^ continuation character. See the code chunk discussion. Yes, that definitely needs a clearer definition of the term "code chunk", in particular the significance of blocks defined by braces.

krader1961 commented 4 years ago

@tfga, See also this comment in issue #664.

zzamboni commented 4 years ago

@tfga I have to agree with others here. To be honest, this aspect of Elvish also bit me a few times at the beginning, but you get used to it and live with it. It's no different than syntax peculiarities or requirements in any other language.

In any case, personally I think that

if whatever {
  foo
} else {
  bar
}

is clearer and more readable than

if whatever { foo }
else { bar }

despite it being more compact.

tfga commented 4 years ago

@krader1961 @zzamboni Thanks for the replies.

> See the code chunk discussion

@krader1961 Your link points to localhost 🙂. Here's the right one: https://elv.sh/ref/language.html#code-chunk

> every single programming language I have ever used annoyed me in multiple ways.

Me too. Elvish already has quirks in other places, e.g.:

But these seem to have stronger reasons / be more difficult to solve (at least I don't have good solutions for them).

What's different here is that in this case I think the problem is avoidable.

The crux of the problem seems to be:

Special commands obey the same syntax rules as normal commands

Everything is parsed as a command first, then identified as "special" based on the first token.

What if this order was reversed?

  1. Get the next token
  2. Is it "special" (e.g. if)?
    • Yes => switch to if parser
    • No => switch to command parser

Then if wouldn't have to be a command anymore (i.e. wouldn't have to conform to the command syntax). Special commands would be able to have not only custom evaluation, but also custom syntax.

And I think this would be good for the parser in general. It would make it more flexible. It would open a whole new set of possibilities for the evolution of the language in the future.
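The keyword-first dispatch proposed above can be sketched in Python. This is a toy model under stated assumptions, not Elvish's real parser: the `Parser` class, its method names, and the token rules are all hypothetical, and quoting, pipelines, and `;` are ignored. The point it illustrates is that once `if` has its own parse routine, that routine is free to look past a newline for `else`.

```python
import re

# Hypothetical token types: "{", "}", "\n", or a bare word.
TOKEN = re.compile(r"[{}\n]|[^\s{}]+")


class Parser:
    def __init__(self, source):
        self.tokens = TOKEN.findall(source)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        self.pos += 1
        return token

    def skip_newlines(self):
        while self.peek() == "\n":
            self.pos += 1

    def parse_program(self, closer=None):
        """Parse statements until `closer` or end of input is reached."""
        statements = []
        self.skip_newlines()
        while self.peek() not in (closer, None):
            # Dispatch on the first token instead of parsing a command first.
            if self.peek() == "if":
                statements.append(self.parse_if())
            else:
                statements.append(self.parse_command())
            self.skip_newlines()
        return statements

    def parse_block(self):
        if self.next() != "{":
            raise SyntaxError("expected '{'")
        body = self.parse_program(closer="}")
        if self.next() != "}":
            raise SyntaxError("expected '}'")
        return body

    def parse_if(self):
        self.next()  # consume "if"
        condition = []
        while self.peek() not in ("{", None):
            condition.append(self.next())
        then_body = self.parse_block()
        # Custom syntax: look past newlines for an "else" clause.
        saved = self.pos
        self.skip_newlines()
        if self.peek() == "else":
            self.next()
            return ("if", condition, then_body, self.parse_block())
        self.pos = saved  # no else: leave the newlines for the caller
        return ("if", condition, then_body, None)

    def parse_command(self):
        """An ordinary command still ends at a newline."""
        words = []
        while self.peek() not in ("\n", "}", None):
            if self.peek() == "{":
                words.append(self.parse_block())  # nested block argument
            else:
                words.append(self.next())
        if not words:
            raise SyntaxError(f"unexpected {self.peek()!r}")
        return ("command", words)


with_else = Parser("if $ok { put yes }\nelse { put no }").parse_program()
without_else = Parser("if $ok { put yes }\nput done").parse_program()
```

Here `with_else` contains a single `if` node with both branches, while `without_else` contains an `if` node followed by a separate `put` command. The sketch says nothing about how such lookahead would behave at an interactive prompt, where the next line has not been typed yet.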

tfga commented 4 years ago

Also, you would be able to produce better error messages. 😊

krader1961 commented 4 years ago

> What if this order was reversed?

Only @xiaq is qualified to answer that question. I doubt anyone else has a sufficiently good understanding of the Elvish parser to provide a cogent answer why that is not a practical change. Prior comments by @xiaq on this, and other, issue(s) imply that is the case.

Ultimately, @tfga, it seems like what you are asking for is an alternative way to format Elvish source code that is still syntactically correct. I don't see any reason to support that. The Go language that inspires some aspects of Elvish is deliberately opinionated about how its source code should be formatted. Whether or not I agree with every aspect of the acceptable format (e.g., tabs versus spaces) I agree that there should be a single acceptable style for all syntactically significant aspects of the language. At least within a single project.

The only reason for changing the current behavior is to make it harder for Elvish users to make an understandable mistake. Or, at least, make any error messages related to that mistake easier to understand. Which is a worthwhile change but I question your motives in using that as a justification.

krader1961 commented 4 years ago

@tfga, You also don't seem to recognize how the syntax interacts with a REPL environment. If I type if ?(whatever) { do-whatever } in an interactive REPL then press enter what should happen? Should Elvish wait for me to type else {....?

tfga commented 4 years ago

> If I type if ?(whatever) { do-whatever } in an interactive REPL then press enter what should happen? Should Elvish wait for me to type else {....?

I really hadn't thought about this. 🤔 Off the top of my head, I would say... no. Because this is a complete statement by itself. Now, if the user had also typed else on the same line before pressing enter, then Elvish should wait for the else block.
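The rule suggested in this reply can be written down as a small predicate. This is a hypothetical Python sketch, not how the Elvish REPL decides when input is complete: `needs_more_input` is an invented name, and counting braces naively ignores quoting and comments.

```python
def needs_more_input(text):
    """Return True if a REPL should keep reading before evaluating `text`."""
    depth = text.count("{") - text.count("}")
    if depth > 0:
        return True  # an open block is still waiting for its "}"
    words = text.split()
    if not words:
        return False
    # A trailing "else" (or the "^" continuation marker) promises more input.
    return words[-1] in ("else", "^")


complete = needs_more_input("if ?(whatever) { do-whatever }")
dangling_else = needs_more_input("if ?(whatever) { do-whatever } else")
```

Under this rule the first line is evaluated as soon as enter is pressed, and the second keeps the prompt open until the else block is supplied.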

xiaq commented 2 years ago

A belated response to the suggestion in https://github.com/elves/elvish/issues/987#issuecomment-696183723: it's definitely possible to accept else on a separate line, but I don't feel it's worth doing.

The experience of Go suggests that "forcing" people to use this style has more advantages than disadvantages.

The remaining TODOs are tracked in #649 now, so I'll close this.