Open mo-xiaoming opened 1 year ago
There are other possibilities for it to go wrong
let
a = (2, 3);
(x, a) = a; // rebind a
in
print(a); // outputs 3
and
let
a = {x=3, y=4};
{x=p, y=a} = a; // rebind a
in
print(a); // outputs 4
Being a Haskell-alike lang, those behaviors should not be allowed.
But fixing them takes more work
Let bindings are allowed to shadow variables in the most functional of all FPs: However, there's a subtle (known) bug in pattern bindings that you may want to investigate further
But in the examples I gave, they are in the same let clause let a = 3; a = 4 ...
not nested ones like let a = 3 in let a = 4...
. The latter is clearly name shadowing (keyword let
and different scopes), the former one is rather confusing to say the least
Don't let do
notations or let block
déclarations fool you, they're eventually rewritten as a series of nested let
expressions.
do { a = 1; a = 2; print(a); }
is morally equivalent to let a = 1 in let a = 2 in let _ = print(a) in ()
Note, the variable _
can be the subject of multiple let bindings:
do { print(2); print (3); }
becomes let _ = print(2) in let _ = print(3) in ()
. There's nothing wrong in this
Don't let
do
notations orlet block
déclarations fool you That's my beef with it, it fooled more than me
I was aware of this rewriting to nested let
behavior, after going through yacc file. However, we cannot and should not expect other people to understand this by reading parser code
do { a = 3; a = 4; }
looks like a reassign or rebind, and it turns out to be neither.
Based on the principle of least astonishment, we have some options,
Hobbes has no notion of rebind
, so I'm not sure where you're pulling this from. The doc only mentions local variable introductions, which is what is happening here. My guess is, those getting fooled by this notation are expecting the semantics of the language to be the same as that of C++. This is wrong, and people should read the doc of the language instead of making assumptions on how it's supposed to work.
Does the following also look illegitimate to you? do { a = 1; (do { a = 2; print a;}); print a; }
? It's no different that than the notations you're trying to block in the above. It is perfectly legit to have different environments reusing the same name. They are in different environments after all. In imperative languages, they call these blocks
, and use {
... }
to delimit them. In Hobbes, we'd use (
... )
as shown above ^. This is done in this way, because there are no statements
in hobbes, only expressions that evaluate to values (same as in Haskell, Rust, ...).
Also, note that your current change addresses this issue only from the perspective of the parser. There are more ways (especially using the C++ bindings) one can build let
bindings and come with local definitions that'd bind the same name in nested environments as in the above. RPC can also be used to build these very same let
bindings. In a sense, this change doesn't work.
One more thing, the title of the ticket implies that there's a notion of "the same scope on do or let" in Hobbes. This is not true and it is misleading as, otherwise, these scopes would be somehow destructive. Each introduction of a new binding creates a totally new scope, different from the previous one. It's described in the doc in the section "local variable definitions". It may be that the doc isn't clear enough, so it should be improved (do
notations aren't documented, so they need to be added).
I did mention earlier a more subtle (serious bug) issue in hobbes with the handling of pattern bindings
, I actually did open a few tickets, here's one of them:
The pattern matching engine is a critical piece of Hobbes, and the above 2 issues are the closest to what you're trying to fix here. I believe that, fixing them will also address the let (x,a) = a...
issue you pointed at earlier
Hobbes has no notion of
rebind
, so I'm not sure where you're pulling this from. That's the problem, in C++/Java/Rust/Haskell, a{}
defines a scope,{ a= x; a = y; }
is a rebind/redefine/reassign/re..., doesn't matter language itself allows it or not. I put this in title because it looks like name rebind in the same scope (I was aware of nested scopes get created at the time, but I still chose this very title because that's what it looks like)
But how could a hobbes programmer expect names in a simple {}
creates nested blocks? Hobbes has different rules, names in {}
creates nested scope, not rebinding/re..., doesn't matter how much it looks like, that's creates surprises, surprises are bad unless for a really really good reason. in C++/C/Java, same name cannot be reused in {}
, but it is ok in { { } }
, Haskell is no exception. (Rust allows its, but hobbes is not rust-alike)
If there's a good reason to be so, then it should be better documented. If not, then this behavior should be changed
As for the Local Variable section in doc. It uses an let
example, we understand whatever names defines in let
are only valid till the end of let .. in
expression, because it has the same behavior as Haskell
However, the problem we're talking about is more like this, still in REPL
> a = 3
> a = 4
stdin:1,5-5: Variable already defined: a
I know REPL works differently, but people are expecting "If it looks like a duck, swims like a duck, and quacks like a duck, then it probably a duck"
How do we expect others to spend time on understanding this subtlety for a simple =
?
Hobbes has no notion of
rebind
, so I'm not sure where you're pulling this from. That's the problem, in C++/Java/Rust/Haskell, a{}
defines a scope,{ a= x; a = y; }
is a rebind/redefine/reassign/re..., doesn't matter language itself allows it or not. I put this in title because it looks like name rebind in the same scope (I was aware of nested scopes get created at the time, but I still chose this very title because that's what it looks like)But how could a hobbes programmer expect names in a simple
{}
creates nested blocks? Hobbes has different rules, names in{}
creates nested scope, not rebinding/re..., doesn't matter how much it looks like, that's creates surprises, surprises are bad unless for a really really good reason. in C++/C/Java, same name cannot be reused in{}
, but it is ok in{ { } }
, Haskell is no exception. (Rust allows its, but hobbes is not rust-alike)If there's a good reason to be so, then it should be better documented. If not, then this behavior should be changed
What you say about Haskell isn't true! {...}
in haskell delimits a syntactic block. Curly braces introduce no new variable bindings at all, let alone, they don't even define a scope in Haskell. So I'm unsure what makes you say "Haskell is no exception".
Hobbes is inspired from Haskell, not C, not Java, and not C++. Again, it's important to read documentation, and not make assumption that everything that uses curly braces quacks, then it's [like] C++.
As for the Local Variable section in doc. It uses an
let
example, we understand whatever names defines inlet
are only valid till the end oflet .. in
expression, because it has the same behavior as Haskell
Hopefully you do! And if so, then it should be easy to explain what happens in the REPL. When you let a = 1 in a + 1
, you are introducing a new variable a
and binding it to the value 1
. At the end of the let
expression, you can't use a
as a replacement for 1
anymore. In a sense, the scope
has starting point and an ending point with let
expressions.
When in the REPL, what would that scope
be? When you first write a = 3
in the repl, you are really saying "From now on, a is 3". In Hobbes REPL, this is true for the rest of your REPL session (that is, until you shut it down). Haskell GHCi allows rebinding a
to a different value, but Hobbes doesn't. Two design choices that work, and I can understand both sides.
However, the problem we're talking about is more like this, still in REPL
> a = 3 > a = 4 stdin:1,5-5: Variable already defined: a
I know REPL works differently, but people are expecting "If it looks like a duck, swims like a duck, and quacks like a duck, then it probably a duck"
How do we expect others to spend time on understanding this subtlety for a simple
=
?
Document it in simpler words that =
is not assignment
at all. It's equivalent to a proposition. They should read a = 1
as variable a binds to the value 1
, or a is 1
. Saying a is 1
and a is 2
is absurd, at least in propositional logic.
If you want to assign a new value, then you have the <-
operator. The following should then work (if you disable checks for unsafe calls):
a = newPrim() :: int
a <- 2
a <- 3
foo = \_ -> let a = 3
a = 4
in a + a
reports
<source>:1:17: error:
Conflicting definitions for `a'
Bound at: <source>:1:17
<source>:2:17
|
1 | foo = \_ -> let a = 3
| ^^^^^...
in Haskell. Even for people have c++/c/java background expect this behavior
In hobbes,
foo = \_.let a = 3;
a = 4;
in a + a
foo()
gives 8
I might argue there are some similarities in those code snippets, but surprisingly differently behaviors.
I'm not saying hobbes IS haskell, despite multiple places in documentation mentions haskell. I'm saying hobbes should have the same behavior as haskell
But at least, we should limits language surprises as much as possible. Just like you said
Saying a is 1 and a is 2 is absurd, at least in propositional logic.
Then at least, we should let compiler panic and tell programmer his/her code probably has logic error, error message can be something like
name conflicts, rebinding not allowed in hobbes
1. if you want to reassign, please use <- instead of =
2. otherwise, please use different names to avoid confusion
The following works in Haskell, and that's the simplest behavior for let that Hobbes is also implementing
ghci> foo = \_ -> let a = 1 in let a = 4 in a + a
ghci> foo ()
8
Like I said earlier, Hobbes rewrite all let
expressions in the above form, and there's nothing wrong with this. Haskell doesn't always do that, as it also supports let
constructs for various cases (monadic, comprehensions, let blocks, ... all of them are different things in Haskell).
There are various ways to implement let
expressions, and some of them are even used in monadic operations. Hobbes obviously doesn't support all of them (we don't have real monads there); it's a matter of choice:
Then at least, we should let compiler panic and tell programmer his/her code probably has logic error, error message can be something like
I believe you didn't follow my previous explanation; that's exactly what the compiler complained about when you tried it in the REPL:
> a = 3
> a = 4
stdin:1,5-5: Variable already defined: a
It's a matter of scope definition, that's the point I was making earlier. In a REPL, that scope is the whole session. In a compiled program, that scope is delimited by let
expressions, so there's no possible conflict.
I'm not saying hobbes IS haskell, despite multiple places in documentation mentions haskell. I'm saying hobbes should have the same behavior as haskell
Haskell allows the following, which Hobbes doesn't allow. Your whole point is that we shouldn't allow binding variable names 2x in let expression; and Haskell does allow this in certain cases. And you want Hobbes to have the same behavior as Haskell, which is to allow binding variable names to new values as in Haskell.
ghci> let a = 1
ghci> let a = 4
ghci> a + a
8
The above is plain Haskell. I don't follow you anymore.
If you are so concerned about the possibility for Java developers to get confused about the above, then the issue is that they don't understand the rewrite rules around let blocks. Remove that rewrite rule around syntactic sugars such as do
and series of let
. Then let (pun unintended) these devs write ground expressions with the simpler syntax
let x = value in expr
. No more do
blocks, no more let
blocks in Hobbes. I have to warn you, these syntactic sugars are there for a reason: that is to increase expressiveness for Hobbes developers.
Let me set the record straight on what I've been talking about
let a =3; a =4;...
to let a0=3; let a1=4...
internally=
Everything boils down to, in your words, a=3; a=4
is "absurd", but compiler silently accept that code, is this a fact?
My whole point from my first post is simple: Should the compiler be more helpful that it should tell programmers that theirs code makes no sense?
Let me set the record straight on what I've been talking about
- I've never said we should stop rewriting code from
let a =3; a =4;...
tolet a0=3; let a1=4...
internally- What I said is, this rewriting has corner cases confused programmers, doesn't matter they got this kind of code by accidently copy-past, forgot what's going on in the beginning of this block, or even genuinely believe this code is legit
- The two examples I gave in my previous post, they are not from REPL. Like I mentioned above, I do know hobbes REPL behaves differently , so it should not be used for comparison
Why not? the principles of let evaluations stay the same if you understand they always evaluate in a context of variable assignments. These principles are the same whether you're in the REPL or in a compiled program. I strongly recommend you read about propositional logic and their evaluation judgments if you haven't already, because this is no different.
- Normal programmers do not nor should not care about the complier design internals for a simple
=
This is highly subjective. What's a normal programmer? A Java developer? A C++ dev? or one who reads about the semantics of a symbol such as =
before applying it?
If that's not clear,then it is important you document/remind to your users the semantics of =
and clarify that it's not the same as the one in C++ or Java (with both having changing semantics for =
. There, it's either declaration or assignment).
In Hobbes,
<-
=
In C++,
=
=
It's okay to be different. Hobbes is not a C++ replacement language, and it serves different purposes.
Everything boils down to, in your words,
a=3; a=4
is "absurd", but compiler silently accept that code, is this a fact?
You are ignoring the context (the scope), so this isn't 100% correct. This is absurd if you consider a single scope (as is done in the REPL), but it is not absurd if each binding introduces different scope (as that's the case in let
evaluation for compiled programs).
One can't make a statement, show it's True for a single instance, then consider it a proof of it being universally True like you do here.
My whole point from my first post is simple: Should the compiler be more helpful that it should tell programmers that theirs code makes no sense?
Good point, but remember: the example you are flagging as being universally
absurd is no different than
do { a = 3; (do { a = 4; print a;}); print a; }
. Would you call this absurd
too?
How about the following? Is this absurd to you as well?
mo.C ✕ +
1 auto main () -> int {
1 │ auto a = 3;
2 │ │ a = 4;
3 │ return a+a;
4 }
I'm surprised to see your response to the word "absurd", that makes me realized that I might misunderstand what you've said
It's equivalent to a proposition. They should read a = 1 as variable a binds to the value 1, or a is 1. Saying a is 1 and a is 2 is absurd, at least in propositional logic.
say, in hobbes code, when you see something like
let
...
a = ...
...
a = ...
in ...
Current compiler is happy to compile this code, but are there any better(correct?) ways to do whatever the programmer wants it do?
From your previous post, I thought your answer is Yes, there are better ways to fulfill his/her purpose
Seems I got your answer wrong. So it really depends on your answer to this, then we might have a totally different discussion
My standing on this PR is simple, it is all about ergonomics
=
supports to be simple, IMO, it is unreasonable to expect programmers to read through papers and thesis before pressing the =
keya=...;a=...;
in hobbes, I believe there might be better/clearer ways to do whatever the author of this code wants to do. Being fit in theory is less important. Let the compiler to be programmers' friend is more importantLet's say we have a c++ code like this
struct X {
T foo() const;
};
Any C++ compilers will be happy to compile this code, because it is legit C++. but clang-tidy still warns me that it might be a good idea to add [[nodiscard]]
, because it is a pure function, without using its return value, I probably just heat up cpus for no reason, I either forget to remove some dead code, or there are logic errors buried deep in my code. And C++ is infamous for lacking of proper tools as you're well aware
We write bugs with perfect grammar, tools can help
I couldn't figure out what kind of functionalities in reality can only be achieved by a=...;a=...
, so I consider programmers are making mistakes when they write code like this
What you want is a way for the compiler to check for potential name shadowing and flag them for you. You don't want to prevent advanced users from writing these expressions. You want a way to lint your programs, so the compiler can warn you on potential name shadowing issues. Make it an option, then choose a default that works for the largest audience.
Guess what? We do have support for Optional flags and the ability to extend the compiler with new flags. @dawa79 (Dan Wang) can walk you through how he added the -o no-Safe
option extension to Hobbes, and you could do something similar.
There's a macro expansion
stage in the compiler pipeline, where you can run your checks. Even better, you could run your linter at the same place as where Dan wrote the checks for unsafe
expressions.
Don't ever change the parser though, for no reasons. You hardly ever need to change the parser, except for compelling cases. See the parser as the contract Hobbes has with its users. You never want to break this contract.
I couldn't figure out what kind of functionalities in reality can only be achieved by a=...;a=..., so I consider programmers are making mistakes when they write code like this
I'm lazy and find it hard to invent new variable names each time I need one. Seriously, I don't think variable names are an infinite resource; they are not. The compiler is the buddy that figures names out for me, so I can focus on engineering solutions for harder problems.
Isn't that the whole point about Lambda calculus? Names don't really matter... But I'm digressing, I'll stop it here now.
for the moment, following code compiles
This behavior is weird for a functional programming lang
(edited) and for
let
expressionThis should be allowed neither