fubark / cyber

Fast and concurrent scripting.
https://cyberscript.dev
MIT License

Rethink var declaration rules? #11

Closed icefoxen closed 1 year ago

icefoxen commented 1 year ago

Stumbled across the language website from lobste.rs and thought the language was pretty cool! One comment though, considering the test program:

func foo():
    a = 234
    print a

foo()
print(a)

This prints "234" "none", so a is a local variable being declared.

If you change it to:

a = 1
func foo():
    a = 234
    print a

foo()
print(a)

It prints "234" "234", so a is a variable from the enclosing scope, being assigned. There's two very different operations here that look identical, and differ only in context that may be 1000 lines of code away from where it matters.

This seems like a pretty good footgun, since it breaks locality. There's no way to look at func foo() alone and tell whether or not it changes state outside of it. Lua's "every variable is a global unless declared otherwise" has similar effects and is IMO a pretty good example of how much of a PITA this can be, so I humbly beg you to consider whether this is really what you want.

Thanks!

fubark commented 1 year ago

This is a tricky problem... it boils down to whether writes to variables should use the same name lookup as reads. For variable reads, I think it's pretty favorable to look upwards until a reference is found. It makes closures easier to write and makes global variables/functions easier to use.

So then for variable writes, if it favors locals first, it means you'd need extra syntax like python. I'm somewhat ok with that as I like the idea of not having an extra keyword for locals and not worrying about variables leaking. But it would require users to keep track of two different behaviors when dealing with variables. And what if you wanted to do a simple write in a closure, would that also need additional syntax?
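Since Python is the comparison point here, a minimal Python sketch of the "two different behaviors" may help: reads resolve by looking outward, while a plain assignment binds a new local. This only illustrates Python's rules, not Cyber's; the names `limit`, `read_limit`, and `write_limit` are made up for the example.

```python
limit = 1  # module-level variable

def read_limit():
    # A read is resolved by looking outward until a binding is found.
    return limit

def write_limit():
    # A plain assignment binds a new local; the module-level name is untouched.
    limit = 234
    return limit

inside = write_limit()   # value of the local created inside the function
outside = read_limit()   # module-level value, unaffected by write_limit()
```

The write did not leak, but the reader has to remember that `limit` means different things in the two functions.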

Another twist to consider in Cyber is that functions have their own scope but subblocks within that function do not. So ifs, loops, they all share the same variable scope as the function.

icefoxen commented 1 year ago

For me it's more a matter of declaration and assignment having the same syntax. (Which is something I already prefer to avoid, 'cause I make typos.) That combines with assignment's context-sensitivity to make the surprising behavior possible.

For variable reads, this is fine because you can't sneakily screw something up by reading the wrong variable. Outside the current function, anyway. (Reading undeclared variables returning "none" is a different issue.)

> So then for variable writes, if it favors locals first, it means you'd need extra syntax like python.

Not 100% sure, but I think you would need something like this anyway. One way or another you need to be able to tell it whether you are referring to something inside or outside of the local scope. Python has self., Ruby has its class var decorators, Lua has local decl's, etc.

fubark commented 1 year ago

I think you're right. Leaking writes is just far worse than leaking reads. Here's what I propose:

  1. To write to a static variable (declaration like var, func, import) from a local scope, you need to declare that variable with a static keyword.

    var a = 123
    func foo():
        -- or `static a = 123`
        static a
        a = 234
    foo()
    print a       -- "234"

  2. To write to another local that is above your scope (captured var), you need to declare that with the upvar keyword.

    func foo():
        a = 123
        b = func():
            -- or `upvar a = 234`
            upvar a
            a = 234
        b()
        print a     -- "234"
    foo()

  3. let is removed. Assignments by default refer to the local scope. This makes more sense in Cyber's functions since assignments in subblocks refer to the same variable.

  4. Reads by default continue to work as they do now, looking upwards for a captured var or a static var. This stops being the case if the local variable has the static or upvar modifier.
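For readers who know Python, the proposal resembles `global` and `nonlocal`. The sketch below is only a loose analogy, assuming `static` behaves roughly like `global` and `upvar` roughly like `nonlocal`; Cyber's actual semantics may differ, and the function names are hypothetical.

```python
a = 123  # module-level, loosely analogous to `var a = 123`

def set_static():
    global a        # plays the role of `static a`
    a = 234

def outer():
    b = 123
    def inner():
        nonlocal b  # plays the role of `upvar b`
        b = 234
    inner()
    return b

def shadow():
    a = 999         # no modifier: the assignment creates a new local
    return a

set_static()
static_result = a          # the module-level variable was rewritten
captured_result = outer()  # the enclosing function's local was rewritten
shadow_result = shadow()   # the local value
after_shadow = a           # the module-level variable is untouched by shadow()
```

Under this model, a function with no modifier in its body cannot change state outside itself by plain assignment, which is the locality property asked for in the original report.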

ifreund commented 1 year ago

Here's another relevant example that currently has behavior I find extremely confusing:

x = 4
func foo():
    var x = 42
    print x

foo()

The output here is 4, but I don't think anyone reading this code without prior knowledge of the language would expect that.

With regards to your proposal to resolve this, I think that sounds very reasonable though I would propose replacing the keyword upvar with capture as I think capture a = 234 reads better.

I also agree with @icefoxen with regards to declaration and assignment using the same syntax being a source of bugs in general. Consider the following example which prints 42:

func foo():
    really_long_variable_name = 42
    -- do a lot of stuff
    really_long_varaible_name = 3
    print really_long_variable_name

foo()
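The same hazard exists in any language where declaration and assignment share one syntax. A minimal Python sketch of the example above (Python is used only for illustration; the `names` variable is added here to make the second local visible):

```python
def foo():
    really_long_variable_name = 42
    # The misspelled name below silently becomes a second local variable.
    really_long_varaible_name = 3
    names = set(locals())  # snapshot of the locals defined so far
    return really_long_variable_name, names

value, names = foo()
```

No error is raised: `value` is still 42, and `names` contains both spellings.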

fubark commented 1 year ago

I like capture. I wonder if there's a way to do a simple declaration without a keyword. It can't use := since colons are pretty much reserved for the start of a block.

dumblob commented 1 year ago

Hm, this discussion reminds me a little of the way one needs to handle variables in Tcl. It is not the most user friendly nor safe way...

But I am not against any solution in this space, unless it would require variable shadowing and thereby effectively rule out implementing #26.

fubark commented 1 year ago

One reason I'm conflicted with having a separate local declaration is that doing it in sub-blocks can also be confusing due to the property that every variable in a function belongs to the same scope. This property came about from the need to reduce the amount of refcounting and to utilize the same reserved virtual register as much as possible. So in this example, it would be allowed but it's confusing:

func foo():
    if true:
        let a = 123
    print a      -- "123"
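Python has the same function-level scoping property, so it can serve as a rough illustration of why a declaration inside a sub-block can mislead. This is a sketch of Python's behavior, not Cyber's; `foo` and `loop_var` are example names.

```python
def foo():
    if True:
        a = 123  # assigned inside the `if` block
    return a     # still visible: the block did not open a new scope

def loop_var():
    for i in range(3):
        last = i * 10
    # Both the loop variable and `last` outlive the loop body.
    return i, last

foo_result = foo()
loop_result = loop_var()
```

A declaration keyword inside the `if` would suggest block-local lifetime that the language does not actually provide.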

Another reason is that I think baking in a language construct to prevent a misspelling (as in ifreund's example) is not as justifiable in a scripting language. This can be solved by an editor with autocomplete.

Thirdly, one of the goals in Cyber is to reduce the number of compile errors, or any other "stop what you're doing, you can't evaluate the rest of the code until this one thing is fixed" moments.

fubark commented 1 year ago

This has been implemented. Please open another issue if you have concerns about the new declaration rules.