Closed bar-g closed 8 months ago
A proposed idea to fix this "same syntax is used for wildly different semantic" problem in python: https://github.com/oilshell/oil/issues/1796
Not sure what to do here; it now seems to me it's working as intended, but it's very surprising, inconsistent, and non-obvious behavior.
Maybe we can discuss the ideas at https://github.com/oilshell/oil/issues/1796 and I can close here with a sensible conclusion.
Yes so this is a conflict between
I think @Melkor333 brought this issue up as well
My answer then is that if you stick to the "proc subset" of YSH, then you don't have this issue
var myarray = :|one two three|
myproc @myarray # splicing copies the arguments
But if you do
myproc (myarray)
then you are passing a reference to a mutable container, and then you have this potentially unexpected behavior.
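For readers coming from Python, the splice-versus-reference distinction above has a rough analogy. This is only a sketch in Python, not YSH; the function names `myproc_spliced` and `myproc_by_reference` are made up for illustration.

```python
def myproc_spliced(*args):
    # *args arrives as a fresh tuple, so the caller's list cannot be touched
    local = list(args)
    local.append('four')
    return local


def myproc_by_reference(items):
    # items is the caller's list object itself, so append is visible outside
    items.append('four')


myarray = ['one', 'two', 'three']

spliced_result = myproc_spliced(*myarray)   # roughly like: myproc @myarray
after_splice = list(myarray)                # snapshot: unchanged

myproc_by_reference(myarray)                # roughly like: myproc (myarray)
after_reference = list(myarray)             # snapshot: now has 'four'
```

After the spliced call the caller's list still has three items; after the by-reference call it has four.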
There is a difference between
List and Dict
As of Oils 0.20.0, we will actually print these differently! List and Dict get an address, to show you it's a container
ysh ysh-0.19.0$ = []
(List 0x7fed5548e8c0) [] # <= NOTE ADDRESS HERE
ysh ysh-0.19.0$ = 42
(Int) 42
The issue is rebinding vs. mutating
value.Place
lets you rebind an Int
or a List
But to mutate a List
, you don't need to use value.Place
. You just pass the list itself, and then you mutate it with
call mylist->append('foo')
setvar mydict.key = 'zzz'
etc.
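To make "rebinding vs. mutating" concrete, here is a small Python model. It is an assumption-laden sketch: the `Place` class below is a hypothetical stand-in for value.Place (a name inside some scope), and is not how Oils implements it.

```python
class Place:
    """Hypothetical stand-in for value.Place: a name inside a scope."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name

    def set_value(self, new_value):
        # Rebinding: the name now refers to a different object of any type
        self.scope[self.name] = new_value


def mutate_in_place(container):
    # Mutating: same object, new contents; no Place needed
    container['key'] = 'zzz'


def rebind_name(place):
    place.set_value('this can be a string now')


scope = {'mydict': {}}
original = scope['mydict']

mutate_in_place(scope['mydict'])
same_object_after_mutation = scope['mydict'] is original

rebind_name(Place(scope, 'mydict'))
type_after_rebind = type(scope['mydict']).__name__
```

Mutation keeps the name pointing at the same Dict, while rebinding through the place can swap in a value of a different type.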
Possibly we could allow &mylist
as a no-op? Not sure
Possibly we could allow &mylist as a no-op? Not sure
That sounds like a good part of a solution to make the different behaviour of rebinding/mutating vs. copying when passing or assigning a variable apparent! Let's see if I get you right:
Being able to specify at the definition-site what behavior a proc/func expects seems like a good idea to me.
So, for example in
proc blank-var (&x)
the &
would mean to require that passed arguments will always be mutated "in-place".
x
will affect the outer variable
And as a consequence of specifying the &
in the definition, all calls of this proc would be required to also specify the &
sigil:
var mydict
blank-var (&mydict)
I thought about it a little more, I think any no-op should be separate from &x
to avoid confusion
&x
is for rebinding / re-assigning the name x
&&x
or +x
could be to "let you know" we're going to mutate this container
I was thinking +x
because it's already a no-op on integers. Although it's confusing because +'55'
may already do something
(Hm actually I just noticed +x
doesn't crashes! Need to fix it)
Anyway you could imagine something like
var mydict
clear-dict (mydict)
clear-dict (+mydict) # same thing
clear-dict (&&mydict) # another syntax
rebind-name (&mydict) # this is different
Hm now that I think about it, +x
breaks a "language design principle"
https://github.com/oilshell/oil/wiki/Language-Design-Principles
Even though I never use +x
in an arithmetic context, it is valid in JavaScript and C as type conversion to integer
So we probably shouldn't use that
%x
available, or &&x
kinda makes sense
*x
would break the principle since it looks close to Python splat
~x
is overloading an operator, probably not
^x
is similar to a quotation, probably not
I find &&
a bit noisy but maybe it's OK
Oh actually !x
is possible!! In Ruby and Lisp that sometimes means "mutate", so it could be useful
clear-dict (x)
clear-dict (!x)
However in Lisp it's part of the name
clear-dict! (x)
So actually that could be another thing -- we could allow !
in names
var x = clearDict!(x)
clear-dict! (x)
I think any no-op should be separate from &x to avoid confusion
&x is for rebinding / re-assigning the name x. &&x or +x could be to "let you know" we're going to mutate this container
I'm not sure why or where you think what could be confused as what. Maybe I don't understand yet what the rebind-name proc actually does in the example.
rebind-name (&mydict) # this is different
Do you mean mydict is put in a place?
I noticed and you wrote something like value.Place can hold Dict, but is it actually necessary to allow mutables in value.Place? If there is no need, couldn't value.Place only allow immutable "atoms" so that & would always either mean mutable container or place. (And & be a no-op in rebind-name (&mydict)
, if mydict actually is of type Dict.)
1) I'm not sure yet if I completely understood you, but if we consider the topic of assignments, e.g.
setout &foo_value = # to mutate container or place ('&...' required), setvalue could be an alias of it for in-func/proc and top-level usage
setvar foo = # to assign immutable (atom) to the variable
setvar foo =& # to assign a mutable container or place
setvar foo =* # to explicitly assign whatever type, atom, mutable container or place
Then maybe that "additional rebind/re-assignment?" of the dict in
rebind-name (&mydict)
could be
rebind-name (=&mydict)
?
setvar foo =& # to assign a mutable container or place
I notice setvar foo =& &myinteger
, here we would kind of get your double &, to get the same as what &myinteger would do implicitly when specified in a proc call?
I noticed and you wrote something like value.Place can hold Dict, but is it actually necessary to allow mutables in value.Place? If there is no need, couldn't value.Place only allow immutable "atoms" so that & would always either mean mutable container or place.
One way to explain the difference
var mydict = {}
rebind-name-to-a-different-object (&mydict)
$ = mydict
(Str) 'this can be a string now'
versus
var mydict
mutate-dict-in-place (mydict) # note no &
$ = mydict
(Dict) {'key': 'this MUST be a Dict, CANNOT be a string'}
So it doesn't really make sense to say that &x
is a Dict or whatever, it's a name that can refer to ANY type
x
is a Dict, but &x
is a Place -- NOT a place for a dict, it's a place for any value
And of course we actually use this
json read (&x) # x can be Dict if the message is {}, List if it's [], etc.
we do NOT know ahead of time what the type of x
is -- and &x
is a Place, full stop
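The point that the reader cannot know the type ahead of time can be sketched in Python. This is an analogy only: `json_read` is a hypothetical helper, and a plain dict plus a name stands in for the place.

```python
import json


def json_read(scope, name, text):
    # Like json read (&x): whatever the message decodes to gets bound to name
    scope[name] = json.loads(text)


scope = {}

json_read(scope, 'x', '{}')
first_type = type(scope['x']).__name__

json_read(scope, 'x', '[]')
second_type = type(scope['x']).__name__
```

The same name ends up bound to a dict for one message and a list for the next, which is why the place itself carries no type.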
I personally think it's best to hide the pointer concept as much as possible and therefore I don't like &
that much. It also doesn't feel very wrong for me that dicts/arrays are passed by reference.
But both are probably because of my python background. So I don't think my input is very valuable here 😅
Yeah I don't think value.Place is common in most code, but we kinda need it for read (&x)
and json read (&x)
Though note you can just omit it, and you use the default _reply
magic variable
I'm leaning pretty hard toward just borrowing Ruby and Lisp, and allowing
mutate-dict! (x)
call mutateDict!(x)
in the name. It's relatively well known, and easy to implement
It's just a convention, there's no enforcement, like in Ruby/Lisp
It doesn't introduce anything that's not in another language
First, thank you very much for your patience, I think I got it, now that I see the "json read (&x)" can not know the type it needs to set.
The mutate-dict! (x)
, json! read (x)
? or json read! (x)
? syntax, would it only work for one param?, all?, or just a return value?
By now I also understand all this isn't that new and surprising with a python background. (For me, with the current state of affairs, it was quite "shocking", though: before starting to figure this out, I was actually convinced I had hit very fundamental behavior inconsistency bugs in ysh, while almost everything had been working so nicely with osh before. :-) Personally, I think it took me considerably more than a week, so it would be quite a bummer if everyone migrating from shell has to go through this.
I think what called for this trouble were the hidden and surprising pointers/dereferencing/mutables, and I hope there can be some good defaults to let a consistent syntax grow up naturally, in a simple and understandable way.
So here is my updated attempt to bring together all the loose ends that I could get a hold of:
[EDIT: thoughts concerning finding the pointer assignment operator]
I can also see some similarity to the 1>&2 file descriptor assignments in shell "set 1 to where 2 is set".
The added >
here resembles a bit more of the pointer/referencing meaning.
So re-using that, 'setvar foo =& ...' could be compared to:
setvar foo >& bar
setvar foo =>& bar # a full arrow for visual understanding
# but that's cryptic and => alone is very similar to the other operators, so we're back to `=->`
setvar foo =-> bar
Hm, going through my compilation of affected things again, it seems !x
could actually very well work in place of &x
also.
I think only setout
may allow to effortlessly set an outer variable to be a new place (compared to ->setValue()
).
[outdated stuff removed]
So that would be:
[Corrected]
setout &passed_var =& &other_var
or
setout !passed_var =! !other_var
(if =! is not too similar to !=, and not too bad as assignment operator)
setout !passed_var =-> !other_var
(pretty solid in uniqueness, intuitiveness, and looks)
[Overall, !
seems to make a good impression as prefix, even in the most complicated case of creating a new outside place for some variable, best if combined with a distinct more "pointer-like" assignment operator.]
Ok, I produced an overview again, and put it up in the issue description.
If you waited, I think it has settled now (three variants for the default).
(Have a look at the new overview maintained in the description: https://github.com/oilshell/oil/issues/1793)
Note on Assignment Operators
The connotation of !
as dereference!, follow!, or mutating! fits very well when it's used as prefix for!some_var
.
However, the connotations don't fit that well when !
is used in a "pointer assignment" operator (seldomly used), and would only make it easy to confuse it with prefixes that may still follow on the right hand side in special cases.
And most importantly: Using =!
as "pointer assignment" operator would falsely associate it with places, which is wrong, because it is also used and needed to make new and existing variables point to mutable containers.
So, I settled (back again) for =->
as generic "pointer assignment".
The only places I currently see where variables would have to be required to be prefixed in order to implement a 100% consistent variable interaction behavior seem to be:
setout
/setvalue
assignments.
I think there may be no need to require prefixes on the right hand side of assignments, because there, the behavior would already be apparent by the =->
assignment operator. Actually, adding a !
-prefix on the right hand side can properly mean to create a place (or no-op if type is already a place).
Well, as a place can take all types, &
(or !
) could be the universal and only commonly used one.
I guess the :
(or whatever prefix for mutable[containers]) would only have to be used in order to explicitly disable complete variable re-assignments and to disable changing the type from within a proc/func.
[Requiring indicating prefixes] On the left hand side of setout/setvalue assignments.
That might be relaxable, if it's also supported to read name[indexes] from Place type, as if they were mutables, transparently (if they refer to mutables). (https://github.com/oilshell/oil/issues/1794)
Hm, since there is one universal place-prefix that will usually be used, and it's not nice if the seldomly needed "no-rebind" variant is completely different...
I now adapted the overview to use the "mutable container" identifier only in addition, i.e. behind the usual place identifier: &:out
or (if switching away from the current choice) !:out
The overview seems to gravitate towards pretty consistent, nice and just few unobtrusive syntax requirements!
Found a further nice simplification: using the !-...
prefix for the "without type-changes" case (i.e. mutables only), with -
being just a rare special case "modifier" to !
. (Issue overview is reworked and includes a table to check.)
What do you think about making the Place type non-nestable and transparent for "index[notation]", just as it already is the case with mutable containers? (https://github.com/oilshell/oil/issues/1794)
That seems to be quite a requirement for smooth and consistent Variable<->Type<->Behavior interaction (usability) that's based on a universal Place, serving as the default "fully-featured" reference type (allowing for re-bind/ type changes), even for mutable containers by-default.
It's really a mess with no solution in python, other than putting all the burden of the inconsistencies on the users (https://stackoverflow.com/questions/986006/how-do-i-pass-a-variable-by-reference), but ysh has already implemented the data-type part (universal Place type), so can fix this for good.
(https://github.com/oilshell/oil/wiki/Language-Design-Principles) if our syntax looks like JavaScript or Python, it should behave like JavaScript or Python, unless we're fixing a wart.
Isn't this a wart on the nose of languages, if they omit a small amount of syntax exactness which would maintain fully consistent behavior. (And thus obsolete a lot of head scratching and justification "theory" in learning and teaching.)
What do you think about making the Place type non-nestable, and transparent for "index[notation]" so that it works just as with mutable containers directly? (https://github.com/oilshell/oil/issues/1794)
var a =-> { key: 'value' }
setvar b =-> a # ok, done two clearly indicated pointer assignments
setvar b.key = 'changed' # all regular mutation continues to be done with simple syntax
echo $[a.key] # shows 'changed' (same value read through pointer variable 'a', as indicated)
func setStart( !dict ) { setout dict.start = 'set' } # no ambiguity (not even if re-assigning dict)
call setStart( !a ) # clearly mutating 'a'
var value = 1.01 # floats are "immutable atoms", but syntax stays 100% consistent:
proc plus1(; !num ) { setout num += 1 }
plus1 ( !value )
echo $value # shows 2.01, outer value referenced and changed as indicated
Hm =->
is pretty weird, no language I know of has that. It also can make the interpreter less efficient to have to test which operators are used for what.
I agree that shell users are going to be confused a bit by the new mutable containers, but
Basically you have to learn this new rule to get additional power ... BTW people sometimes call it aliasing -- two names that refer to the same value. Shell doesn't really have that idea.
i.e. There is one dict here, not two. a and b are names for the same value; in other words they are aliases.
var a = {k: 'val'}
var b = a
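The same aliasing rule exists in Python, so a Python sketch may help readers who know it from there. This is an analogy, not YSH code.

```python
a = {'k': 'val'}
b = a                 # b is a second name for the same dict, not a copy

b['other'] = 42       # mutate through b ...

is_same_object = a is b
seen_through_a = a['other']   # ... and the change is visible through a
```

There is one dict here with two names, so nothing was copied at `b = a`.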
I think that we will allow mutate-dict!
because Ruby and Lisp have it
I think that mutate-dict! (!d)
is possible, with !d
as a no-op, though it's slightly redundant. It seems like you want to put the !
in one place or another, not both places
Hi, thanks for checking this out,
=->
is pretty weird, no language I know of has that. It also can make the interpreter less efficient to have to test which operators are used for what.
Isn't there already a check now whether to create an immutable copy or just a pointer/alias?
Hm, but I'd say all those q&a pages about the mutable type stuff (with many actually confused answers) and separate gotcha pages actually show what is a weird language shortcoming of not having something like strict_typeinteract
, by default.
(Again, I think python et al. may not be able to fix this wart right away, but ysh can, thanks to the universal Place type.)
Couldn't it be a static parsing check, at least for distinguishing "immutable atom" vs. universal Place !...
in assignments definitions and calls?
I think the dynamic combination of not-rebindable/reassignable, i.e. combining !-...
and =*
behavior will be much less needed, only for some specific corner cases, if at all. (So, maybe for this case it's enough to print a warning instead of implementing dynamic checks only for this case.)
I think that mutate-dict! (!d) is possible, with !d as a no-op, though it's slightly redundant. It seems like you want to put the ! in one place or another, not both places
Hm, maybe like this: If the name ends with ...!
then all defined typed params (and their defaults?) are considered as defined with !...
(Place), and all args passed in calls are implicitly converted into type Place? (So it's not necessary to mention the !
on individual args in calls or defined params, but they would not do any harm.)
That might also become possible in a straight forward way, if places are not nestable, i.e !
being a no-op on the Place type itself, and if the container[index] syntax works transparently on Places as well, just as it works on referencing mutable type "alias" variables.
Oh, there is also a new case of assigning places:
var a = place # 'a' is independent new place, initially pointing to the same
var a =-> place # 'a' points to same place
Would it be ok to pick only one that makes sense, you think?, i.e. only allow the second one, to never create any possibility of double indirection?
So, a check may be necessary in any case.
Hm, there are more consequences of strict_typeinteract
.
Actually, I think only having the consistent difference in syntax is what would also allow:
var a = dict # 'a' is an independent new dict (copy of dict)
var a =-> dict # 'a' points to same dict
So, the first line's syntax could actually do an implicit copy (efficient dupe) if given a mutable container.
And in retrospect, I realize that from looking at the current proc/func signatures and code, one can not tell at all how they behave. This is contrary to the impression that the proc/func guide https://www.oilshell.org/release/0.19.0/doc/proc-func.html gave me when reading it beforehand.
Currently, it's not possible to reason about procs/funcs just by looking at their signature. One needs to know the type of each variable, to really know if they can have outside effects or not. And also the code within procs/funcs doesn't tell it all, it's just the same setvar
everywhere.
Hm, after distancing some days from https://github.com/oilshell/oil/issues/1793#issuecomment-1907380297, can you maybe recognize in parts a reasoning bias you may have encountered yourself at some time, when bringing up some shell pitfall and possible fixes? I mean after one got really used to something it is to some degree well understandable, to underestimate the problem, and tending to see problems in a solution, rather.
For example, a parser, here having to parse two different operators (one more) and the execution to check for the operator after having had to check for mutable type anyway, is that a real efficiency issue?
Or, ysh being a more powerful language: what does that have to do with a consistent syntax? Its power would not be reduced at all by a consistent and apparent syntax. The same power may become easier to reason about, though, and present and express itself much more naturally based on syntax differences, instead of being solely based on background knowledge. (And as a nice side-effect, the syntax can even solve the "mutable defaults" pitfall.)
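For reference, the "mutable defaults" pitfall mentioned here is the well-known Python behavior sketched below. The function name `append_item` is made up for illustration; whether and how YSH defaults behave this way is not claimed here.

```python
def append_item(item, bucket=[]):
    # The default list is created once, at definition time, and then shared
    # by every call that omits the bucket argument
    bucket.append(item)
    return bucket


first_call = list(append_item('a'))    # snapshot after the first call
second_call = list(append_item('b'))   # the same default list was reused
```

The second call sees the item left behind by the first, because both calls alias one default list.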
For experienced python users, what would be new? Creating pointer assignments with setvar and others is a rare thing, it does clean up the initial declarations, and a helpful error message is there to help, before "pointing to new mutables", i.e. var mydict =-> {}
becomes a natural thing. Assigning defaults for proc/func variables can work as expected (var=mydict), and if needed as (var=->mydict).
Actually, only a consistent syntax may ultimately allow for a more powerful language, e.g.:
var a =-> {k: 'val'} # new dict
var b =-> a # alias, pointing to same dict
var c = a # separate new dict (internally duplicated datastructure, no copy/deepcopy pitfalls)
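The "copy/deepcopy pitfalls" this proposal wants to avoid can be shown with Python's standard `copy` module. This sketch only illustrates the Python side; the nested list is an added assumption to make the shallow-copy case visible.

```python
import copy

a = {'k': 'val', 'nested': ['x']}

b = a                        # alias: same dict
c_shallow = copy.copy(a)     # new outer dict, but the nested list is shared
c_deep = copy.deepcopy(a)    # fully independent structure

b['k'] = 'changed'           # visible through a, not through either copy
a['nested'].append('y')      # visible through the shallow copy as well
```

Only the deep copy stays untouched by both changes, which is the behavior the proposed plain `=` assignment would give by default.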
I thought about it and brainstormed on Zulip, and I think we can do something similar to what you're saying, basically reuse ->
in 3 places to connote "mutation" or "aliasing"
We already have
call mylist->append(42) # mutation
And I decided against !
because it's a different symbol than ->
. It's weird to have 2 symbols for the same thing -- it looks a little noisy and perl-ish.
So I think then we could have
var a = {k: 'val'}
var b -> a # alias, pretty much what you wrote
setvar b.other = 42 # mutation visible through BOTH a and b
however you can also write it like this
var a -> {k: 'val'} # this is the "first" pointer, not an alias
var b -> a
->
is exactly like =
, but it checks if the RHS is a List or Dict -- a mutable container
And then I think instead of myFunc!(mutated)
or myFunc(!mutated)
, we can simply use the same symbol as an optional prefix operator
call myFunc(->mutated) # creating an "alias" by passing a pointer
clear-dict (->mutated)
Again ->
will check if the value is a List or Dict, so that ->[1,2,3]
is legal, but ->42
and ->"mystr"
are runtime errors
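The dynamic check described here can be modeled in a few lines of Python. This is a hypothetical model of the proposed behavior, not the Oils implementation; `alias_check` and `try_alias` are invented names.

```python
def alias_check(value):
    """Hypothetical model of the proposed -> check: containers only."""
    if not isinstance(value, (list, dict)):
        raise TypeError(
            '-> expects a List or Dict, got ' + type(value).__name__)
    return value   # same object back, so the result is an alias, not a copy


def try_alias(value):
    try:
        alias_check(value)
        return 'ok'
    except TypeError:
        return 'runtime error'


results = [try_alias([1, 2, 3]), try_alias(42), try_alias('mystr')]
```

As in the description, the list passes while the integer and the string are rejected at run time rather than at parse time.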
This is not foolproof or a static check, but I think it's a nice way of making code that cares about mutation and aliasing look different
But I'll also note that I expect this to be fairly rare in YSH code, except for library code and frameworks that use metaprogramming
Most YSH code will be simpler transformations on JSON and so forth. Copying files around, and that sort of thing.
When you use JSON, you're creating a copy, so there is no mutation or aliasing.
This is a very "advanced" feature that won't appear in most code
There are a bunch of priorities before this, but I think using ->
consistently everywhere makes sense, and is pretty similar to what you proposed.
we can do something similar to what you're saying, basically reuse
->
in 3 places to connote "mutation" or "aliasing"
Oh, yes reusing ->
is an even better idea!
I noticed that the overview table that I announced here actually was not in the description (anymore), it seems it got deleted when I later updated something in the description from another stale browser window, I'm sorry.
I've now re-added updated tables, and also re-worked all short descriptions that follow the overview with your idea to re-use ->
. That gives a great overall impression.
This is not foolproof or a static check, but I think it's a nice way of making code that cares about mutation and aliasing look different
But I'll also note that I expect this to be fairly rare in YSH code, except for library code and frameworks that use metaprogramming
Hm, I assume most day-to-day assignments are made with atoms (value-copy-behavior), and the amount of aliasing/pointing assignments is quite low (i.e. mostly only the initial declarations, since thereafter mutables get passed along and, well, mutated in place).
And the thing that is really only needed very rarely, or as you say:
fairly rare[ly] in YSH code, except for library code and frameworks that use metaprogramming
is rather assignments that need to take (allow) all types, i.e. the current default =
assignments.
With a generally quite moderate use of aliasing/pointer assignments and a large majority of value-copy-behaviour, I had been thinking to require atoms for =
, and mutable or Place, for ->
, and have =*
to allow all types for use in library code or other rare cases.
So the burden of requiring atoms for =
assignments and thus having a "foolproof or a static" consistent solution may actually not be that heavy at all.
So,
var a = {}
and not require var a -> {}
. But how would you warrant this? Not requiring it would not ensure clearly consistent code, I think.
I understand that it should always be possible to add a ->
prefix operator to a function call (once a Place can be used transparently like the type it refers to), however:
myFunc(->mutated)
even if the definition explicitly required it, e.g. as in func myFunc(->target)
? Not requiring it would not ensure clearly understandable code, I think.
Sorry for my late input. It was a bit too much too-fast iteration and I didn't really have much of an opinion anyway.
But I really like the reuse of ->
!
I honestly wouldn't even mind that much if it was required (and otherwise a copy would happen) but I think the way it works right now even though "inconsistent" is much closer to how it's gonna be used most of the time anyway so I think it's OK to be optional.
Thanks for the feedback @Melkor333 ! That's useful
@bar-g
->
in signatures is an interesting idea, probably a good one. So we can reuse ->
in 4 places and not 3
The reason to allow var x = []
in addition to var x -> []
is simply that MOST usages of lists should not involve any aliasing. You're just creating an argv array and using it in one place, etc. So I think most usages of lists can "pretend" that they are values. If you don't have an alias, then a List behaves just like an Int or Str. Most users will think of it that way ... again the aliasing is an "advanced" feature
Most shell code is pretty straightforward -- like imagine 50,000 lines of shell code to build a distro -- how to download, build, and test. You basically would never use aliases for List or Dict there.
The other thing I want to say is that ->
would still be a dynamic check, and dynamic checks like const
/ readonly
are fundamentally limited
There is also the notion of static type annotations like this
func f(x List[Int]) {
return (x)
}
We can parse that but we don't do anything with it yet, and may not ever. But it would interact with
func f(->x) { ... }
and
func f(->x List[Int]) { ... }
I guess one guideline is
we support writing ->
in four places
But you should actually avoid using anything that could benefit from ->
, because it's a pretty "advanced" usage of shell!
The presence of ->
should make you think twice!
Just like in Python, most Dict and List usages are like "values", and you don't worry about aliasing.
But aliasing is extremely useful sometimes, like walking a tree and accumulating values in the tree. That's even popular in Lisp
Hi, sorry, the iteration is due to me being new to python things, so I had to re-work the idea (and, thanks to things I learned here, also could), while keeping an eye on things that are problematic in general and new for shell users.
we support writing -> in four places But you should actually avoid using anything that could benefit from ->, because it's a pretty "advanced" usage of shell!
The presence of -> should make you think twice!
Sure, but wouldn't there only be a -> present in an assignment for sure, and wouldn't one only notice that one is creating an alias and not a copy, if -> is required for (pointer/alias) assignments of mutables?
The reason to allow var x = [] in addition to var x -> [] is simply that MOST usages of lists should not involve any aliasing. You're just creating an argv array and using it in one place, etc.
So I think most usages of lists can "pretend" that they are values. If you don't have an alias, then a List behaves just like an Int or Str. Most users will think of it that way ...
Really, no, it should not be hidden that every List or Dict variable only aliases/points-to the List or Dict, even the one used for the definition. It's not good to hide that at all, as List and Dict variables behave very differently! That was the original reason that led to filing several issues here.
The main thing: They are mutated outside of procs/funcs, and that is currently absolutely not obvious, because of a language syntax that "pretends" to do the same, but really does something differently in the background.
When var x = []
fails and hints to use var x -> []
instead, it's immediately clear that x is not the List itself, but is pointing to/aliasing the List. By seeing this, one may already perfectly deduce and understand that procs/funcs may mutate a passed list or dict. That is why I'd say the -> should be required for assignments: it'll make definitions clear, and prevent unintentional aliasing when one is wrongly expecting value-copy-behavior.
However, in the proc/func definitions like func myFunc(->participants)
, the -> serves to
->
to be present in proc/func calls
So, in the definitions, the presence of -> may only be desired if the proc/func is actually mutating anything. Could there be a "static" check determining if there is a setout participants
call within the proc/func?
setout
keyword for all outside mutables, to make it obvious to mutate external values, even when only assigning atoms to list or dict members. Because there is no need to use the thin arrow in these kind of assignments: setout participants[key] = 'value'
.
However, at the proc/func call sites, I think the -> should always be required if present in the definition (i.e. required by it). May this be a "static" check?
A part of the idea that seems to have gotten lost a bit by the mangling edit:
Using ->
in proc/func definitions/calls would always mean using a Place (adding one if needed) for consistent behavior across all types. And passing a plain, non-rebindable mutable would become a rarely needed special case (->:
).
I think at this point it's mostly differing opinions
The status quo is inconsistent but convenient and the question is if we weigh consistency more than convenience. A compromise is making it optionally consistent - which I honestly don't like that much, it makes code from 2 people look different.
IMO both options make sense and ->
seems consistent enough (and only 1 char more) that I'm personally fine with enforcing it. But it's still a bit unusual - I don't think ->
is used anywhere else for assignment? I'm also not really aware of all the consequences of such a breaking change...
[optionally consistent] makes code from 2 people look different.
Hm, yes. So maybe rather a plain strict_type_interaction
? option, allowing to experiment with it and making it a default option in ysh if it works out as expected.
I could imagine that it's actually more convenient to have it enabled, because then the language is unobtrusively showing the difference, exactly in the relatively few situations when things actually behave differently than usual (i.e. aliasing/pointing way instead of the usual default copy-value-behavior).
To assess it better, what would be significant impact examples of really losing convenience?
Some initial List/Dict definitions? Proc/func definitions or calls?
I've augmented and restructured the first table in the description (now 5 places of recognizable mutating behavior) and noticed a slight inconsistency:
"Within procs/funcs" the setout place = 'value'
does not show the common ->
, it's implicit on the left hand side and I think that makes it simpler to use and think about.
But what would you think about the idea to already use the devised proc/func naming convention for shortening the setout
keyword itself?
For example:
set-> place.key = 'value' # (mutation)
set-> place = 'value' # (rebind to string)
set-> place -> [ 'one', 'two' ] # (rebind to list)
the way it works right now even though "inconsistent" is much closer to how it's gonna be used most of the time anyway so I think it's OK to be optional.
If by that you mean that external mutables are still set from within procs/funcs with an inconsistent "covert" setvar mutable[key] = "value"
, I don't think that would need to be true.
Because that would then be an error, hinting to use the shorter, external: set-> mutable[key] = "value"
.
[EDIT:]
Quite rare, local places: setplace place[key] = "value"
.
While locally declared mutable vars continue to work with, as always:
setvar mutable[key] = "value" # mutate
setvar mutable = 'string' # re-bind
The details will have to be something we work out when we do it
I think we should use ->
more, but it will have to wait awhile, since there are lots of other things in YSH to do like the flag parsing, unit testing, module system, fixing C++ bugs, etc.
I will open a new bug with a rough idea -- thanks for brainstorming
Closing in favor of #1831
Successor issue (but lacking the overview/tables):
Discussion: https://oilshell.zulipchat.com/#narrow/stream/433024-low-priority/topic/value.2EPlace.20feedback.20.2F.20integrate.20rebinding.20and.20mutability
Basic Problem:
setvar b = a
assignment syntax does different things, depending on the type of the variable.
Discussed:
New option
strict_type_interaction
Table of Behavior (Overview)
->
is mutating call (chain)
call var->myFunc('update')
->
-suffix in their name to denote mutating
if (error) { reset-app-> }
proc apply-update-to (; x, ->y )
set->
keyword is allowed to mutate an outer ->passed_var (Place or mutable), and returns error if var was not required as
->*
or
?*
in proc/func definition
set-> passed_var[key] = 'value'
(implicit
->
on the passed_var) (setplace
for local places)
... | read --lines (->x) ; apply-update-to ( x, ->y )
List of Variable-Passing Behavior-Prefixes (implicit type-restrictions)
var
(no prefix):var
->var
->
)->:var
?var
List of Apparent Assignment Behaviors
Short Documentation of Affected Syntax
At the proc/func definition-site:
At the proc/func call-site:
Specifying defaults in the proc/func definition:
Assignment keywords
discouraged/not needed:
Explanation of the idea in coherent words
A shopt
strict_type_interaction
enables slight syntax additions with type requirements to reflect different behavior.
Basics
->
already means mutation (call var->myFunc()
), but what to do for command language or function calls in expression syntax that don't pass a variable? Allow, warn, or require procs/funcs to be named with a->
-suffix? (call reset-app->()
), to have an indication that it will mutate a variable when called?Variant 1: "Value-copy-behavior, by-default."
->
) for mutables and value.Place, and indicating behavior that varies depending on variable type (=*
). While the classical=
only accepts "atom" types, thus indicating value-copy-behavior.->var
) whose referenced type itself can also be changed by the proc/func, or (rare) for choosing an explicitly fixed-type mutable[container] whose type can't be changed (->:var
). (The:
is more like assignment, and already used for the mutable type List literal:|...|
, what could allow for a shorthand syntax sugar->:|one two|
.) And also require to indicate inconsistent behavior that varies depending on the variable type (?var
).set-> local_varname =
), and also a "local" alias for it to use on the top-level, or if accessing only a temporarily used locally created place (setplace local_var =
).
Variant 2: "Pointer-assignment-behavior, by default."
This could possibly use a
=:
operator to require value-copy assignments, and a prefix of:...
in proc/func definitions and calls, besides having the inconsistent operator=*
and variable prefix?...
.
Variant 3: "Only optional, specific syntax and behavior."
This would keep the inconsistent behavior of the
=
assignment operator, but add=:
for value-copy and->
for pointer-assignment.
Originally posted by @bar-g in https://github.com/oilshell/oil/issues/1791#issuecomment-1893356505
Minimal example showing that the original variable is being changed:
For the better or worse, mutating the func's local "input" variable within the func actually also mutates the original global dict! (As seen after the function has run.)
This behavior, and the "same-as-for-var-mutation" syntax, actually seems exactly as wished for for "out" variables in https://github.com/oilshell/oil/issues/1789 , but I really expected that general passing of variables to procs or funcs works through call-by-value (i.e. on a copy) not call-by-reference, i.e. to not mutate the original variables that were passed, at least by default.