modularml / mojo

The Mojo Programming Language
https://docs.modular.com/mojo/manual/

Replace `self&` syntax with `inout self` #7

Closed elliotwaite closed 1 year ago

elliotwaite commented 1 year ago

What’s the motivation behind the choice of the self& syntax? Coming from Rust and C++, the &self syntax seems more natural to me. And having the & be a prefix seems like it would align more with how borrowed and owned are placed before the argument name.

lattner commented 1 year ago

The rationale is to get it away from the prefix * and ** sigils used for variadics. I agree with you though that it takes some getting used to. An alternate approach would be to dump the & sigil and switch to a keyword like inout. That would be more expressive and align better with borrowed and owned.

elliotwaite commented 1 year ago

Ah, I see. Hmm, I feel like it could still work on the left even with variadic arguments. When I compare the options below, I personally prefer the first option.

fn my_func(&*args, &**kwargs):
    pass

fn my_func(*&args, **&kwargs):
    pass

fn my_func(*args&, **kwargs&):
    pass

fn my_func(inout *args, inout **kwargs):
    pass

But you've spent much more time with the new syntax, so if you think the postfix style is the way to go, I'll trust your intuition. However, if you are open to getting more feedback, it might be a good idea to run a community poll to see if there is a strong preference for one of the alternative options.

Andriamanitra commented 1 year ago

Another benefit of using a keyword instead of a symbol is that they are much easier to look up when you don't know what it means. If you search for "ampersand in Python" you get results about the bitwise operator using the same symbol, which can be a little bit confusing especially for new programmers.

I would prefer ref (as in reference semantics, a concept every Python programmer should already be familiar with) over inout though. ref is also already used in other languages, notably D and C#.

lsh commented 1 year ago

For what it's worth, I think the postfix is nice, especially if it coincides with some sort of postfix operator. Similar to how Rust introduced postfix .await and zig had postfix .*.

aaron-foreflight commented 1 year ago

Another benefit of using a keyword instead of a symbol is that they are much easier to look up when you don't know what it means.

I would prefer ref (as in reference semantics, a concept every Python programmer should already be familiar with) over inout though. ref is also already used in other languages, notably D and C#.

inout is a Swift-ism (among other languages probably). I like it, but I have been a Swift developer for quite a while now.

dom96 commented 1 year ago

An alternate approach would be to dump the & sigil and switch to a keyword like inout. That would be more expressive and align better with borrowed and owned.

Agreed and I would argue this would also be more pythonic, seeing as Python prefers keywords for common operators in many cases (primary example being boolean operators). The & sigil is so heavily overloaded across different languages that reusing it just adds to confusion IMO.

lattner commented 1 year ago

Right, our equivalent of std::move is the consume operator, which is postfix^ so it composes correctly on expressions. The need there doesn't apply to the & in this case, since it isn't in an expression context. Using a keyword seems better to me, and I agree & is massively overloaded. Any concerns with going with inout?

elliotwaite commented 1 year ago

I like the conciseness of &, but I agree that inout would be more aligned with borrowed and owned. Also, inout might be more intuitive (it says what it does, hinting more directly at the mutability aspect of the reference, whereas & makes me think of a reference but not necessarily a mutable reference). I’d be okay with inout, but that’s just me.

nmsmith commented 1 year ago

I'm worried that learners will find the keyword inout just as opaque as the & character. IMO, we should try and find a keyword that captures the essence of what it means for a function to have a "mutable reference argument".

So what is a mutable reference argument for? It's for mutating a piece of the caller's state: the piece exposed by the reference. A good keyword would reflect this. Unfortunately mut already means something different in Rust, so it's probably not the best choice. Consulting a thesaurus, possible synonyms for mutate include:

IMO, the last two are promising candidates for keywords. "modify" can be made a keyword if it's abbreviated with mod, as in "I want to mod my car" or "I want to mod this video game". This conflicts with Rust's use of mod, but not in a problematic way — functions typically don't accept modules as arguments.

Personally, I really like the edit keyword, because it doesn't need to be abbreviated, and the word has a clear and unambiguous meaning that conveys the purpose of mutable reference arguments. Here are some code samples using this keyword:

fn swap(edit x: Int, edit y: Int):
    let tmp = x
    x = y
    y = tmp

struct Foo:
    def __init__(edit self, x: Int, y: Int): ...
    def __copyinit__(edit self, existing: Self): ...
    def __moveinit__(edit self, owned existing: Self): ...
    def __del__(owned self): ...

This keyword has good verbalizability, which is known to be helpful for language learners. For example, the function signature:

can be verbalized as:

Also, to align a bit better with edit, it might be worth considering using the keyword own for ownership, such that both keywords are present-tense verbs. Then we could verbalize the following function signature:

as follows:

Finally, to align with this proposed naming scheme, perhaps the implicit keyword for immutable references could be view, as in "I can view this, but not edit it". (Having this available as an optional keyword could be really useful as an on-ramp for learning, even if it is omitted in production code.)

This naming scheme might be helpful/intuitive when explaining Mojo's ownership system, in documentation and error messages:

For references stored in structs as opposed to functions, we can instead say:

In other words, view also works well as a noun, for stored references.

Those are my thoughts 🙂.

lattner commented 1 year ago

Thanks. I think the term "inout" is actually the technically correct one. Many values are passed by-pointer-in-memory, e.g.:

fn takeInt(inout x: Int): ...
fn callIt():
  var x = 42
  takeInt(x)

But this is only one kind of lvalue. Mojo supports getitem/getattr, and when you define and use a collection with them, e.g.:

fn callIt(arr: SomeArrayOfInts):
  takeInt(arr[42])

Mojo has to do ~this:

fn callIt(arr: SomeArrayOfInts):
  var tmp = arr.getitem(42)
  takeInt(tmp)
  arr.setitem(42, tmp)

So yes, in and out is what has to happen.
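The get, call, write-back sequence described above can be modeled in plain Python, which has no inout but does let us observe `__getitem__` and `__setitem__` calls. This is only a sketch of the idea, not Mojo's implementation: `LoggingArray`, `take_int`, and `call_it` are hypothetical names, and the callee returns its result because Python cannot assign to the caller's variable.

```python
class LoggingArray:
    """List wrapper that records every element read and write."""

    def __init__(self, values):
        self._values = list(values)
        self.log = []

    def __getitem__(self, index):
        self.log.append(("get", index))
        return self._values[index]

    def __setitem__(self, index, value):
        self.log.append(("set", index))
        self._values[index] = value


def take_int(x):
    # Hypothetical stand-in for `fn takeInt(inout x: Int)`.
    # Python cannot write to the caller's variable, so the
    # updated value is returned instead.
    return x + 1


def call_it(arr, index):
    tmp = arr[index]     # copy in: one __getitem__ call
    tmp = take_int(tmp)  # the callee only ever sees the temporary
    arr[index] = tmp     # copy out: one __setitem__ call


arr = LoggingArray([10, 20, 30])
call_it(arr, 1)
print(arr.log)  # [('get', 1), ('set', 1)]
```

The log shows exactly one read followed by one write, which is the "in" and the "out" the keyword name refers to.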

nmsmith commented 1 year ago

Isn't the term edit consistent with this behaviour? The end result of executing the above program is that the value of arr[42] is edited. Most users won't need to be concerned with the fact that the editing is performed via calls to getitem and setitem.

IMO we want a keyword that conveys meaning to people who are trying to learn the language. inout doesn't really convey any meaning, other than the vague idea that "something is going in, and something is coming out", which isn't particularly illuminating. Any tutorial that explains inout is going to rely heavily on the word "modify" or "edit" to explain what inout is for.

I've taught a lot of university students Python as a first language, and from what I've seen, the biggest challenge for learners has always been developing the right mental model about how a program works, and selecting appropriate constructs for the task at hand. Understanding the purpose of each language construct is a prerequisite for that. Keywords that illuminate their purpose are typically the most easily understood, e.g. most people tend to understand if and else quite easily because their semantics in Python matches the meaning of the words in English. We could use the keywords branch and alt instead, but they introduce an extra "decoding step" — students have to mentally translate such keywords back to their meaning: "Oh, if I see 'branch' that means that IF the condition is true, the code will be executed."

I'm worried the same mental translation will be required for inout:

I know inout makes sense to a veteran of Swift who is already accustomed to the keyword, but I hope that's not the reason for favouring it. I would hesitate to prioritize veteran programmers (who will easily understand that edit works the same as inout) over novice programmers (who will have no idea what inout means).

I'm not fussed on the particular keyword that is chosen... I just think it should help users figure out how to think about the construct. I don't think inout achieves that. A different keyword might achieve it, and edit is just my best idea so far.

Perhaps I'm mistaken — perhaps edit would mislead users as to the purpose of the construct? Should users be internalizing a different mental model for & arguments?

(Sorry, I didn't mean for this to be a big spiel! 😇)

nmsmith commented 1 year ago

And I agree Chris that your array example is actually a great reason to avoid the term "reference" when describing Mojo arguments. As you've shown, "mutable reference arguments" aren't necessarily references, and "immutable reference arguments" aren't necessarily references either, owing to @register_passable. So I'm totally in favour of avoiding the term "reference" when discussing Mojo arguments.

The terms I proposed would be consistent with this perspective. They offer a mental model that has nothing to do with reference-passing 🙂.

lattner commented 1 year ago

I don't think there is a perfect term here, but I've never seen precedent for edit being used in a PL to mean this, and I can see lots of other theoretically confusable meanings to the word edit. The semantics here aren't just to make the value mutable: they are to make it mutable, have the changes visible on the caller's lvalue, and therefore require an lvalue on the caller side. edit could easily be interpreted as just "mutable in the callee" like owned provides, and doesn't imply there is a copy-out.
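The distinction drawn here, "mutable in the callee" versus "mutable with a copy-out", can be sketched in Python. This is a rough model under stated assumptions, not Mojo semantics: `call_owned`, `call_inout`, and `append_one` are hypothetical helpers, and a dictionary slot stands in for the caller's lvalue.

```python
import copy


def call_owned(func, value):
    # "owned"-like: the callee gets a private copy it may mutate;
    # nothing is written back to the caller.
    func(copy.deepcopy(value))


def call_inout(func, holder, key):
    # "inout"-like: the caller must name a storage location
    # (holder[key]); the callee's result is written back to it.
    tmp = copy.deepcopy(holder[key])
    holder[key] = func(tmp)


def append_one(items):
    items.append(1)
    return items


state = {"xs": [0]}

call_owned(append_one, state["xs"])
after_owned = list(state["xs"])   # unchanged: [0]

call_inout(append_one, state, "xs")
after_inout = list(state["xs"])   # written back: [0, 1]
```

Note that `call_inout` needs a place to store the result (`holder` and `key`), which mirrors the requirement for an lvalue on the caller side.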

inout directly explains the behavior of the construct. While there is some theoretical confusion, it is unambiguous, provides the right mutability implications, and provides both a "copy in and out" semantic and directionality implication that edit does not provide. It is also precedented in at least two widely used languages.

In any case, I think that inout is strictly better than &, so I'm just going to go with that. This is certainly not a one-way door: As the language evolves more, we should come back to see how the whole inout/borrowed/owned family sits together with the broader lexicon of the language as it develops.

Thank you all for pushing on this! I added this to the changelog and it will go out whenever the next update ships.

nmsmith commented 1 year ago

I've never understood appealing to precedent as a justification for not attempting to find a better design.

I hope — as you say — it will be possible to re-evaluate these keywords in the future. People tend to get attached to designs they are familiar with, so the longer that we stick to a particular keyword, the more resistance there will be to change.

I look forward to seeing what the future holds. Perhaps I'm just failing to understand the essence of inout arguments right now.

nmsmith commented 1 year ago

Just for future consideration (if we ever come back to this discussion), one way to understand the edit metaphor in terms of the copy-back behaviour would be to think of it like editing a document. When I open a document for editing, it loads a copy into memory. Consequently, edits that I make to the document are not automatically saved to disk. Instead, the changes are only saved if I hit "Save", or when the document is closed. This metaphor seems to match how an inout argument works: changes to an inout argument are only visible to the caller (i.e. "saved to disk") when the function call is closed.

Thus, I think the edit metaphor can work great, even given the copying behaviour of inout. I can imagine a future Mojo tutorial having an infographic that uses the document-editing metaphor to explain how inout works.
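The document-editing metaphor can be written down as a small Python context manager. This models the metaphor as described in the comment, not Mojo's actual behaviour; `editing` and `store` are hypothetical names.

```python
import copy
from contextlib import contextmanager


@contextmanager
def editing(store, key):
    draft = copy.deepcopy(store[key])  # "open": load a copy into memory
    yield draft                        # edits touch only the draft
    store[key] = draft                 # "close": save the draft back


store = {"doc": ["hello"]}

with editing(store, "doc") as draft:
    draft.append("world")
    seen_during = list(store["doc"])   # caller's copy not yet updated

seen_after = list(store["doc"])        # updated once the block closes
```

Inside the block the stored document still reads `["hello"]`; the change only becomes visible after the block exits, which is the "saved on close" step of the metaphor.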

Food for thought @lattner 😛.

lattner commented 1 year ago

I've never understood appealing to precedent as a justification for not attempting to find a better design.

I don't think that's a fair characterization. Precedent is one tiny piece of the rationale I shared above. In any case, you're right, we should continue to evaluate. Using any keyword is a good step forward here.

nmsmith commented 1 year ago

You're right, it was unfair of me to only respond to that one part of your message. Sorry.

bnyu commented 1 year ago

For what it's worth, I think the postfix is nice, especially if it coincides with some sort of postfix operator. Similar to how Rust introduced postfix .await and zig had postfix .*.

+1, the postfix is just fine compared with adding inout, move, consume, etc.

bnyu commented 1 year ago

Also, Python chose the * prefix to represent variadic args instead of using keywords (e.g. Kotlin's vararg), so maybe postfix& and postfix^ are more consistent with Python, even if that is not as modern as the Swift way. @lattner

ps6067966 commented 1 year ago

I like the ref keyword. Lists and arrays are passed by reference by default in Dart. Is it the same for Mojo?

gavlooth commented 1 year ago

Actually, I haven't seen inout before in my life and it's hard to wrap my head around its meaning. ref makes more sense, as does edit. At the very least they are much more common.

aaron-foreflight commented 1 year ago

Actually, I haven't seen inout before in my life and it's hard to wrap my head around its meaning. ref makes more sense, as does edit. At the very least they are much more common.

Ref does not make sense though because you are passing a value type and not a pointer reference. Edit does seem like an ok name though. However, the precedent for inout already exists and is widely used in Swift. I am pretty sure this inout is intended to work like it does in Swift.

You can read more about how Swift uses it here: https://docs.swift.org/swift-book/documentation/the-swift-programming-language/functions/

This is a pretty good discussion on how it is used in the Swift community if you want to see a real use case: https://github.com/pointfreeco/swift-composable-architecture/discussions/2065#discussioncomment-5851560

lattner commented 1 year ago

We'll need to repaint keywords when lifetimes come in, please pause bikeshedding on this for the next few weeks :)

nav9 commented 1 year ago

I reached this page while searching for "what is Mojo inout". A humble request to the Mojo team: Please always keep in mind that programmers work with multiple languages. As we grow older, it becomes more difficult to learn the nuances of a new syntax. It's like being expected to learn French for a project and then having to learn Chinese for another project and then Malayalam for the next project. I wrote about this in 2014, and I'd like to continue to emphasize that the complexities of a language can be hidden away and handled intelligently by the compiler/interpreter, so that the programmer can work with simple syntax and semantics. I'm actually looking forward to a time when we can just talk to the IDE and tell it to create functionality, instead of having to program it ourselves.

Lattner's point on bikeshedding is noted. I know there are bigger fish to fry, but this was to put my two cents across.

_ps: It might be great for SEO if Mojo was renamed to a more unique name. When I searched for "what is Mojo inout", I hoped to reach the Mojo documentation page. Even other searches for "Mojo" lead to all kinds of other websites named Mojo._

sweihub commented 1 year ago

Mojo is more like Python + Rust. If Mojo already overcomes the Rust borrow checker issue, why not keep it simple: just reserve self as a keyword and have it always passed by mutable reference?

Honour-d-dev commented 1 year ago

The inout keyword does take some getting used to, to be honest. Personally, I think we should stick to ref and explicitly include mutability with mut; I feel that if a variable is mutable it should explicitly say so. I think ref, own and mut should replace borrowed, owned and inout because they are more explicit and can be combined like so:

fn add(x: Int, mut y: Int) -> Int:

fn add(ref x: Int, ref mut y: Int) -> Int:

fn add(own x: Int, own mut y: Int) -> Int:

where passing by "ref" is the default if a data type does not implement __copyinit__ (I think __copy__ is less verbose and more straightforward, btw), and owned variables should not be mutable by default, as is the case currently.

Honour-d-dev commented 1 year ago

The inout keyword does take some getting used to, to be honest. Personally, I think we should stick to ref and explicitly include mutability with mut; I feel that if a variable is mutable it should explicitly say so. I think ref, own and mut should replace borrowed, owned and inout because they are more explicit and can be combined like so:

fn add(x: Int, mut y: Int) -> Int:

fn add(ref x: Int, ref mut y: Int) -> Int:

fn add(own x: Int, own mut y: Int) -> Int:

where passing by "ref" is the default if a data type does not implement __copyinit__ (I think __copy__ is less verbose and more straightforward, btw), and owned variables should not be mutable by default, as is the case currently.

Others have already explained here why to avoid the term ref or reference, but I will use your example to explain it my way.

The problem with the name ref or reference is that every variable (except those that store raw values rather than addresses) is a reference to an address in memory where a structured value is stored. So even owning variables are ALSO references, and this makes things confusing. Because of this, Rust and now Mojo use the term borrowed to differentiate 'a borrowed reference to memory' from 'an owned reference to memory'.

For example, in something like this:

fn foo(own x: Int, ref y: Int) -> Int:

Both x and y are references to memory but only y is expressed as ref. Confusing, right?

Thus, borrowed enters here to make things clearer:

fn foo(owned x: Int, borrowed y: Int) -> Int:

Or we can think of a reference as a variable that does not own the value it's holding. Since everything in Python is a reference by default (which I didn't know at the time, as I've never actually used Python before, lol), the ref keyword can be used to say that this variable references a value it doesn't own, or a symbol can be used instead.
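For readers who, like the commenter, have not used Python: the "everything is a reference" behaviour mentioned here can be seen in a few lines of plain Python (not Mojo). The names `rebind` and `mutate` are made up for illustration.

```python
def rebind(n):
    n = n + 1          # rebinds the local name; the caller's int is untouched


def mutate(items):
    items.append(1)    # mutates the object the caller also refers to


count = 0
rebind(count)

values = []
mutate(values)

print(count, values)   # 0 [1]
```

Both arguments are passed as references to objects, but only in-place mutation is visible to the caller; assignment inside the function never is. That gap is what a convention such as inout is meant to cover for value types.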

nav9 commented 1 year ago

"in-out" happens to have a certain unpleasant meaning. Anyway, I'd like to re-iterate the need to create syntax that's simple and familiar. Not from a compiler-developer's point of view, but from the point of view of the end User. Because it's us who have to learn and re-learn so many programming languages. It gets a lot tougher as we get older.
Some examples of the need for simplicity and familiarity:

ps: Very happy to be able to run Mojo locally. Wasn't expecting it so soon. Great job, and thanks!

drunkwcodes commented 1 year ago

@nav9 What a sxxtty dict! It tends to make me laugh every time when I see inout!

Nonetheless, inout is as concise as Chris Lattner says, and swifty as well. We should learn it if we are not familiar with that.

But still, ref is good to me, combining with const. So we can use ref and const ref, even & suffix such as self& like C++'s good ol' days. Though they are no help with lifetime design, maybe we can work it out.

Maybe we can have a poll for this. But I still respect Modular's team's decision most. Before Modular teams reply, we should stop bike shedding. The options(opinions) are more than enough.

david-ragazzi commented 1 year ago

But still, ref is good to me, combining with const. So we can use ref and const ref, even & suffix such as self& like C++'s good ol' days.

Using const inverts Mojo's design choice for mutability by making variables mutable by default. As every argument in a function would be a mutable reference, the user would have to put const ref in every declaration or function signature, or assume that every variable/argument is mutable. This breaks the practice of using immutability as the default to improve the safety of the code, or it could pollute the code with const ref everywhere.

david-ragazzi commented 1 year ago

@david-ragazzi Oh no no no no...

Your suggestion is against swift and C++ camp. I have no idea how you will become a mojician. The point is an alternative of inout is not swifty, absolutely.

The inclusion of your advice will make mojo a broader spectrum than LGBT, IMO.

First of all, you didn't understand what I meant. In Rust every variable is immutable by default, which is different from C++. And this "I have no idea how you will become a mojician" shows your ignorance and that you don't respect your colleagues.

And it is not my advice; it's what I'm inferring from the Mojo manual. Whether Mojo will include a mut keyword or not is not clear. But from the manual, these 3 keywords, as they are, seem to rule out an extra mutability keyword.

drunkwcodes commented 1 year ago

@david-ragazzi First of all, what I major in maybe has a latex dependency. It's hard to explain to you in regard of lacking humor... It's you don't respect others' experiences, even yourself's.

Second, the immutability is discussed in https://github.com/modularml/mojo/issues/451. The most recent reply is "stop rustifying mojo". But I still welcome you to join that thread.

I also don't believe the immutable assumption about variables by the rustaceans like you. You should make statistics to prove that.

In the meantime, the lifetime is under design phase. The goal is better than rust. It is clear.

Last but not least, inout for the win, overall.

david-ragazzi commented 1 year ago

@david-ragazzi First of all, what I major in maybe has a latex dependency. It's hard to explain to you in regard of lacking humor... It's you don't respect others' experiences, even yourself's.

Second, the immutability is discussed in #451. The most recent reply is "stop rustifying mojo". But I still welcome you to join that thread.

I also don't believe the immutable assumption about variables by the rustaceans like you. You should make statistics to prove that.

Last but not least, inout for the win, overall.

No. You are being disrespectful not only to me but to others here as well. You laughed at @nav9's opinion. You think that only your experience and language choices are the best ones, mocking others' suggestions. If you like C++, stick with it, just don't belittle your colleagues for trying to bring their ideas. And if you disagree with someone, criticize the ideas but don't use an ad hominem argument.

drunkwcodes commented 1 year ago

@david-ragazzi I give you plenty of references but you didn't take that. Only arguing with me with wrong points... And I don't disrespect @nav9. I agreed with him partially.

Your responsibility is more than a compiler. We should continue at discord, IMHO. https://discord.gg/YE2RV5hk

Here is discord mojo-chat channel. Please join in or DM me, thanks.

david-ragazzi commented 1 year ago

It seems that other people also think the same as me (about your disrespectful behavior) in the topic you posted (#451).

This said: No, thanks. I'm done with you.

nav9 commented 1 year ago

Peace, peace, good fellows. @david: Like drunkwcodes said, it was genuinely being said in humour. Please don't take it personally.

blagasz commented 6 months ago

( Btw (I see this is a bit old and also not extremely forward pointing, but) why not use the same keyword logic instead of the postfix operator ^? For me it would be significantly more pythonic to have something like

take_text(move message)

Is that a terrible idea? Or is it discussed already somewhere I've missed? )

Found it :)

https://github.com/modularml/mojo/issues/1279

u3Izx9ql7vW4 commented 2 months ago

inout directly explains the behavior of the construct. While there is some theoretical confusion, it is unambiguous, provides the right mutability implications, and provides both a "copy in and out" semantic and directionality implication that edit does not provide. It is also precedented in at least two widely used languages.

@lattner I strongly disagree with your sentiment above. Other than to developers with Swift backgrounds, the inout parameter is completely obtuse, i.e. since when does "in" automatically imply "copy in", as opposed to "reference in" or "pointer in"?

One of the most common words used in describing inout in every discussion is "mutability" and its derivatives, which characterizes the primary behavior of inout variables. The section explaining inout is literally titled Mutable Reference. The functions in the code examples illustrating inout's purpose are literally called mutate . So why not just use mut or mutref or mut ref or whatever.

The inout parameter would be understandable if those other names were already in use in Mojo's lexicon, but since they're not, inout is so unnecessary.

Considering that you're behind Swift, and felt compelled to bring the inout parameter over to Mojo, I suspect my words won't mean much. I felt you should hear my rant if you're gonna make me write inout and cringe at every sight of it.

There's another thread on this I found: https://github.com/modularml/mojo/discussions/1463