hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler

[SUGGESTION] Unifying various "const"s #255

Closed AbhinavK00 closed 1 year ago

AbhinavK00 commented 1 year ago

The idea is to reduce concept count by unifying constinit and consteval, and possibly change our current view of const.

Firstly, I suggest that bindings should be constant by default. Herb has an article about it in Design notes and in that article, his answer was "Mostly yes". If mostly, why not all? He mentions three places in the article where const by default is seen; I think making users write one extra keyword in one more place won't impact the code much. There's a reddit discussion about this which I suggest looking into (https://www.reddit.com/r/cppfront/comments/xwk42s/herbs_current_view_on_const_by_default/).

Next up, if we get const by default bindings (not variables), what kind of syntax would it need?

foo : mut _ = 2; // corresponds to current syntax
foo : var _ = 2; // again corresponds to current syntax but with a different keyword
var foo := 2;    // corresponds to how we decorate functions with constexpr or consteval in cpp,
                 // unless Herb has plans to change it in the future

I suggest the last alternative, as a binding being mutable or immutable should NOT be part of the type: int is a type but const int is not, and neither should mutable int or variable int be.

Next up, as constinit and consteval do not have overlapping use, both can be unified under one construct const so that use of const with a variable produces constinit in output and the same with a function produces consteval. If this is implemented, the cpp2 code will look something like this:

CPP2 CODE                                                            GENERATED CPP CODE
a := 2;                                                              auto const a {2};
var a := 2;                                                          auto a {2};
const a := 2;                                                        auto constexpr a {2};
const var a := 2;                                                    auto constinit a {2}; //does not compile

const a : () -> int                                                  [[nodiscard]] auto consteval a () -> int;
                                                                     //not sure where consteval would go
const { }                                                            consteval { }
if const { }                                                         if consteval { }

Notice how const var spells constinit and we don't have to teach that constinit is just constexpr but mutable. We only have to teach that const means things done at compile time, what things? -- for bindings, their INITIALIZATION is done at compile time. -- for functions, their EVALUATION is done at compile time.

Issues with this proposal:

I have no idea how hard implementing something like this would be; having to know which cpp2 const corresponds to which cpp keyword (constexpr, constinit or consteval) could be hard to implement in a parser. You must have noticed that I have said nothing about constexpr with functions and, to be fair, I have no idea. One way could be to just keep the constexpr keyword but not allow it for bindings (since there will be another way to declare constexpr bindings).

Maybe this idea is just too raw to be implemented, or it could be refined and discussed more before implementing, or maybe it's just not good enough, idk. Express your opinions please!

filipsajdak commented 1 year ago

Hi,

Thank you for your suggestion. My biggest concern is that it breaks the left-to-right approach, making it context-dependent and against the goals for the cppfront experiment. (check: https://youtu.be/ELeZAKCN4tY?t=1070, and https://github.com/hsutter/cppfront/wiki/Design-note%3A-Postfix-operators).

Also, Herb will appreciate sticking to the Suggestion template he created:

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code? If yes, please be specific about the classes of bugs that would go away, with an example or two (especially a link to a real CVE or two).

Will your feature suggestion automate or eliminate X% of current C++ guidance literature? If yes, please be specific about what current good guidance this helps make the default, and/or what guidelines we would no longer need to teach/learn or that would be simplified and how, with an example or two (especially a link to a real "Effective C++" or "C++ Core Guidelines" guideline or two). For ideas, you can refer to my CppCon 2020 talk starting at 10:31 where I summarize a categorized breakdown based on over 600 C++ guidance literature rules I cataloged and analyzed.

Describe alternatives you've considered. There's nearly always more than one way to improve something. What other options did you consider? Why is the one you're suggesting better than those?

AbhinavK00 commented 1 year ago

Sorry, I made it look like I'm asking for syntax change but the motive was to make it so that one keyword does things to replace both constinit and consteval, to reduce concept count. Herb will support those sooner or later and they'd have to be written out somewhere in the code, so why not as one.
Also const by default, the reddit thread has some nice arguments why it should be like that.

jcanizales commented 1 year ago

I feel like const a := 2; meaning constexpr and const var a := 2; meaning constinit would be more clearly expressed as:

a := const {2};  // The thing inside the brackets is evaluated at compile time.
                 // The binding is immutable as usual.
var b := const {2};  // The thing inside the brackets is evaluated at compile time.
                     // The binding is mutable as indicated by var.

Tangentially it would restore left-to-rightness.

AbhinavK00 commented 1 year ago

Yes, this looks natural while also preserving left-to-rightness (idk why I didn't come up with it in the first place). Corresponding to this, functions could be written as: func : () -> int = const { }

filipsajdak commented 1 year ago

Left-to-right in the current syntax will look more like:

a := const {2};
b : var _ = const {2};

AbhinavK00 commented 1 year ago

That'll work too, my suggestion was about const, I made a mistake bringing syntax into the mix.

CPP2 CODE                                                        GENERATED CPP CODE
a := 2;                                                          auto const a {2};
a : var _ = 2;                                                   auto a {2};
a := const{2};                                                   auto constexpr a {2};
a : var _ = const{2};                                            auto constinit a {2}; //does not compile

a : () -> int = const {};                                        [[nodiscard]] auto consteval a () -> int {};
                                                                 //not sure where consteval would go
const { }                                                        consteval { }
if const { }                                                     if consteval { }

^this would be as good as the other one, though the one above looks better

OceanAirdrop commented 1 year ago

There's a reddit discussion about this which I suggest looking into (https://www.reddit.com/r/cppfront/comments/xwk42s/herbs_current_view_on_const_by_default/).

Looks like Herb has already seen that reddit discussion, as he replied to one of the comments 4 months ago.

filipsajdak commented 1 year ago

I was thinking about context-free parsing... my first impression was that

var x : int;

is wrong, until I thought about argument passing styles:

fun: (inout x : int) = {
// ...
}

inout already tells us that x is a variable that can mutate. What if we accepted the following:

inout x : int;

Of course, inout is a good name for function argument and maybe not so good for local variables, but it makes me think there is some rule behind this syntax.

In the last weeks, I was astonished by the context-free meaning of cppfront syntax. E.g.

fun( : std::vector = (1,2,3,4) ); // unnamed vector, aka temporary variable
gun( : (x)         = x + 2;    ); // unnamed function, aka lambda

That generates the following cpp1 code:

fun( std::vector{ 1, 2, 3, 4}           ); // unnamed vector, aka temporary variable
gun( [](auto const& x) { return x + 2; }); // unnamed function, aka lambda

Should we extend the argument passing into local variables? Then we could have one rule for variables and argument passing. I don't know yet. I like consistency and one rule instead of two.

filipsajdak commented 1 year ago

I think that argument passing is the only place where the left-to-right approach is stretched.

AbhinavK00 commented 1 year ago

How about :

fun : ( mut x : int ) -> double = { }  // just inout spelled differently
mut a : int     // consistent and conveys meaning better

mut was suggested in yet another reddit thread which suggests keywords for parameter passing that convey meaning in a better way and also a way to mark parameters at callsite (https://www.reddit.com/r/cppfront/comments/yt4cye/cppfront_callsite_parameter_specifiers/).

As for

fun( : std::vector = (1,2,3,4) );

this looks so ugly but maybe once Herb implements classes, we will have a better way to do the same.

I agree with your remark about context free grammar; cpp2 focuses on left-to-rightness, which is not the same as context free grammar. cpp2 syntax is VERY consistent, which is a good thing, but sometimes it conflicts with other things (like treating const as part of the type) and maybe those decisions could be revised.

and hey! @filipsajdak , your idea of unifying inout and (not-yet existent) var is the same as what I wanted to do with constinit and consteval.

Edit: The reddit thread has been edited and now proposes mod instead of mut, adding one more option to consider. mut is the best IMHO

filipsajdak commented 1 year ago

Regarding the ugliness of

fun( : std::vector = (1,2,3,4) );

That was my first reaction, but I liked it more when I realized that it was consistent with defining variables.

v    : std::vector = (1,2,3,4);   // named vector
fun( : std::vector = (1,2,3,4) ); // unnamed vector, aka temporary variable

f    : (x)         = x + 2;       // named function
gun( : (x)         = x + 2;    ); // unnamed function, aka lambda

Regarding mut, @hsutter has already taken it into account: https://github.com/hsutter/cppfront/pull/198#issuecomment-1374969617

For inout (or mut) we probably want to know that we're calling an inout- or possibly move-declared parameter

and he mentioned it in https://github.com/hsutter/cppfront/wiki/Design-note:-const-objects-by-default

mut is kind of cute as a word, and if it works consistently throughout the language and not just at function scope then I'm open to it

AbhinavK00 commented 1 year ago

fun( : std::vector = (1,2,3,4) );

For examples of this kind, I was thinking along the lines of some static member function that does the work of a constructor, but I don't wanna say anything before Herb is done with implementing classes. Just think of something like:

func ( std::vector::create(1,2,3,4), other_arg );

But again, don't wanna say anything prematurely. I think of : type = as some kind of binding (to a constant, variable, function, class etc) and here we are not doing any binding, we are creating an rvalue and passing it along.

Also, I really hope Herb goes forward with mut and possibly other keywords in that reddit thread (they seem more intuitive).

I would again write a summary of what is proposed but I'm not sure if I should go with

a : mut _ = 2;
// or
mut a := 2;

jcanizales commented 1 year ago

I don't see the parallelism between argument passing styles and local variables to be deeper than "both have a const and a mutable case". For arguments you also have out and forward. And while inout spells out the mental image that you're passing information into the function and getting information out of it, that doesn't make sense for local variables. On the other hand, mut for argument passing doesn't match out the way inout does, and one could fairly ask "isn't out mutable too?"

AbhinavK00 commented 1 year ago

The reddit thread suggests init (short for initialize) instead of out. I agree that it doesn't match with mut the way out matches with inout, but it conveys the meaning in a better way: you are passing an uninitialized variable that you want the function to initialize.

Herb mentions in https://github.com/hsutter/cppfront/wiki/Design-note:-const-objects-by-default

I'm not attracted to that mainly because they feel like they're adding concept count... I'm trying to avoid special one-off features that work in only one part of the language.

While one could argue that in, out and inout are also one-off, we're suggesting mut because it can be used in two places and definitely does a better job at conveying meaning.

jcanizales commented 1 year ago

I don't think "mutable" conveys the meaning of "the caller will pass a value to the callee, which will write a value back for the caller" better than inout does. But I grant that it's subjective, and it's not a hill I would even bother fighting on.

in, out, inout, and forward are definitely one-offs that work only in one part of the language. But they do a gigantic job of simplifying the current state of affairs. And they also remove concepts: at least const T&, T&&, and const vs non-const member functions are gone thanks to them. In contrast, changing variable declaration from a : T/a : const T to a : mut T/a : T or a : var T/a : val T is purely cosmetic (other than changing if the default is const or not).
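
As a rough illustration of the Cpp1 spellings those parameter-passing keywords stand in for, here is a hand-written sketch. It is not cppfront's actual output (which uses its own helper types), and all function names are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// `in`: read-only access, traditionally spelled const T&.
std::size_t length_of(const std::string& s) { return s.size(); }

// `inout`: the caller's object is read and modified, traditionally T&.
void append_bang(std::string& s) { s += '!'; }

// `move`: the callee takes over the value, traditionally T&&.
std::string take(std::string&& s) { return std::move(s); }

// `forward`: pass along with the caller's value category,
// traditionally a forwarding reference in a template.
template <typename T>
std::string relay(T&& s) { return std::string(std::forward<T>(s)); }

std::string demo_passing() {
    std::string text = "hi";
    append_bang(text);                          // text is now "hi!"
    std::string copy = relay(text);             // lvalue: copied, text unchanged
    std::string moved = take(std::move(text));  // rvalue: contents moved out
    return copy + moved;
}
```

Each Cpp1 spelling encodes an intent in reference syntax; the Cpp2 keywords name the intent directly, which is the concept reduction described above.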

Using mut in those two places (variable declarations and instead of inout parameters) reduces the number of keywords (well, not really, as it introduces one and removes one, but let's imagine it did). But it doesn't reduce the number of concepts. Because (to the depth that I can see so far) variable declaration and parameter-passing style aren't two materializations of the same underlying thing:

Maybe those differences can be overcome, with a new mental model and nomenclature that unifies variable declaration and parameter declaration. But without that, I think reusing a keyword in both for concepts that are slightly different isn't a win; it just risks confusion.

Now all of this is orthogonal to getting rid of constexpr, constinit, and consteval by just having a way to mark "this block/expression is evaluated at compile-time" (which we want anyway, for metaprogramming) and composing it with the already-existing "this variable is constant or mutable". That one is an undeniable win in simplicity.

AbhinavK00 commented 1 year ago

I can't help but agree with you. The case with mut is just not-so straightforward ig, parameter passing and variable declaration are different but also similar in a way that deciding if one keyword could serve the purpose of both becomes VERY hard.

BTW, C++23 is here!!

filipsajdak commented 1 year ago

I brought up inout to point out that the syntax we use for arguments is not, strictly speaking, left-to-right.

I thought about what @jcanizales wrote and decided to gather in one table where we use argument passing styles.

| Passing style | Func argument decl | Func return decl | Explicit passing | Named return list param |
|---|---|---|---|---|
| in | ✅ | ❌ | ❔ | ❌ |
| copy | ✅ | ❌ | ❔ | ❌ |
| inout | ✅ | ❌ | ❔ https://github.com/hsutter/cppfront/pull/198 | ❌ |
| out | ✅, ❌ wildcards, unnamed function cannot have out param (alpha limitation) | ❌ | ✅ | ✅ |
| move | ✅ | ✅ | ✅ | ❌ (implicitly move-out) |
| forward | ✅ | ✅ | ✅ | ✅ |

Additionally, there is some common ground for arguments and local variables. Looking from intention, how you want to use the variable:

From that perspective, it has the same meaning for local variables and function arguments, right? It gives me another thought: currently out means that the variable needs to be initialised before it will be used. There is no way to specify that I want only to write to that variable. I think being able to specify read/write/readwrite is important - e.g., some hardware has addresses that you can only read, only write or read and write from. It would be good if the language could specify that as well.

copy, move, and forward are probably only for passing variables to and from functions, correct?

I think the current out should be renamed, e.g., to uninitialized, to emphasize that it has to be initialized before use, which is closer to how it behaves now.

filipsajdak commented 1 year ago

I have watched a Timur Doumler talk, "How C++23 changes the way we write code", where he mentioned Bjarne Stroustrup (https://youtu.be/QyFVoYcaORg?t=1250):

It's a good feature proposal if it solves at least two unrelated problems simultaneously.

So maybe argument passing can be adjusted to make them useful also for defining local variables?

jcanizales commented 1 year ago

I would like to hear @hsutter 's thoughts on this unification of constexpr, consteval, and constinit before closing the suggestion.

AbhinavK00 commented 1 year ago

I think I need to make some changes to this. Once I come up with a better version of this, I'll reopen this or open another one.