hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler

[SUGGESTION] Typed Expressions; generalized constructors, UDLs, unnamed variables and functions #463

Closed msadeqhe closed 1 year ago

msadeqhe commented 1 year ago

This is an alternative solution to this issue. If you don't like the ...TYPE notation for object construction because of its UDL syntax, this suggestion uses the familiar notation EXPR:TYPE (aka Typed Expression) for object construction instead, which is similar to how Cpp2 programmers already write types within declarations.

Consider EXPR:TYPE as syntactic sugar for (: TYPE = EXPR). For example:

Abc: type;
Xyz: type;
func: (u: int) -> Abc;

a: int = 2;
b: = 2:int;       // (: int = 2)
c: = a:Abc;       // (: Abc = a)
d: = ():Abc;      // Default Constructor
e: = (a + b):Abc; // (: Abc = a + b)
r: = func(a):Abc; // (: Abc = func(a))
s: = func(2):Abc.start():Xyz.value; // Function Chaining
t: = a++:Abc;     // (: Abc = a++)
u: = a:Abc++;     // (: Abc = a)++
x: = 2:int:Abc;   // (: Abc = (: int = 2)) Constructor Chaining
y: = (2:int + 4:int):Abc; // (: Abc = (: int = 2) + (: int = 4))
z: = ("text", 2:int):Abc; // (: Abc = ("text", (: int = 2)))
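
To make the desugaring a bit more concrete, here is a rough Cpp1 sketch of a few of the lines above. It assumes EXPR:TYPE would lower to direct construction TYPE{EXPR}, which is only my reading of the suggestion. The members of `Abc` and `Xyz` and the function `demo_lowering` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for `Abc` and `Xyz`; their members are invented here.
struct Xyz {
    int value;
    explicit Xyz(int v) : value(v) {}
};

struct Abc {
    int n = 0;
    Abc() = default;                          // target of ():Abc
    explicit Abc(int v) : n(v) {}             // target of a:Abc
    Abc(std::string const&, int v) : n(v) {}  // target of ("text", 2:int):Abc
    int start() const { return n + 1; }       // only here to show chaining
};

// Each line pairs the assumed lowering with the Cpp2 spelling it stands for.
inline int demo_lowering() {
    int a = 2;
    auto b = int{2};                     // b: = 2:int;
    auto c = Abc{a};                     // c: = a:Abc;
    auto d = Abc{};                      // d: = ():Abc;
    auto e = Abc{a + b};                 // e: = (a + b):Abc;
    auto x = Abc{int{2}};                // x: = 2:int:Abc;
    auto z = Abc{"text", int{2}};        // z: = ("text", 2:int):Abc;
    auto s = Xyz{Abc{a}.start()}.value;  // s: = a:Abc.start():Xyz.value;
    assert(c.n == 2 && d.n == 0 && e.n == 4 && x.n == 2 && z.n == 2);
    return s;
}
```

The chained line reads left to right in the Cpp2 spelling, while the Cpp1 spelling nests inside out, which is the readability argument made for the notation.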

Put simply, if x:int is at the start of a statement or is a function parameter, it is a declaration; otherwise it is a Typed Expression.

Abc: type;

// `x: Abc` is a parameter declaration.
func: (x: Abc) = {}

// `a: int` is a declaration.
a: int = 2;

// `a:Abc` is a Typed Expression, it calls the constructor of `Abc`.
m: = a:Abc;

I have to explain that : within SOMETHING:TYPE is for object construction (as an expression) or declaration (as a statement), but :: within SOMETHING::TYPE is the scope resolution operator for qualified names. They can be combined, as in 10++ : my::Type. Also, after the object is constructed, we can use operator dot, operator(), operator[], and so on to access its members, e.g. 10:Type.call() or 10:Type[0].

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?

No.

Will your feature suggestion automate or eliminate X% of current C++ guidance literature?

Yes.

  1. It unifies constructors with UDLs. They are semantically the same. Both of them create a new object.
    1. It's useful in generic programming.
    2. It reduces concept count.
      • Novice programmers don't need to learn a distinct concept about UDLs.
      • All types benefit from UDL-like syntax. There's no need to declare a UDL for them.
      • It eliminates the need to understand and learn the built-in prefixes and suffixes for literals.
    3. The syntax of calling constructors will be expressive and readable.
  2. It distinguishes constructors from regular function calls. They are semantically different.
    • Constructors:
      • EXPR:TYPE, parentheses are not necessary when EXPR has operators with higher precedence.
      • ():TYPE, it calls the default constructor
      • (args...):TYPE
    • Regular Function Calls:
      • FUNCTION(), it calls a function without arguments
      • FUNCTION(args...)
      • obj.FUNCTION()
      • obj.FUNCTION(args...)
  3. They can be chained together, whereas it's not possible with UDLs in Cpp1.
    • Only one UDL can be applied to a literal in Cpp1.
  4. Constructors already can be templated, but UDLs cannot be templated.
    • UDL templates are not supported in Cpp1.
  5. It removes built-in literal prefixes and suffixes. They are inconsistent and redundant.
    1. They are visually inconsistent.
      • Some of them are prefix.
      • Some of them are suffix.
    2. Their behaviours are inconsistent when the value of a literal exceeds the range of its type, as described in this comment.
  6. The name to construct a literal and to declare a variable will be consistently the same.
    • It's not needed to declare a new name for literal suffixes.
    • The names of types act like suffixes that construct an object.
  7. They can be applied to literals with a qualified name (if they are within namespaces), unlike UDLs, which need a using statement before they can be applied to literals.
    • That's why UDLs in Cpp1 have to be prefixed with _, so that they are distinguished from UDLs declared in the Cpp1 standard library.
  8. Unlike TYPE(args), it intentionally doesn't work with UFCS. UFCS should not work on constructors, as described in this comment. Compare:

    // `TYPE(args)` with UFCS on it.
    x: = 10.Type(10, 20);
    
    // `(args):TYPE`
    y: = (10, 10, 20):Type;

Describe alternatives you've considered.

These are alternative solutions:

Thanks.

msadeqhe commented 1 year ago

EXPR:TYPE is similar to ...TYPE suggestion, except with the following advantages:

  • It's familiar and similar to declaration syntax in which types are specified in the language.
  • It's easier to parse, but ...TYPE would complicate the grammar especially for working within function chaining.
  • It doesn't need parentheses for simple expressions with unary postfix operators (e.g. a++:Abc). It depends on operator precedence.

And this notation has the following disadvantages:

  • It requires an extra :, although whether that is a drawback is a matter of opinion.

Also, this suggestion won't add any new syntax to the language. It uses the existing syntax SOMETHING : TYPE, but as an expression. Currently in Cpp2 we use the : TYPE = ... expression to create unnamed variables, but typed expressions won't conflict with that, because unnamed variables don't have a left side of : and they have an extra =.

msadeqhe commented 1 year ago

In a nutshell:

  • ID : TYPE would be a declaration if it's at the start of a statement.
    • We use ID : TYPE to specify the type of a declaration.
  • EXPR : TYPE would be a typed expression if it's not at the start of a statement.
    • We use EXPR : TYPE to specify the type of an expression.
  • : TYPE = SOMETHING would be an unnamed variable.

AbhinavK00 commented 1 year ago

I like the other suggestion better, I think. Would this be context-free?

msadeqhe commented 1 year ago

Thanks. Yes, it would be context-free, because the behaviour of SOMETHING : TYPE depends on its placement:

// It's a declaration.
something: Type;

// It's a typed expression.
call(something: Type);

// It's a parameter declaration.
call: (something: Type) = {}

msadeqhe commented 1 year ago

Briefly:

// `A` is a declaration.
A: Type = 0;

// `B` is a declaration.
// `X` is a typed expression.
B: Type = X: Type;

JohelEGP commented 1 year ago

A typed expression can subsume a UDL better than UFCS.

Let's consider the type std::chrono::year:

date := 1970y/January/1; // UDL after `using namespace std::chrono_literals;`
date := 1970.y()/January/1; // UFCS after `using namespace ufcs_literals;` (see next code block).
date := 1970:y/January/1; // Direct construction after `using namespace typed_expr_literals;` (see next code block).
date := 1970:year/January/1; // This suggestion after `using std::chrono::year;`
date := 1970:std::chrono::year/January/1; // Ugly, but possible.

As you can glean from the comments, a typed expression uses direct construction, whereas UFCS implies an extra object in-between, the function parameter.

Just switch from having a _literals namespace with Cpp1 UDLs to type aliases.

typed_expr_literals: namespace = {
y: type == std::chrono::year; // Direct construction.
}
ufcs_literals: namespace = {
y: (i: int) -> _ = std::chrono::year(i); // Indirect construction through `i`.
}

For an alias template (e.g., to replace the UDLs 1s and 1.0s), compiler support is still in the works.
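
For comparison, the three spellings can be tried in today's Cpp1. This is only a sketch: it swaps std::chrono::year for std::chrono::seconds so that it compiles as C++14, and it assumes that 42:s would lower to direct construction s{42}. The namespace names mirror the ones above, and the function `all_spellings_agree` is hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Mirrors `typed_expr_literals`: a type alias, so `s{42}` constructs the
// duration directly.
namespace typed_expr_literals {
using s = std::chrono::seconds;
}

// Mirrors `ufcs_literals`: a function, so the value first passes through
// the parameter `i` before the duration is constructed.
namespace ufcs_literals {
inline std::chrono::seconds s(int i) { return std::chrono::seconds(i); }
}

inline bool all_spellings_agree() {
    using namespace std::chrono_literals;
    auto udl = 42s;                            // Cpp1 UDL
    auto fn = ufcs_literals::s(42);            // function call, as UFCS would do
    auto direct = typed_expr_literals::s{42};  // assumed lowering of 42:s
    assert(udl == fn);
    return fn == direct;
}
```

All three produce the same value; the difference is only in how many objects sit between the literal and the result.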

JohelEGP commented 1 year ago

From our experience at https://github.com/mpusz/units, which has quantity references (e.g., 1 * m for 1 m), one point in favor of UDLs over some of the things that try to replace them (e.g., UFCS) is that bringing a UDL into scope doesn't take up the symbol in the UDL (because its actual name is operator""𝘴𝘺𝘮𝘣𝘰𝘭). So, for example, with such replacements, something like using namespace unit_literals; introduces conflicts with variables, especially when formulas are involved. From Quantity References vs Unit-specific Aliases:

  1. Shadowing issues

    • Quantity References

      References occupy a pool of many short identifiers which sometimes shadow the variables, function arguments, or even template parameters provided by the user or other libraries. This results in warnings being generated by some compilers. The most restrictive here is MSVC, which for example emits a warning about shadowing the N template parameter for an array size provided in a header file when the Newton unit is included via a namespace declaration in the main() program function (see experimental_angle <https://github.com/mpusz/units/blob/master/example/references/experimental_angle.cpp>). In other cases the user is forced to rename local identifiers so that they do not collide with predefined references (see capacitor_time_curve <https://github.com/mpusz/units/blob/master/example/references/capacitor_time_curve.cpp>).

    • Unit-specific Aliases

      As aliases are defined in terms of types rather than variables, no major shadowing issues were found so far. In case of identifier ambiguity, it was always possible to disambiguate with more namespaces prefixed in front of the alias.

As seen from "Unit-specific Aliases", a typed expression isn't affected by this. But UFCS, which works on functions (which behave like variables here), possibly is.
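
The point that a UDL doesn't take up its symbol can be seen with the standard chrono literals. This is a minimal sketch; the function `scaled_delay` is hypothetical.

```cpp
#include <cassert>
#include <chrono>

inline std::chrono::seconds scaled_delay() {
    using namespace std::chrono_literals;  // brings operator""s into scope
    int s = 3;                             // a local named `s` is still fine
    auto delay = 2s * s;                   // suffix `s` next to variable `s`
    assert(delay.count() == 6);
    return delay;
}
```

A function or variable named `s` brought in by a using-directive would instead compete with the local `s` for the same name.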

See also UDLs vs Quantity References for many major pain points against UDLs. One that applies to UDL and UFCS, but not typed expression:

JohelEGP commented 1 year ago

I expanded on the quote above. A typed expression isn't actually affected, but UFCS probably is.

Allow me to expand the summary's table:

| Feature | Aliases | References | Typed expression | UDLs |
| --- | --- | --- | --- | --- |
| Literals and variables support | Yes | Yes | Yes | Literals only |
| Preserves user provided representation type | Yes | Yes | Yes | No |
| Explicit control over the representation type | Yes | No | Yes | No |
| Possibility to resolve ambiguity | Yes | Yes | Yes | No |
| Readability | Good | Medium | Good | Good |
| Hard to resolve shadowing issues | No | Yes | No | No |
| Operators precedence issue | No | Yes | No | No |
| Controlled verbosity | Yes | No | Yes | No |
| Easy composition for derived units | No | Yes | Yes | No |
| Simplified quantity casting | Yes | No | Yes | No |
| Implementation and standardization effort | Medium | Lowest | Medium | Highest |
| Compile-time performance | Fastest | Medium | Fastest | Slowest |

As you can see, a typed expression is the best in almost all aspects. For details on the "feature", please see the linked documentation.

JohelEGP commented 1 year ago

@mpusz You may be interested in looking at this. In particular, the 3 comments above.

msadeqhe commented 1 year ago

@JohelEGP Thanks for explaining about indirect construction of UDL and UFCS, I wasn't aware of it.

From our experience at https://github.com/mpusz/units, which has quantity references (e.g., 1 * m for 1 m), ...

I changed the example to have typed expressions:

// simple numeric operations
static_assert(10:km / 2 == 5:km);

// unit conversions
static_assert(1:h == 3'600:s);
static_assert(1:km + 1:m == 1'001:m);

_s: = 1:s;
kmph:type == decltype(1:km / _s);

// dimension conversions
static_assert(1:km / 1:s == 1'000:m / _s);
static_assert(2:kmph * 2:h == 4:km);
static_assert(2:km / 2:kmph == 1:h);
static_assert(2:m * 3:m == 6:m2);
static_assert(10:km / 5:km == 2);
static_assert(1'000 / 1:s == 1:kHz);

For this to work, I think that unit types may have an extra template parameter to indicate the prefix. For example metre<1'000> is equal to kilometre.
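
A minimal Cpp1 sketch of that idea, assuming the extra template parameter is simply the number of base metres per count. The type `metre`, its member `in_base`, the namespace `prefix_sketch` and the function `total_metres` are all hypothetical and are not how mp-units models prefixes.

```cpp
#include <cassert>

namespace prefix_sketch {

// `Scale` says how many base metres one count is worth.
template <long Scale>
struct metre {
    long count;
    constexpr long in_base() const { return count * Scale; }
};

using m  = metre<1>;
using km = metre<1000>;  // metre<1'000> plays the role of kilometre

// Compare and add across prefixes by converting to base metres first.
template <long A, long B>
constexpr bool operator==(metre<A> x, metre<B> y) {
    return x.in_base() == y.in_base();
}

template <long A, long B>
constexpr metre<1> operator+(metre<A> x, metre<B> y) {
    return metre<1>{x.in_base() + y.in_base()};
}

// Cpp1 spelling of: static_assert(1:km + 1:m == 1'001:m);
static_assert(km{1} + m{1} == m{1001}, "1 km + 1 m is 1001 m");

inline long total_metres() {
    auto sum = km{1} + m{1};
    assert(sum == m{1001});
    return sum.count;
}

}  // namespace prefix_sketch
```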

JohelEGP commented 1 year ago

Don't worry. The library has taken care of all that. In fact, it already has those alias templates! So if typed expressions made it to Cpp2, that example would compile right away (not on Clang yet).

JohelEGP commented 1 year ago

From commit 0982b8ec41f7ed880d748269c7732d56228a4a19:

Note that : continues to be pronounced "is a"... e.g., f: () -> int is pronounced as "f is a function returning int," v: vector<int> as "v is a vector," this: Shape as "this object is a Shape."

That works well for named declarations. What about expressions with :?

Commit 1090a31a39536066d9c01b6dc69bf7c598ff79a2 also enabled : std::vector = (5,1). Let's consider it together with 42:seconds.

: std::vector = (5,1) can be pronounced as "the vector $(5, 1)$", and 42:seconds and 42:s can be pronounced as "42 seconds".

msadeqhe commented 1 year ago

Good point. So it would be like to pronounce:

In general:

msadeqhe commented 1 year ago

Assignment to Typed Expression:

// `A` is a declaration.
A: Type = 0;

// `B` is a declaration.
// `A` is a typed expression.
B: Type = A: Type;

// It's equal to:
//      = Type::operator=(out this, B).operator=(something)
C: Type = B: Type = something;
//  (2 + 2): Type = something;
// x++*.f(): Type = something;

// It's equal to:
//      = Type::operator=(out this, something)
D: Type = : Type = something;

It would be safe to disallow assignment in case 3, because B: Type is an rvalue:

// ERROR: `B: Type` is an rvalue.
C: Type = B: Type = something;
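
For reference, Cpp1 accepts assignment to a class rvalue by default; it has to be opted out of with a ref-qualified operator=. The sketch below shows one way to get the "ERROR" behaviour in Cpp1. The struct `Type`, the namespace `rvalue_sketch` and the function `assign_to_lvalue` are hypothetical, and this says nothing about how cppfront would implement the check.

```cpp
#include <cassert>
#include <type_traits>

namespace rvalue_sketch {

struct Type {
    int v = 0;
    Type() = default;
    explicit Type(int x) : v(x) {}
    // The trailing `&` restricts both assignments to lvalues.
    Type& operator=(Type const&) & = default;
    Type& operator=(int x) & { v = x; return *this; }
};

static_assert(std::is_assignable<Type&, int>::value, "lvalue target: allowed");
static_assert(!std::is_assignable<Type, int>::value, "rvalue target: rejected");
static_assert(!std::is_assignable<Type, Type>::value, "rvalue target: rejected");

inline int assign_to_lvalue() {
    Type b{1};
    b = 5;           // fine: `b` is an lvalue
    // Type{1} = 5;  // would not compile: the temporary is an rvalue
    assert(b.v == 5);
    return b.v;
}

}  // namespace rvalue_sketch
```

The constructor is explicit on purpose: with an implicit one, `Type{1} = 5` could otherwise be reached through an int-to-Type conversion and an unqualified assignment.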

I'm going to categorize them.

They all have a similar syntax but semantically they are different in this way:

  1. Variable Declarations: they are statements.
    • something: Type;
    • something: Type = value;
  2. Unnamed Variables: they are expressions.
    • (: Type = value)
  3. Typed Expressions: they are expressions.
    • (something: Type)
    • (something: Type /*unary postfix operators*/)
    • For example:
      • (something: Type = value)
      • (something: Type++)
      • (something: Type.member...)

Unnamed Variables are a special Typed Expression.

So we can think of Unnamed Variables as a special kind of Typed Expression. This is a generalized syntax for both of them:

(something: Type = value)

Unnamed Variables don't have the something part, therefore the variable won't be initialized with something, instead it will be initialized with value, because value is the first assignment, and the first assignment is initialization.

After that, they would be categorized in this way:

  1. Variable Declarations: they are statements.
    • something: Type;
    • something: Type = value;
  2. Typed Expressions: they are expressions.
    • (something: Type)
    • (something: Type /*unary postfix operators*/)
    • For example:
      • (something: Type = value)
      • (something: Type++)
      • (something: Type.member...)
    • (: Type = value) aka Unnamed Variables
    • (: Type) is invalid, because it's an uninitialized variable which is immediately used.

This categorization will reduce concept count.

msadeqhe commented 1 year ago

I'm trying to find a general rule to reduce concept count.

Also, Unnamed Functions could be a special kind of Typed Expression if Cpp2 supported the suggestion in issue #391, titled "Statement-expressions, result vs return". In general, it would then be possible to have a block statement after the assignment operator. Let's look at the syntax of a typed expression with a function type and assignment in this case:

(something: (args) -> Type = { /*statements*/ })

Unnamed Functions don't need the something part, because the something part is for passing arguments to the function (see next comment), and the function body cannot come before the parameter declarations. If we omit the something part, the result is a function object; otherwise the function is called immediately. So an assignment is needed to define the function body.

After that, they would be categorized in this way:

  1. Variable/Function Declarations: they are statements.
    • something: Type;
    • something: Type = value;
    • (NEW) something: Type = { /*statements*/ }
  2. Typed Expressions: they are expressions.
    • (something: Type)
    • (something: Type /*unary postfix operators*/)
    • For example:
      • (something: Type = value)
      • (NEW) (something: Type = { /*statements*/ })
      • (something: Type++)
      • (something: Type.member...)
    • (: Type = value) aka Unnamed Variables
    • (NEW) (: (args) -> Type = { /*statements*/ }) aka Unnamed Functions
    • (: Type) is invalid, because it's an uninitialized variable which is immediately used.

I have to clarify about the syntax (described above) of typed expressions:

So this example won't be allowed:

// WRONG! This typed expression applied to a statement block.
{ /*statements*/ }:(args) -> Type

msadeqhe commented 1 year ago

I have to clarify about the syntax (described above) of typed expressions if Cpp2 could support issue #391:

  • The type is not restricted. It can be either:
    • a variable type (e.g. Type).
    • or a function type (e.g. (args) -> Type).
  • But something cannot be a statement (e.g. { /*statements*/ }).
    • Because Typed Expressions can only be applied to expressions.
  • If the type of typed expression is a function type:
    • If it has something, the function would be immediately called.
    • It must have an assignment.

If they have something, in this case the function would be called immediately with arguments:

(something: (args) -> Type = { /*statements*/ })

For example:

// It immediately calls the function with arg=1.
1: (arg: int) -> int = { return arg + 2; }

// It immediately calls the function with a=1, b=2.
(1, 2): (a: int, b: int) -> int = { return a + b; }

JohelEGP commented 1 year ago

// This function has no definition.
func: (args) -> Type;

// The first assignment is definition.
func = { /*statements*/ }

I don't think this makes sense.

At a namespace-scope declaration:

At a function-scope declaration: It seems like Herb intends to support local functions somehow: https://cpp2.godbolt.org/z/59Yqb3xvz. Actually, not yet: https://github.com/hsutter/cppfront/issues/386#issuecomment-1516942691.

msadeqhe commented 1 year ago

Yes, you're right. I will correct this paragraph from my comment.

Unnamed Functions don't have the something part, therefore the function won't be defined with something, instead it will be defined with { /*statements*/ }, because { /*statements*/ } is the first assignment, and the first assignment is definition:

// This function has no definition.
func: (args) -> Type;

// The first assignment is definition.
func = { /*statements*/ }

Because the something part is for passing arguments to the function.

Edit

Thanks @JohelEGP I've removed that misleading information from my comment.

msadeqhe commented 1 year ago

If they have something, in this case the function would be called immediately with arguments:

(something: (args) -> Type = { /*statements*/ })

For example:

// It immediately calls the function with arg=1.
1: (arg: int) -> int = { return arg + 2; }

// It immediately calls the function with a=1, b=2.
(1, 2): (a: int, b: int) -> int = { return a + b; }

I gave up on this idea. Unnamed functions shouldn't be immediately called in this way, because it's inconsistent with how unnamed variables work.

mpusz commented 1 year ago

@JohelEGP you provided "Yes" in the table for "Easy composition for derived units". How is it possible with typed expressions?

JohelEGP commented 1 year ago

As a replacement to UDLs, typed expressions build on type aliases/alias templates, so they should be the same. At the time, I probably thought that the alias used for example in the "Composition for unnamed derived units" bullet could be defined using decltype and expressions.

mpusz commented 1 year ago

Easy composition for derived units and quantities does not mean that you can decltype some result to define the type, but the fact that you do not have to define it at all. We do not want to end up with hundreds of different variations of types for units of a single derived quantity. For example, consider how many predefined types for units of angular momentum besides kilogram_metre_sq_per_second would be needed to make everyone happy. That is not easy to compose (and standardize) at all.

JohelEGP commented 1 year ago

I understand. V2 makes that row redundant, right? It has no unit downcasting, and unit composition is transformed from kilogram_metre_sq_per_second to ~derived_unit<square<kilogram, metre>, per<second>>, and from kilometre_per_hour to ~derived_unit<kilo<metre>, per<hour>>. So the bullet under the title "Composition for unnamed derived units", which for "Quantity References" says "References have only to be defined for named units." would be true for aliases, too.

msadeqhe commented 1 year ago

Somehow if multiplication of units could be modeled as template template parameters, we would have:

// Type aliases
kg: <T> type == com<kg_type, T>;
m2: <T> type == com<m2_type, T>;

// 10:kg:m2 is com<m2_type, com<kg_type, int>>
a: = 10:kg:m2;

But there isn't any notation better than operator/ for division:

1:N:m == 1:kg:m2 / 1:s2

1:J / 1:mol:K == 1:m2:kg / 1:s2:mol:K

mpusz commented 1 year ago

I do not think V2 makes it redundant. V2 provides a solution that gathers the best features from all the options we had before. In V2 we have units that "have only to be defined for named units", and the "unnamed" derived units are obtained by applying unit equations to the predefined ones. In V2 the user never types derived_unit<kilogram, square<metre>, per<second>> but writes kg * m2 / s (or si::kilogram * square<si::metre> / si::second) to get it. You can easily put those into a quantity type as well: quantity<kg * m2 / s>. That is the power of composition, where you have to predefine only a few named units to be able to obtain an "infinite" number of derived unnamed ones.

Unit-specific aliases in V1 point to quantity types rather than units, so we can't obtain a derived unit or quantity_spec type by equations. I think this is also true for typed expressions.
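
To illustrate what "obtaining derived units by equations" means, here is a toy Cpp1 sketch. It is not the mp-units implementation: it assumes a unit can be encoded as the exponents of kilogram, metre and second, and every name in it (`units_sketch`, `unit`, `quantity`, `torque`) is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

namespace units_sketch {

// A unit is identified by the exponents of kilogram, metre and second.
template <int Kg, int M, int S>
struct unit {};

// Multiplying or dividing unit values adds or subtracts the exponents.
template <int K1, int M1, int S1, int K2, int M2, int S2>
constexpr unit<K1 + K2, M1 + M2, S1 + S2> operator*(unit<K1, M1, S1>,
                                                    unit<K2, M2, S2>) {
    return {};
}

template <int K1, int M1, int S1, int K2, int M2, int S2>
constexpr unit<K1 - K2, M1 - M2, S1 - S2> operator/(unit<K1, M1, S1>,
                                                    unit<K2, M2, S2>) {
    return {};
}

// Only a few named units are predefined; the rest come from equations.
constexpr unit<1, 0, 0> kg{};
constexpr unit<0, 1, 0> m{};
constexpr unit<0, 0, 1> s{};
constexpr auto m2 = m * m;
constexpr auto s2 = s * s;
constexpr auto N = kg * m / s2;

template <class Unit, class Rep>
struct quantity {
    Rep value;
};

// `2 * (N * m)` builds a quantity of the unnamed derived unit.
template <int K, int M, int S>
constexpr quantity<unit<K, M, S>, int> operator*(int v, unit<K, M, S>) {
    return {v};
}

// N * m and kg * m2 / s2 name the same unit without either being predefined.
static_assert(std::is_same<decltype(N * m), decltype(kg * m2 / s2)>::value,
              "both equations yield the same derived unit");

constexpr auto torque = 2 * (N * m);

}  // namespace units_sketch
```

A type alias per unit, which is what a typed expression builds on, would need one alias for every such combination.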

JohelEGP commented 1 year ago

1:N:m

It doesn't seem possible for that, or the equivalent m(N(1)), to mean 1 Newton metre.

msadeqhe commented 1 year ago

What if its type is:

m: <T> type == Comp<m_type, T>;
N: <T> type == Comp<N_type, T>;

1:N:m is Comp<m_type, Comp<N_type, int>>
1:N:m:kg is Comp<kg_type, Comp<m_type, Comp<N_type, int>>>

And Comp is the underlying type of all units. Each derived unit is a composition of two units, but each base unit is a composition of itself and int.

JohelEGP commented 1 year ago

Sorry, I was too brief in my reply.

That certainly works. But it would be another library altogether.

One of the points of mp-units is

  1. The best possible user experience

    • compiler errors
    • debugging

-- https://mpusz.github.io/units/introduction.html#approach

The nesting required to make this work is suboptimal. Another thing is that the types of 1:N:m and 1:m:N would be different just due to the placement of the units.

I tried to make it work without disrupting the design. What I found out is that it doesn't seem possible for m(N(0)) to result in a type like quantity<derived_unit<metre, Newton>, int>: https://cpp2.godbolt.org/z/YbEh4aPj9.
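
The order dependence mentioned above can be checked with a small Cpp1 sketch of the proposed `Comp` nesting. The definition of `Comp` and the namespace `comp_sketch` are assumptions made for illustration; only the nesting shape comes from the earlier comment.

```cpp
#include <cassert>
#include <type_traits>

namespace comp_sketch {

struct N_type {};
struct m_type {};

// Hypothetical `Comp`: wraps a unit tag around an inner representation.
template <class Unit, class Inner>
struct Comp {
    Inner inner;
    explicit constexpr Comp(Inner i) : inner(i) {}
};

// Alias templates standing in for `N: <T> type == Comp<N_type, T>;` etc.
template <class T> using N = Comp<N_type, T>;
template <class T> using m = Comp<m_type, T>;

using newton_metre = m<N<int>>;  // what 1:N:m would produce
using metre_newton = N<m<int>>;  // what 1:m:N would produce

static_assert(std::is_same<newton_metre,
                           Comp<m_type, Comp<N_type, int>>>::value,
              "1:N:m nests N inside m");
static_assert(!std::is_same<newton_metre, metre_newton>::value,
              "swapping the units changes the type");

inline int demo() {
    newton_metre q{N<int>{1}};  // two constructions for one value
    return q.inner.inner;
}

}  // namespace comp_sketch
```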

JohelEGP commented 1 year ago

Another point against alias chaining is the extra construction per type. For example, 0:uₙ:uₙ₋₁:…:u₀ and u₀(…(uₙ₋₁(uₙ(0)))…) perform $n-1$ extra quantity constructions vs. 0 * (uₙ * uₙ₋₁ * … * u₀), which computes the final quantity from the rhs once and the outer * performs the quantity construction once (plus a parameter construction).

I have to say that the readability and composability of your example is superb:

1:N:m == 1:kg:m2 / 1:s2

1:J / 1:mol:K == 1:m2:kg / 1:s2:mol:K

JohelEGP commented 1 year ago

A typed expression can subsume a UDL better than UFCS.

This still stands.

Here are some examples of chained typed expressions that work well from #284:

20'percent'bottle'water
5'000'gram'apple
1'h'worker

1:N:m (1 Newton metre)

It's unfortunate that to make chaining work for units one has to add an explicit constructor that doesn't actually make sense by itself. What does it mean to construct a quantity of metres from Newtons?

I've left the table of https://github.com/hsutter/cppfront/issues/463#issuecomment-1556225879 untouched, despite typed expressions as a substitute for unit UDLs building on aliases.

I'm thinking that rather than aliasing the existing class template quantity, the aliases intended to be used in a typed expression could be their own entity. Then we have a clean slate to workaround whatever issues we can and integrate them better into the existing design.

Here's my attempt: https://cpp2.godbolt.org/z/3ddev6MMe https://cpp2.godbolt.org/z/abMWoj6eG.

msadeqhe commented 1 year ago

<> for grouping types

Currently we use parentheses to group expressions: (1 + 2) * 3. I suggest using angle brackets <> to group types, making them syntactic sugar for decltype in this way:

2:<N*m> == 2:decltype(2:N * 2:m)
2:<int> == 2:decltype(2:int)

It would make type composition easy. For example:

2:<N*m> == 2:<kg*m2/s2>
1:<J/mol/K> == 1:<m2*kg/s2/mol/K>

Also nested grouping with <> is possible (just like () within expressions):

1:<J/<mol*K>> == 1:<m2*kg/<s2*mol*K>>

It doesn't conflict with <> for template parameters, because it only works in typed expressions. This is similar to how parentheses are for function parameters in declarations but mean grouping within expressions.

If there is an identifier before <>, it would be a template type with template arguments:

2:<Type<int>*int> == 2:decltype(2:Type<int> * 2:int)

That is just like how if there is an identifier before (), it would be a function call with function arguments.

msadeqhe commented 1 year ago

Additionally, to use variable or function names within <>, we may use decltype within <> like this:

a: Type = 0;

2:<decltype(a)*int> == 2:decltype(2:decltype(a) * 2:int)

Unary operators and other combinations are possible, but we use <> for grouping instead of (). For example:

2:<<Abc + Xyz>++ * <Abc + Xyz>++> == 2:decltype((2:Abc + 2:Xyz)++ * (2:Abc + 2:Xyz)++)
2:<Abc < Xyz> == 2:decltype(2:Abc < 2:Xyz)
2:<Abc > Xyz> == 2:decltype(2:Abc > 2:Xyz)

< and > within <> will be parsed similar to how they work within template arguments.

mpusz commented 1 year ago

2:<N*m> == 2:decltype(2:N * 2:m)

This does not work in a generic sense. Even though it is perfectly fine for int as a representation type, it will not work for a linear algebra vector type as multiplying those is generally undefined. You either have a dot or vector product, but both end up with a different type than the inputs.

Probably you mean something like:

2:<N*m> == 2:decltype(N * m)

which may work.

msadeqhe commented 1 year ago

Thanks. Yes, I mean that. In fact, I was thinking about allowing unnamed uninitialized variables within decltype:

// `:N` and `:m` are unnamed uninitialized variables.
2:<N*m> == 2:decltype(:N * :m)

: is not an operator like * or /, so it doesn't denote a mathematical operation in this case; expr:Type creates an instance of Type from expr.

msadeqhe commented 1 year ago

<> for grouping types

Currently we use parentheses to group expressions: (1 + 2) * 3. I suggest using angle brackets <> to group types, making them syntactic sugar for decltype in this way:

2:<N*m> == 2:decltype(2:N * 2:m)
2:<int> == 2:decltype(2:int)

It's also syntactically possible to use () instead of <> without any conflict. That's based on the rule that a typed expression cannot have a function type (except for unnamed functions, where the left side of : doesn't exist), because a function body must come after its signature. So we would have:

a: = 120:A;

// `(A)` is not a function signature, because it's a typed expression.
a: = 120:(A);

// But `(A)` is a function signature, because it's a declaration.
a: (A) = ...

Examples:

2:int == 2:(int)

2:(N*m) == 2:(kg*m2/s2)
1:(J/mol/K) == 1:(m2*kg/s2/mol/K)

1:(J/(mol*K)) == 1:(m2*kg/(s2*mol*K))

a: Type = 0;

2:(decltype(a)*int)

2:((Abc + Xyz)++ * (Abc + Xyz)++)
2:(Abc < Xyz)
2:(Abc > Xyz)

Although () is more readable than <>, <> always means the same thing, in contrast to ().

JohelEGP commented 1 year ago

<> for grouping types

[...]

It would make type composition easy. For example:

2:<N*m> == 2:<kg*m2/s2>

That certainly works in favor of unit libraries. But I worry about the feature not being more generally useful. Can we think of more use cases?

It can also work for C++ standard library range piping when the pipes don't have input:

algo: <R> (r: R) requires std::ranges::range<R> = {
  rng: namespace == std::ranges;
  return r:<rng::filter_view|rng::join_view>;
}

Of course, the standard syntax r | filter | join is more general.

Also it's syntactically possible to use () instead of <> without any conflict

I was going to suggest that for the inner <>s, e.g., 1:<J/<mol*K>> -> 1:<J/(mol*K)>. Because identifiers within the <> after the colon in a typed expression are already types. Parentheses would also work after the colon, but I worry we might be overloading them too much in Cpp2.

Thanks. Yes, I mean that. In fact, I was thinking about allowing unnamed uninitialized variables within decltype:

// `:N` and `:m` are unnamed uninitialized variables.
2:<N*m> == 2:decltype(:N * :m)

That'd be a good shorthand for

decltype(std::declval<decltype(2:N)>() * std::declval<decltype(2:m)>())
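
In Cpp1 today, that std::declval spelling can be wrapped in an alias template, which is roughly what <N*m> would abbreviate. A minimal sketch, where `product_t`, the namespace `declval_sketch` and the tag types `N`, `m` and `Nm` are hypothetical:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace declval_sketch {

// Hypothetical unit-like tag types with a product defined between them.
struct N {};
struct m {};
struct Nm {};
constexpr Nm operator*(N, m) { return {}; }

// product_t<A, B> is the type of `a * b` for values of types A and B,
// computed without constructing either operand.
template <class A, class B>
using product_t = decltype(std::declval<A>() * std::declval<B>());

static_assert(std::is_same<product_t<N, m>, Nm>::value,
              "the product of N and m is Nm");
static_assert(std::is_same<product_t<int, double>, double>::value,
              "also works for built-in types");

}  // namespace declval_sketch
```

Because no operand is constructed, this also covers types that are not default-constructible, which is the case the `:N * :m` shorthand would have to handle.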

msadeqhe commented 1 year ago

You're right, () already works in expressions, so it is reasonable to use it for the inner <>s.

For a more general use case, it seems <> can be used within declarations without any conflict with template parameters:

// When there is one <>, it's for type composition.
variable1: <A*B++> = /*expression*/;

// That's because the following is already an error in Cpp2 if `T` is a template parameter:
// ERROR! `T` is not a declared type! Also `T` cannot be a template parameter.
variable2: <T> = /*expression*/;

// Instead, it has to be declared like the following (already works):
variable3: <T> T = /*expression*/;

// When there are two <>, always:
// - The first <> is for template parameters.
// - The second <> is for type composition.
variable4: <T> <A*B/T> = /*expression*/;

// The <> before () is always for template parameters.
function1: <T> (a: <A*T>, b: <B*T>) -> <A*B> = { /*statements*/ }

// OK: The type of template parameter `v` is `<A*B>`.
function2: <v: <A*B>> () = { /*statements*/ }

// OK: The type of template parameter `v` is the templated type `std::vector<T>`.
function3: <v: <T> std::vector<T>> () = { /*statements*/ }

So <> would somehow complement decltype:

function1: (a: A, b: B) -> decltype(a*b) = { /*statements*/ }

function2: (a: A, b: B) -> <A*B> = { /*statements*/ }

function3: (a: A, b: B) -> <A*decltype(b)> = { /*statements*/ }

Also, Cpp2 could go further and rename decltype to simply type. That would mean fewer keywords in the language:

function1: (a: A, b: B) -> type(a*b) = { /*statements*/ }

function2: (a: A, b: B) -> <A*type(b)> = { /*statements*/ }

a: A = ();
variable1: type(a) = /*expression*/;

In this way type is for declaring a type, but type(...) is a way to get the type of an expression exactly like decltype. The point is that type is already a keyword in Cpp2.

msadeqhe commented 1 year ago

Briefly it would mean:

a: = ... // It's a variable or function object. It depends on the right hand side of assignment.
b: A = ... // It's a variable.
c: <A> = ... // It's a variable. <A> is a composed type. Currently it's an error in Cpp2.
d: <T> T = ... // It's a variable template.
e: <T> <T> = ... // It's a variable template. The second <T> is a composed type.
f: <T> type(expr) = ... // It's a variable template. `type` is `decltype` here.
g: type(expr) = ... // It's a variable. `type` is `decltype` here.
h: type = ... // It's a type.
i: <T> type = ... // It's a type template.
j: <T> (args...) = ... // It's a function template. The return type is `void`.
k: <T> (args...) -> T = ... // It's a function template.
l: <T> (args...) -> <T> = ... // It's a function template. The return type <T> is a composed type.
m: <T> (args...) -> type(expr) = ... // It's a function template. `type` is `decltype` here.

type means "Type" (itself), but type(expr) means "Type of expression".

In general:

// <A> is a composed type.
n: <A> = ...

// <T> is a template parameter.
// `something` can be either a type, composed type, function type, `type` or `namespace`.
o: <T> something = ... // `o` is a template.

// `something` can be either a type, composed type, function type, `type` or `namespace`.
p: something = ... // `p` is not a template.

JohelEGP commented 1 year ago

Cpp2 could also go further and rename decltype to simply type. That would mean fewer keywords in the language:

C23 got typeof and typeof_unqual. IIUC, they'll be C++ when it's rebased on C23.

AbhinavK00 commented 1 year ago

typeof is not decltype though. It's like decltype but with references removed. So,

typeof :== std::remove_reference_t<decltype(T)>;

decltype would still be used and therefore renaming it could be considered.
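
A small Cpp1 sketch of the distinction drawn above, assuming the reference-stripping reading given in this comment (the alias name `typeof_like` is hypothetical and is not the C23 operator itself):

```cpp
#include <type_traits>

int  value = 0;
int& ref   = value;

// decltype keeps the reference on a reference variable...
static_assert(std::is_same_v<decltype(ref), int&>);
// ...and yields a reference for a parenthesized lvalue expression.
static_assert(std::is_same_v<decltype((value)), int&>);

// Hypothetical stand-in for the behaviour described above:
// the same type, with any reference removed.
template <typename T>
using typeof_like = std::remove_reference_t<T>;

static_assert(std::is_same_v<typeof_like<decltype(ref)>, int>);
static_assert(std::is_same_v<typeof_like<decltype((value))>, int>);
```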

msadeqhe commented 1 year ago

I examined () for type composition within declarations, and it seems () is not needed there at all.

<> for grouping types

Currently we use parentheses to group expressions: (1 + 2) * 3. I suggest using angle brackets <> to group types, making them syntactic sugar for decltype in this way:

2:<N*m> == 2:decltype(2:N * 2:m)
2:<int> == 2:decltype(2:int)
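
As a rough Cpp1 analogue of that equivalence (a sketch only; the unit types `N`, `m` and the product type `Nm` are hypothetical stand-ins, not a real units library):

```cpp
#include <type_traits>
#include <utility>

// Hypothetical unit types named after the thread's N and m.
struct N  { double v; };
struct m  { double v; };
struct Nm { double v; };  // assumed product unit of N and m

constexpr Nm operator*(N a, m b) { return Nm{a.v * b.v}; }

// Analogue of <N*m>: the type produced by multiplying an N by an m.
using N_times_m = decltype(std::declval<N>() * std::declval<m>());
static_assert(std::is_same_v<N_times_m, Nm>);

// Analogue of 2:<N*m>: construct that composed type from the value 2.
constexpr N_times_m torque{2.0};
static_assert(torque.v == 2.0);
```

Note that the composed type is computed from the operand types alone; no `N` or `m` value is ever evaluated.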

It's also syntactically possible to use () instead of <> without any conflict. That's based on the rule that a typed expression cannot be a function type (unless it is an unnamed function, in which case the left side of : doesn't exist), because a function body must come after its signature. So we would have:

a: = 120:A;

// `(A)` is not a function signature, because it's a typed expression.
a: = 120:(A);

// But `(A)` is a function signature, because it's a declaration.
a: (A) = ...

Examples:

2:int == 2:(int)

2:(N*m) == 2:(kg*m2/s2)
1:(J/mol/K) == 1:(m2*kg/s2/mol/K)

1:(J/(mol*K)) == 1:(m2*kg/(s2*mol*K))

a: Type = 0;

2:(decltype(a)*int)

2:((Abc + Xyz)++ * (Abc + Xyz)++)
2:(Abc < Xyz)
2:(Abc > Xyz)

Although () is more readable than <>, <> always means the same thing, in contrast to ().

So type composition within declarations would be like this:

A: type = { /*declarations*/ }
B: type = { /*declarations*/ }
function: <T> (a: A*T, b: B*T) -> A*B = { /*statements*/ }

X: type = { /*declarations*/ }
variable: <T> X*T = /*value*/;

That's because , has lowest precedence and it's not an operator, it's just a separator. So () should not be used for type composition within declarations.

Briefly, it means that:

msadeqhe commented 1 year ago

C23 got typeof and typeof_unqual. IIUC, they'll be C++ when it's rebased on C23.

That's interesting. Also C23 has auto.

decltype would still be used and therefore renaming it could be considered.

I like how Cpp2 already has type keyword, and type(...) could be used to mean decltype(...).

realgdman commented 1 year ago

1:N:m (1 Newton metre) What does it mean to construct a quantity of metres from Newtons?

I agree that's not intuitive from a human-language point of view either. In Cpp2, : generally means is-a, like "f is-a function", and "Newton is-a metre" makes little sense.

JohelEGP commented 1 year ago

1:<N*m> solves that and the $n-1$ constructions of 1:N:m, which translates to m(N(1)).

msadeqhe commented 1 year ago

Yet Another Alternative Solution

What if we first try to fix arg.Type(other_args...) for object construction? I've described the problems of UFCS on types in this comment. If we use (all_args...).Type or expr.Type instead of arg.Type(other_args...), it would look like this:

Abc: type = { /*declarations*/ }

x1: = 1.Abc;   // this suggestion
x2: = 1.Abc(); // in current Cpp2

y1: = (1, 2).Abc; // this suggestion
y2: = (1).Abc(2); // in current Cpp2

z1: = (1, 2, 3).Abc; // this suggestion
z2: = (1).Abc(2, 3); // in current Cpp2

So (all_args...).Type doesn't have the problems of UFCS on types:

Using . instead of : has the advantage that () can be used for type composition, because it wouldn't be ambiguous with function types. <> can still be used instead of ():

10.(N*m) == 10 * 1.kg * 1.m2 / 1.s2

// <> can be used instead of ()
10.<N*m> == 10 * 1.kg * 1.m2 / 1.s2

But this alternative solution has a problem (opinion-based). The problem is that if something is a type, 1.something would call the constructor function, but it doesn't have () like other function calls.
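
To make the mapping concrete, here is a Cpp1 sketch in which a hypothetical helper `construct<T>(args...)` plays the role of `(all_args...).Type`. The `Abc` constructor with three defaulted `int` parameters is an assumption made for illustration.

```cpp
#include <utility>

// Hypothetical type with an assumed constructor taking up to three ints.
struct Abc {
    int a, b, c;
    constexpr Abc(int a_ = 0, int b_ = 0, int c_ = 0) : a(a_), b(b_), c(c_) {}
};

// Hypothetical helper: all arguments are grouped together and the
// type is named once, as in (all_args...).Type.
template <typename T, typename... Args>
constexpr T construct(Args&&... args) {
    return T(std::forward<Args>(args)...);
}

constexpr Abc x1 = construct<Abc>(1);        // 1.Abc
constexpr Abc y1 = construct<Abc>(1, 2);     // (1, 2).Abc
constexpr Abc z1 = construct<Abc>(1, 2, 3);  // (1, 2, 3).Abc

static_assert(z1.a == 1 && z1.b == 2 && z1.c == 3);
```

Unlike `(1).Abc(2, 3)`, no argument is singled out as the "object" of the call, which is the property this alternative is after.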

msadeqhe commented 1 year ago

Now, if we use (all_args...).Type to create an instance of Type with arguments (all_args...), use parentheses around composed types within typed expressions, and use nothing around composed types within declarations, this is what the code would look like:

// Type composition within declarations:
A: type = { /*statements*/ }
B: type = { /*statements*/ }
function: <T> (a: A*T, b: B*T) -> A*B = { /*statements*/ }

point: type = {
    operator=: (out this, a: A, b: B) = { /*statements*/ }
}

main: () = {
    a: = ().A; // It calls the default constructor.
    b: = 12.B; // It calls the constructor `operator=(out this, 12)`.
    c: = 12.int.B; // Also they can be chained.

    // Type composition within typed expressions:
    x: = 10.(N*m) == 10 * 1.kg * 1.m2 / 1.s2;
    y: = 10.(N*m) == 10.(kg*m2/s2);

    z: = (a, b).point;
}

JohelEGP commented 1 year ago

I don't think that's context-free:

Cpp2 strictly avoids this, and never requires sema to guide parsing. When I say "context-free parsing" this is the primary thing I have in mind... the compiler (and the human) can always parse the current piece of code without getting non-local information from elsewhere. -- Extract from https://github.com/hsutter/cppfront/wiki/Design-note%3A-Unambiguous-parsing.

x.b is already valid today and means member access. We'd need to know whether b is a type to know what x.b means.

Even with parentheses, (x).b and (0,1,x).b are valid today.
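
A Cpp1 analogue of the same point, as a sketch (the `Point` type and `demo` function are hypothetical): each of these forms already parses as member access, so a parser could only treat `.b` as construction by first looking up whether `b` names a type.

```cpp
// Hypothetical type with a data member named b.
struct Point { int b; };

inline int demo() {
    Point x{42};
    int first  = x.b;          // plain member access
    int second = (x).b;        // parenthesized object, still member access
    int third  = (0, 1, x).b;  // comma expression yields x, then member access
    return first + second + third;
}
```

A compiler may warn that the left operands of the comma expression have no effect; the expression is still well-formed.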

msadeqhe commented 1 year ago

Yes, it's not context-free, because it's based on UFCS a.Something(args) for types in which Something may be a type or a function. Whereas that syntax is changed to (a, args).Something in which Something may be a type or a variable.

The motivation of this alternative suggestion is that if UFCS a.Something(args) for types is currently acceptable in Cpp2 as a context-free language, (a, args).Something would be acceptable too.

JohelEGP commented 1 year ago

They are not so comparable. The function call func(arg) and construction type(arg) are very similar, even in Cpp1. The member access arg.something and construction something(arg) are way more dissimilar.

msadeqhe commented 1 year ago

That's right, but something(args) itself is not context-free, so of course a.something(args) is not context-free either: something can be either a type or a function.

To clarify, I'm not saying that 1.stuff is better than 1:stuff or 1stuff; it's just another alternative to consider.