c3lang / c3c

Compiler for the C3 language
https://c3-lang.org
GNU Lesser General Public License v3.0

:bulb: Tuples #783

Closed linkdd closed 1 year ago

linkdd commented 1 year ago

This is just an idea (see this reddit thread).

Tuples are useful and are, in my opinion, missing from C.

Proposed syntax

Tuples are a product type, just like structures, but where the fields are not named. So the syntax could simply be:

typedef struct { int; char; } mytuple;

mytuple foo = { 42, 'c' };
int bar = foo.0;
char baz = foo.1;

It could be desugared to:

typedef struct { int field_0; char field_1; } mytuple;

mytuple foo = { .field_0 = 42, .field_1 = 'c' };
int bar = foo.field_0;
char baz = foo.field_1;

Semantics

Like structs, they should have value semantics.

But unlike C structs, you should be able to use them as anonymous structs in parameter lists.

Consider this C example: https://ideone.com/1WlwY7

Here the compiler complains that the anonymous struct in the foo() parameter list is not visible outside the function. Also, two anonymous structs with the same layout are not the same type.
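
The second point can be checked in standard C11 without triggering a compile error. The sketch below uses `_Generic` to ask whether two untagged structs with identical members count as the same type. The names `PairA`, `PairB`, `IS_PAIR_A`, and `check_anonymous_struct_identity` are made up for illustration.

```c
#include <assert.h>

/* Two untagged structs with identical members, declared separately. */
typedef struct { int first; char second; } PairA;
typedef struct { int first; char second; } PairB;

/* Evaluates to 1 if x has type PairA, 0 for any other type. */
#define IS_PAIR_A(x) _Generic((x), PairA: 1, default: 0)

/* Returns 1 when the compiler treats PairA and PairB as distinct types. */
int check_anonymous_struct_identity(void) {
    PairA a = { 42, 'c' };
    PairB b = { 42, 'c' };
    assert(IS_PAIR_A(a) == 1);
    assert(IS_PAIR_A(b) == 0); /* same layout, different type */
    return IS_PAIR_A(a) - IS_PAIR_A(b);
}
```

Because each untagged struct declaration introduces a new type, passing a `PairB` where a `PairA` is expected is rejected, even though the two are laid out identically.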

But I would expect/like this to work:

void foo(struct { int; char; } x) {
  // ...
}

int main(void) {
  foo({ 42, 'c'});

  struct { int; char; } x = { 42, 'c' };
  foo(x);

  return 0;
}

This might require you to change the semantics around anonymous structs and visibility.

lerno commented 1 year ago

You can have a look at #269 for some related thoughts.

lerno commented 1 year ago

There are two concerns really:

  1. Usefulness: I've worked in imperative languages with tuple returns in a code base for about 10 years. I used them twice, and each time it saved two lines of code.
  2. How much extra syntax is needed for the feature. This is discussed in #269. But { int x; int y; } = abc(); is not easily LL(1); (int x, int y) = abc(); is similarly ambiguous; struct { int x; int y; } = abc(); is weird; etc.

(Presumably we also want { x; y; } = abc(); to work with already defined variables. So { int x; } = abc(); is obviously somewhat ambiguous, (x) = abc(); is even worse, etc.)

linkdd commented 1 year ago

  1. It's easier to implement sum types as a library when you have tuples, and they bring a bit of that functional programming stuff into a C-like language.
  2. You could introduce a keyword tuple to avoid ambiguities. I initially proposed struct to avoid this, and re-use what already exists.

As for destructuring, is there a specific reason to not imitate the C++ syntax: https://en.cppreference.com/w/cpp/language/structured_binding ?

I guess this syntax relies heavily on their semantics around auto. But fun fact, C23 will also use auto for type-deduction, see 6.7.9 Type Inference of the draft.

lerno commented 1 year ago

Hmm.. I think

[int a, int b] = foo();
[a, b] = foo();
[Abc a] = foo();

Might all be unambiguous. But that's for tuple destructuring and could just as well have been designed for multiple returns.

As you might see in #269, the idea was for anonymous structs to structurally convert to any other struct, or in fact to any other type that was equivalent (for instance, int[2] and struct { int a; int b; } are structurally equivalent).

However, the somewhat clumsy way to describe the structs inline, made me question whether something like

void foo(struct { int; char; } x) {
  // ...
}

Was actually that useful, or if it was better described as:

struct Foo { int a; int b; }

void foo(Foo x @structural) {
   // ...
}
struct Bar { int g; int j; }

 ...
 Bar b;
 foo(b); // Ok due to @structural

lerno commented 1 year ago

Incidentally this works today:

struct Foo { int a; int b; }

void foo(Foo x) {
   // ...
}
...
foo({ 3, 4 });

This is because there is extensive inference for structs.
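
For comparison, plain C has no such inference: a bare `{ 3, 4 }` is not an expression, so the call site has to name the type with a C99 compound literal. A minimal sketch follows, where `sum_fields` is a hypothetical stand-in for `foo` and `call_with_compound_literal` exists only to show the call site.

```c
#include <assert.h>

typedef struct { int a; int b; } Foo;

/* Hypothetical stand-in for foo(); takes the struct by value. */
int sum_fields(Foo x) {
    return x.a + x.b;
}

int call_with_compound_literal(void) {
    /* C requires the explicit (Foo) prefix on the initializer list. */
    int result = sum_fields((Foo){ 3, 4 });
    assert(result == 7);
    return result;
}
```

Designated initializers work in the same position, for example `sum_fields((Foo){ .a = 1, .b = 2 })`.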

linkdd commented 1 year ago

> But that's for tuple destructuring

The destructuring syntax of C++ also works for structs; the bindings follow the order of field declaration.

> int[2] and struct { int a; int b; } are structurally equivalent

Actually, I don't know about C3, but in C the compiler might add padding between the fields of a struct, so it's not that straightforward.
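
The padding point can be made concrete in C. The sketch below compares `int[2]` against a two-`int` struct and looks at a mixed `char`/`int` struct. The type and function names are made up for illustration, and since exact sizes are implementation-defined, the assertions only cover what the C standard guarantees.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int a; int b; } IntPair;       /* same member types as int[2] */
typedef struct { char tag; int value; } Mixed;  /* usually padded after tag */

/* Bytes of padding inside Mixed: total size minus the sizes of its members. */
size_t mixed_padding_bytes(void) {
    return sizeof(Mixed) - (sizeof(char) + sizeof(int));
}

/* 1 if IntPair happens to have exactly the layout of int[2] on this target. */
int int_pair_matches_array(void) {
    return sizeof(IntPair) == sizeof(int[2])
        && offsetof(IntPair, a) == 0
        && offsetof(IntPair, b) == sizeof(int);
}

void check_layout_guarantees(void) {
    /* The first member always sits at offset 0. */
    assert(offsetof(Mixed, tag) == 0);
    /* A struct is never smaller than the sum of its members. */
    assert(sizeof(Mixed) >= sizeof(char) + sizeof(int));
    /* Each member is aligned for its type, which is what forces padding. */
    assert(offsetof(Mixed, value) % _Alignof(int) == 0);
}
```

On common ABIs `int_pair_matches_array()` is expected to return 1, because two members of the same type rarely need padding between them, while `mixed_padding_bytes()` is expected to be nonzero. Neither result is guaranteed by the standard.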


On another note, tuples might shine with your generic modules.

I find the fault types to make control flow quite hard to follow, I'm more a fan of Option<T> / Result<T, E> sum types, as we do in Erlang/Elixir/Rust/most ML languages.
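
For readers less familiar with the Option<T> / Result<T, E> style, here is a minimal C sketch of a result type encoded as a tag plus a union. Everything in it (`IntResult`, `checked_div`, `result_or`, `demo_checked_div`) is a hypothetical illustration, not C3's fault mechanism or a design proposed in this thread.

```c
#include <assert.h>

/* Hypothetical Result<int, const char *>: a tag plus a union payload. */
typedef enum { RESULT_OK, RESULT_ERR } ResultTag;

typedef struct {
    ResultTag tag;
    union {
        int value;          /* valid when tag == RESULT_OK */
        const char *error;  /* valid when tag == RESULT_ERR */
    } as;
} IntResult;

/* Divides num by den, reporting division by zero as an error value. */
IntResult checked_div(int num, int den) {
    IntResult r;
    if (den == 0) {
        r.tag = RESULT_ERR;
        r.as.error = "division by zero";
    } else {
        r.tag = RESULT_OK;
        r.as.value = num / den;
    }
    return r;
}

/* Unwraps the value, or returns fallback on error. */
int result_or(IntResult r, int fallback) {
    return r.tag == RESULT_OK ? r.as.value : fallback;
}

/* Exercises both branches; the caller must inspect the tag explicitly. */
int demo_checked_div(void) {
    assert(checked_div(6, 3).tag == RESULT_OK);
    assert(checked_div(1, 0).tag == RESULT_ERR);
    return result_or(checked_div(6, 3), -1);
}
```

The error travels as an ordinary return value, so control flow stays visible at the call site, which is the property the comment above is asking for.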

Tuples are also a form of heterogeneous list, only "immutable" (you cannot append to them). That makes them useful for dealing with CSV data, which doesn't necessarily have a header line for field names.

Those are a few use case examples, but I would totally get if you decide that tuples are not a good fit for C3 🙂

lerno commented 1 year ago

I am open to the possibility that I overlooked something and I was in particular interested in hearing how you imagined userland sum types and tuples were connected.

With regard to error handling, note that C3 essentially uses a built-in Result type (with some special semantics).

lerno commented 1 year ago

It seems that the need for tuples is very limited in C3, given the other ways to emulate them. There are also generic Tuple and Triple types in the standard library now.