ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
35.21k stars 2.57k forks source link

Proposal: 'typedef' for zig #5132

Closed ghost closed 3 months ago

ghost commented 4 years ago

There is a related proposal for distinct types, but it doesn't explore all the opportunities that a more generic form of distinct types could provide. This proposal is an attempt to do just that.

It is also an attempt at finding a new core language abstraction in zig which hopefully is both simple (conceptually) and expressive enough to be worthwhile.

'typedef' in zig could become a common platform for any additional or improved feature that is "comptime only" in nature. This in contrast to implementing each such comptime feature in separation and from scratch.

Being a "comptime only" feature, typedef wouldn't add much (if any) complexity to the codegen stage, and rather require changes to the earlier compilation stages instead.

That there is a similar (though more limited) typedef feature in C will hopefully help make typedef in zig discoverable for newcomers.

Overall, I'd say this proposal is a relatively logical and natural addition to zig, as there are no "new" concepts introduced. It's rather a merger of existing ones:


Use cases:

Can be implemented on top of typedef:

To avoid making this text even longer than it is, I will mostly refer back from comments in other issues to cover the use cases that way, instead of writing in detail about all the use cases here. Other times, a feature might map so trivially to typedef that it is natural to include it in the summary section below.

The focus of this post is instead on the principles and framework that the implementations for those use cases would be based on.


Usage perspective:

Introducing the typedef syntax:

const MyInt = typedef(u32, .Distinct);        // restricted coercion
const MyInt2 = typedef(u32, .Alias) {         // unrestricted coercion
  fn increment(self: MyInt2) MyInt2 {return self+1;}    // const x: MyInt2 = 10; const y = x.increment(); assert(y==11);
}
const TaggedFloat = typedef(f64, .Tag{.txt="percentage"});   // reflection with std.....hasTag()

Then a more elaborate (but very generic) example that is explained in detail below:

// .Tdconf = "Typedef config" (instance of zig built-in tagged union)
const Td = typedef(T, .Tdconf) {
  const FOO = 100;
  fn method(self: T) bool{
     // ...
  }
};

test "typedef" {
  const t : T = createInstanceOf(T);
  const td : Td = @as(Td, t)
}
// T, t, Td, td, and .Tdconf will frequently be referenced below:

Read the above as:

The different typedef configs allow for user specified comptime behavior regarding coercion, unified call syntax, member access, available syntax, restricted syntax, custom validation, etc.

Creating a standardized way to get more control over these compile time aspects is what allows other features to be built on top of typedef.

One typedef config will usually only affect a small subset of comptime behavior at a time. E.g typedef(u32, .Distinct) would be orthogonal to typedef(MyStruct, .Tag{.txt="mytag"}) with regards to which compilation behavior they would parametrize or modify, even though they are both ultimately based on typedef.

Continuing the list:

And finally:


Implementation perspective:

// current syntax
const T1   = u32;
const T2   = SomeStruct;
const T3   = []const u8;
const A_T1 = T1;

// typedef syntax
const TD_T1 = typedef(T1, .Distinct) {};    // keyword takes type, config value, and optional scope
const TD_T2 = typedef(T2, .Distinct);       // typedef can wrap/copy all zig types.
const TD_T3 = typedef(T3, .Distinct) {};    // .. primitives or not
const TD_TD_T1 = typedef(TD_D1, .Distinct); // typedef may be nested
const TD_R = typedef(u32, .ResourceHandle); // config can be something else than .Distinct

Representation of typedef:

Let a typedef "type" be represented by the shorthand TD(T,N,B,S,C).

Where TD(T,0,0,0,0) is "reserved" as being equivalent to the plain zig type T

T is always a plain zig type, whether the typedef declaration "wrapped" a normal zig type or another typedef.

N is a unique (nonzero) ID that is assigned per declaration of a typedef. N may be zero only when the shorthand represents a plain zig type.

B is included to enable nesting of typedefs, with B equal to the id (N) of the base typedef, or with B = 0 if the base is a plain zig type.

S is an ID or pointer corresponding to the attached namespace of a typedef declaration. S == 0 (null) if there was no namespace attached.

C stores an ID or pointer corresponding to a typedef config (created per declaration, but there exists only one instance per unique config) where C == 0 (null) only when the shorthand represents a plain zig type, not a typedef.

Example:

// NOTE: refer to the context above
// ID field N set to zero for plain zig types/type aliases
TdTypeOf(T1)   -> TD(u32,0,0,0,0)
TdTypeOf(T2)   -> TD(SomeStruct,0,0,0,0)
TdTypeOf(A_T1) -> TD(SomeStruct,0,0,0,0)

// c1, c2 != 0, and c1 != c2
TdTypeOf(TD_T2)   -> TD(SomeStruct,n,0,s,c1) // n != 0, and s != 0 (has attached scope)
TdTypeOf(TD_D_T1) -> TD(u32,m,n,0,c1)        // m != 0 and m != n (typedef nesting)
TdTypeOf(TD_R)    -> TD(u32,l,0,0,c2)        // l != 0

Zig compiler and typedef:

I don't know as much about the zig compiler as others here, so I have to go by assumptions and hope they are valid or at least not completely off.

My assumptions are that:


Another look at typedef declarations
const Td = typedef(T, .Tdconf) {
  const FOO = 100;
};

const TdTd = typedef(Td, .Tdconf);

test ".." {
    const t : T = T{};

    const myTd1 : Td = t; // will this work?
    const myTd2 = @as(Td,t);

    const myTdTd = @as(TdTd, myTd1);

    const expr = 100 + Td.FOO;
}

The following are the results from the typedef declarations and casts above, assuming T is a base type:

Casting and coercing between typedefs essentially involve changing the typedef id associated with a ZigValue according to rules that are parametrized by the typedef configs.

Handling of assignments, typedef perspective only. t is a ZigValue of type T, with typedef id equal to 0:


Mapping typedef to use cases:

The capabilities of typedefs come in form of

As specified before, all typedef configs are members of a built-in union. Below is a subset of possible typedef configs that would solve some existing github issues, while spanning enough of the "typedef framework" to get decent initial test coverage of it.

Only if these four typedef configs are possible to include in an architecturally sound manner should additional features (more typedef configs) be considered.

Core typedef configs:

Additional typedef configs: Extensive list of possible or theoretically viable typedef configs, where not all of them might be worth it or possible to include.


1: Attached namespace and "syntax views" of a type

Currently, primitive types cannot have methods, even though the only difference between methods and functions are the syntax used for calling them. This becomes an artificial restriction on primitive types.

Also, you cannot add methods to 3rd party user defined types without modifying the 3rd party source code.

With typedef, given a type T you want to define a method on, where T happens to be either a primitive or a 3rd party struct type: Simply cast or coerce T to a typedef Td wrapping T, where the attached scope of Td contains the method.

Typedef provides a form of "method extensions" (#1170), but without introducing any confusion to where a method is defined. It's defined either in the base type, or in the typewrap. Case closed. If methods from the base types are to be included in the typedef, they have to be included by explicit delegation.

It is logical for the "syntax view" of a type to mirror the type (in terms of structure) as closely as possible, and especially for a systems language like zig, but there are times when a different syntax view could be significantly more ergonomic or safe.

Typedef could be a way to offer alternative "syntax views" in zig, either by disabling syntax or enabling syntax sugar depending on the application.


2: Coercion control

Looking at typedef coercion through the lens of the shorthand TD(T,N,B,S,C), with n,m,l != 0, none of n,m,l being equal , and x reads as "not considered".

Lef => Right            // read, "Left" coerces to "Right"

TD(T,l,m,x,x) => TD(T,m,n,x,x)  // coercion down, one level
TD(T,m,n,x,x) => TD(T,n,0,x,x)  // coercion down, one level
TD(T,n,0,x,x) => TD(T,0,0,0,0)  // coercion one level down, to the base type (final level)
TD(T,l,m,x,x) => TD(T,0,0,0,0)  // coercion down, multiple levels

TD(T,n,0,x,x) => TD(T,m,n,x,x)  // coercion up, one level
TD(T,0,0,0,0) => TD(T,m,n,x,x)  // coercion up, multiple levels

As can be seen above, the hierarchical relationship between typedefs can be determined by looking at the entries N and B of the shorthand, and then typedef coercions can be categorized in various ways:

From that, you can have various distinct type behavior based on typedef. For instances tdd of type typedef(T,.Distinct): Whenever a coercion to or from tdd is requested, accept or reject (compile error) depending on the "category" of the given coercion.

A coercion rule set that maps almost exactly to this comment in the distinct types issue #1595, is:

Different rule sets can of course be applied to different use cases.


2: Reflection on typedef configs

Just like normal types support comptime reflection through @TypeInfo, so could typedef support comptime reflection on the config payload.

This would allow comptime tags to be implemented:

const TaggedType = typedef(T, .Tag{.txt="mytag"})
test "" {
  const tt = @as(TaggedType,getT());
  comptime assert(std.meta.hasTag(tt,"mytag")) // ok
  comptime assert(std.meta.hasTag(tt,"notthetag")) // fails
}

To add tags to a typedef as opposed to normal type, one would have to resort to typedef nesting.


4: Additional zig built-in types leveraging typedef

There are many builtin primitive types in zig that get special treatment with respect to syntax:

Some primitive types (primitive in the syntax support sense) are arguably even generic, like [N]SomeType, or ?SomeType.

The idea to be presented here is that a typedef that wraps a primitive base type will be very predictable for the compiler to deal with. The base type is a primitive, and the config payload is standardized because it is a builtin union member.

Definition of a typedef primitive Tp:

Mathematical types can often be modeled as numeric arrays, thus defining additional mathematical "primitive" types in zig could in many cases be done by wrapping a numeric array in a typedef and attach metadata (the typedef config).

Example, complex numbers:

const Complex = std.builtin.Complex(f64); // typedefprimitive([2]f64, .Complex) under the hood, with attached namespace

test "Complex number" {
  const c1 = Complex.init(1.0, 3.2);
  const c2 = Complex.init(-43, 12.5);
  const c3 = c1.pow(c2) + c1 * c2.conj() - (c1*c1+c1)/Complex.init(12.0, 15);
  const r3 = c3.re(); // or c3.re for that matter (syntax view)
  const i3 = c3.im();
}

Having arithmetic operators working for a complex number "typedef" is IMO perfectly feasible, as per the reasoning given in this issue comment by @skyfex. The functions or procedures necessary for complex number arithmetic does not need any control flow or to return any errors, could be defined as built-in functions, and cannot really be compared to operator overloading where everyone is able to customize how arithmetic operations behave in a multitude of ways.

The compiler would have enough information available to handle the cases where one or both of the operands in an arithmetic expression is a complex (typedef) primitive, and insert instructions corresponding to the appropriate builtin function.

This principle extends to fixed point types and matrices also, even despite the fact that arithmetic operation on these (mathematical) types may return a result with a different type than the type of either or both of the operands. Matrix4x4 * Matrix4x3 yields Matrix4x2. Still, this is unproblematic. The matrix dimensions are attached to the operands in form of the typedef config payload. Because of that, the compiler can ensure that the operands are compatible and determine the correct output "typedefprimitive".

In the codegen stage, these mathematical types would just be arrays.


5: Comptime validation of the specified base

Typedef is a comptime feature, so whenever a typedef is declared some comptime code can be triggered to run. This could include various validation functions that could be defined as builtins, basing it all on the existing reflection capabilities in zig.

Examples could be checking that a struct is a superset of another (embeds), or that the fields in a struct follow the convention set by a typedef config. Example:

const WidgetAnchorFlags = typedef(u4, .BitFlag){
  const SPARE : u4 = 0;
  const LEFT : u4 = 1;
  const RIGHT : u4 = 2;
  const TOP : u4 = 4;
  const BOTTOM : u4 = 8;
};

When the typedef config is .Bitflag, the compiler may enforce the existence of top level declarations with type equal to the base type, that each must have a unique value, that together must fully "cover" all the bits of the base type (which obviously has to be a uint), and where only one of the declarations can be something else than a power of two (the spare).

Additionally, as a bitflag could fit the definition of a typedef primitive established earlier, syntax sugar could be enabled, so that for example widgetAnchors.LEFT would map to widgetAnchors & WidgetAnchorFlags != 0. With primitive types (whether typedef based or not), syntax can be adjusted to improve usability.


Discussion and questions for the zig community:

jakwings commented 4 years ago

Question: How will peer type resolution work on these types?

ityonemo commented 4 years ago

There have been several languages where I have wanted this feature (nonexplixit conversion-blocked typealiases), including: C, Julia, elixir/erlang.

ghost commented 4 years ago

Question: How will peer type resolution work on these types?

I wish my understanding of the zig compiler was better, but is this example relevant to your question: const x = if(cond) @as(Typedef1,value) else @as(Typedef2,value); ?

If both Typedef1 and Typedef2 allow coercion down to the base type (depending on the coercion rule set associated with the typedef config used in their declaration), the result type is the typedef base shared by Typedef1 and Typedef2. const x : Base = value;

If Typedef1 and Typedef2 both wrap a Typedef3, they might coerce down to Typedef3. const x = @as(Typedef3,value);

Or you'd get a compile error. Typedef, just like a normal type, must always be comptime known.


Note: Feel free to add zig typedef pseudocode in the issue comments. That'll make it easier for me to answer.


There have been several languages where I have wanted this feature (nonexplixit conversion-blocked typealiases), including: C, Julia, elixir/erlang.

@ityonemo Are you able to present some short examples/use cases?

ghost commented 4 years ago

Just want to specify something:

const Td = typedef(u32, .Alias); // does NOT create a new ZigType for IR.

test ".." {
  const x : u32 = 5;
  const y = @as(Td,x); // y has ZigType u32, but the ZigValue has a typedef id != 0
}

Typedef from the perspective of Zig IR is just an id field attached to ZigValue that is used to separate between between otherwise equal ZigTypes during compile time, and where the id points to metadata in a table (one row for each typedef declaration) that can influence compiler decisions.

My assumption (which I hope is sufficiently accurate) is that:

Typedef is just a way to add structured metadata to ZigValues in addition to the existing types that can parametrize the compiler "decision process" in useful ways.

jakwings commented 4 years ago

If Typedef1 and Typedef2 both wrap a Typedef3, they might coerce down to Typedef3.

If the base type is not a primitive number type then it seems no problem. Doesn't it conflict with the current behavior when the base type is a number type? say u8 will coerce to u32, now what happen for alias(u32) + u32, alias(u32) + u8, alias(u8) + u32, alias(u8) + alias(u32), alias(u8) + alias(alias(u8)),alias(u8) + alias(alias(u32)), ...?

Other questions (just curious about the details and complexity):

ghost commented 4 years ago

If Typedef1 and Typedef2 both wrap a Typedef3, they might coerce down to Typedef3.

If the base type is not a primitive number type then it seems no problem. Doesn't it conflict with the current behavior when the base type is a number type? say u8 will coerce to u32, now what happen for alias(u32) + u32, alias(u32) + u8, alias(u8) + u32, alias(u8) + alias(u32), alias(u8) + alias(alias(u8)),alias(u8) + alias(alias(u32)), ...?

The intended semantics are that the "coercion" of typedefs and coercion of regular types are parallel processes as far as possible.

For an operation (assignment, binary op, cast, ..) with input ZigValue(s) that may be or may not be typedef:


Behavior for const x = t1 + t2; where @TypeOf(t1)==T1 and @TypeOf(t2)==T2

Behavior for const x :T3 = t1 + t2:


* Coercion between distinct types looks like an unsafe variant of C++'s `dynamic_cast` (no matter how many levels of coercion is allowed). It seems only useful when you know how the internal work. So I guess this is not designed for public API and users must not do such coercion by themselves?

The idea is that if you have a base type T that is used in a lot of functions, then it's convenient that typedef(T, .Distinct) can also be used in those functions.

Still, if you you make a function fn(x: MyTypedefDistinct) ... then the x parameter to this function cannot be the base of MyTypedefDistinct, so it's "convenient" in one direction, and "safe" in the other. direction.

Different coercion behavior could be defined as well. I've been thinking there could be resource handle typedefs (config = .Resourcehandle) that wrap integers, but do not coerce to the base type, so you couldn't add two const GlInt = typedef(u32, .ResourceHandle) for example.

* It seems most of the behaviors are determined by how the reflection on typedef configs works? What is the best default?

Indeed. The idea is that typedef is a broad (but fixed) framework on the compiler side (typedef configs being a closed set), while for the user typedef gives lots of configuration options for different use cases.

ityonemo commented 4 years ago

Yes, let me give a examples of what I'm talking about. Note that both Julia and Elixir have different relationships with their type systems than most other languages (including Zig)

In Julia, one thing that I implemented was a Galois Field type. Note that Julia's relationship with types is that they are used to square aggressive compile-time optimization with user abstraction and common mathematical libraries (I don't necessarily think this is something that Zig should shoot for). As a specific example, to do reed-solomon decoding in Julia, I could use my minimally implemented GF type and leverage the Julia standard library's matrix solver without having to rewrite LRU decomposition for GF arithmetic. At the same time, someone else easily subbed in patches to julia's matrix solver so that it would run normal FP incredibly efficiently on the awkward Intel Xeon Phi architecture, and, in theory you could have run that code side by side with my code and Julia would select the GF solver or the Xeon Phi solver based on the type of the data provided to the function call.

This Galois Field type was a direct GF(2^n) implementation backed by an unsigned integer of the appropriate size (usually 8). It would have been nice to be able to specify that "this was indeed an integer" by subtyping, allowing it to transparently inherit certain operations (like << or >>, bitwise &, |, etc) while overloading specific operations (+, -, *). I could get around this by explicitly doing conversions back and forth but I was never sure what was compiled down to a no-op or not (without checking the assembly code, which luckily, is very easy to do in julia).

BTW, I suspect zig as being potentially extremely useful for crypto purposes because by my naive understanding it's easy to accidentally write code in C that has non-constant execution, whereas zig doesn't hide function calls and perform as many sneaky tricks with code rewriting, so GF arithmetic might be a useful library down the line. Here's the sort of things that I would want to see for such a feature (let's call our hypothetical type g8):

<some code of typedeffing g8 to u8>
const my_g8a:g8 = @as(g8, 0xFF);   #okay to not have direct coercion
const my_g8b:g8 = 0xFF;                   #comptime coercion could be a nice-to-have
var my_u8v:u8 = 0xFF;
var my_g8v:g8 = my_u8v; # <--  forbidden!  should be an error.
var my_g8x:g8 = my_g8a ^ my_g8b;  # allowed, by inheritance from u8
var my_g8p:g8 = my_g8a + my_g8b; # allowed, even if stupid.
# or it could specifically opt in to certain operations via some sort of mechanism, as discussed above
var nope:g8 = my_g8a ^ my_u8v  # always disallowed no matter what.
var yes:g8 = my_g8a ^ @as(g8, my_u8v)

In elixir, the language is strongly but dynamically typed. There is, however, static typechecking. One can make a typealias, but effectively these are solely used for documentation.

so for example, in the stdlib, the type String.t (https://github.com/elixir-lang/elixir/blob/v1.10.2/lib/elixir/lib/string.ex#L277) is strictly equivalent to the primitive type binary. You're supposed to use String.t to indicate that everything in the string is 1) printable and 2) unicode-encoded, so it's a strict subset of what constitutes binary (contiguous range of memory of size 8 bits * n). Unlike Zig, I don't expect much out of my erlang typesystem in terms of compile-time guarantees, but it would at least be nice for the static typechecking to warn me if I try to pass something annotated as binary into something annotated as String.t. But the erlang typesystem doesn't have this typedef notion, which would at least give me some sort of static check on the programming intent.

I don't know how that usecase translates into Zig, but it does seem vaguely similar to the proposal of being able to do units of measure.

ghost commented 4 years ago

@ityonemo

Note that Julia's relationship with types is that they are used to square aggressive compile-time optimization with user abstraction and common mathematical libraries (I don't necessarily think this is something that Zig should shoot for).

Zig typedef wouldn't measure up to Julia in this regard as zig won't have user configurable operator overloading, but it seems like zig typedef could get you parts of the way there.

With typedef, if Library-A and Library-B both use [2]f64 to represent a complex number, then it's a bit easier to use both libraries interchangeably. It's slightly easier to cast between typedefs that are runtime equivalent than converting between user types, although all it takes is writing a function.

It would have been nice to be able to specify that "this was indeed an integer" by subtyping, allowing it to transparently inherit certain operations (like << or >>, bitwise &, |, etc) while overloading specific operations (+, -, *). I could get around this by explicitly doing conversions back and forth but I was never sure what was compiled down to a no-op or not (without checking the assembly code, which luckily, is very easy to do in julia).

To stick with the premise of not having user configurable operators in zig, I went with these principles for the typedef proposal:

For your Galois Field type, below is an example using the .Distinct typedef config. It provides some of the behavior you want, but you would need to wrap operator expressions in casts. Functions and methods could have the correct output type directly, and casts to/from a typedef would by definition be a no-op at runtime.

const g8_distinct = typedef(u8,.Distinct);
const my_g8a : g8_distinct = @as(g8_distinct, 0xFF); // OK
const my_g8b : g8_distinct = 0xFF; // Fails. cannot coerce upwards to distinct
var my_u8v : u8 = 0xFF;
var my_g8v : g8_distinct = my_u8v; // Fails. cannot coerce "upwards" to distinct
var my_g8x : g8_distinct = my_g8a ^ my_g8b; // fails. expression coerces down to u8. wrap in cast
var my_g8p : g8_distinct = my_g8a + my_g8b; // fails. expression coerces down to u8. wrap in cast
var my_g8o : g8_distinct = @as(g8_distinct, my_g8a + my_g8b); // Ok

var nope : g8_distinct = my_g8a ^ my_u8v // fails. expression coerces down to u8
var nope2 : u8 = my_g8a ^ my_u8v; // ok. expression coerces down to u8
var yes : g8_distinct = my_g8a ^ @as(g8, my_u8v) // fails. expression coerces down to u8

.MeasureGroup and .Measure might work, but with downsides that you'd need to specify the type of all the identifiers you introduce, and that another "measure" could be added to the same group at will. Could imagine something like this though:

const g8_new = typedef(typedef(u8, .MeasureGroup), .ClosedMeasure); // this
const g8_new = typedef(u8, .DistinctConfigurable{.mode=.Strict);   // or this

"Closed measures" or "strict distincts" could have the following semantics:

so for example, in the stdlib, the type String.t (https://github.com/elixir-lang/elixir/blob/v1.10.2/lib/elixir/lib/string.ex#L277) is strictly equivalent to the primitive type binary. You're supposed to use String.t to indicate that everything in the string is 1) printable and 2) unicode-encoded, so it's a strict subset of what constitutes binary (contiguous range of memory of size 8 bits * n). Unlike Zig, I don't expect much out of my erlang typesystem in terms of compile-time guarantees, but it would at least be nice for the static typechecking to warn me if I try to pass something annotated as binary into something annotated as String.t. But the erlang typesystem doesn't have this typedef notion, which would at least give me some sort of static check on the programming intent.

I actually think that typedef would work nicely here as well.


const Utf8Str = typedef( []const u8, .DistinctConfigurable{.mode=SelfInit}){ // attached namespace
    fn init(str: []const u8) Utf8Str {
        // do validation with std.debug.assert that all codepoints are valid
        return @as(Utf8Str,str); // cast is allowed in "self" scope
    }
}

test "typedef safe init" {
    const str = "hello";
    const str2: Utf8Str = str; // (typedef) compile error. coercion not accepted
    const str3: Utf8Str = @as(Utf8Str,str); // typedef compile error. cast to distinct type outside self scope is not supported
    const str4 : Utf8Str = Utf8Str.init(str); // accepted. function in typedef scope does validation. (assertion, but could also possibly return an error

    const rawstr : []const u8 = str4; // coerce one step downwards is ok, as for other distinct typedefs
    const rawstr2 = @as([]const u8, str4); // also ok. ALWAYS allow casts from typedef to its basetype.
}
``
hamad-almamari commented 3 years ago

So much complicated with my respect. We dont want zig to turn into c++.

matu3ba commented 2 years ago
  1. Do you have a realistic examples, which are simplifies by this proposal?
  2. Why does this need another keyword instead of a comptime-implementation in libstd?
  3. The functions of libraries A and B still must agree on the same conforming interface and data layouts to be interchangeable and adding comptime-boilerplate on top doesnt change that.
  4. Recursive replacement of method implementation (ie deriving method interfaces as specialization) is essentially what traits in Rust do. This requires to track all possible derivations in memory, even though they are unused, and makes things slow. How does this approach try to compensate that?
  5. You claim "Overall, I'd say this proposal is a relatively logical and natural addition to zig, as there are no "new" concepts introduced. It's rather a merger of existing ones:". Please provide a library implementation first to justify adding compiler complexity. Alternatively, provide an argumentation why this can not be in libstd or which parts could be part of libstd.
KilianHanich commented 3 months ago
  1. Do you have a realistic examples, which are simplifies by this proposal?

While I can't comment on a lot of other stuff, here an example of what would be simplified by this proposal for users, although it doesn't need a lot of it, just the "distinct types" part:

fn createUser(uid: std.posix.uid_t, gid: std.posix.gid_t) !void;

std.posix.uid_t and std.posix.gid_t are under the hood just alternative names for u16. So as far as a computer (and the compiler) is concerned, swapping the parameters around while calling the function is not a problem, but logically that doesn't make any sense.

If you could make these types distinct at comptime, this could be a compile error, preventing quite a few bugs from happening.

mlugg commented 3 months ago

The convention is to use non-exhaustive enums for such use cases, e.g. const uid_t = enum(u16) { _ } -- see also https://github.com/ziglang/zig/issues/1595#issuecomment-2251554904. I think this proposal should probably be closed, being essentially a more complex version of #1595.

tmccombs commented 3 months ago

The convention is to use non-exhaustive enums for such use cases

What if the underlying type is itself a complex struct that can't be used as an enum type?

KilianHanich commented 3 months ago

The convention is to use non-exhaustive enums for such use cases

What if the underlying type is itself a complex struct that can't be used as an enum type?

How would you even go and define such a distinct type? Do you have an example where a different language implemented this for essentially arbitrary types because I fail to imagine how one could implement this (no matter if it's wanted or not).

mlugg commented 3 months ago

What if the underlying type is itself a complex struct that can't be used as an enum type?

It's very rare that you need to do this -- distinct types are pretty much always for integer handles. If you do need it somewhere, this use case is served by a wrapper struct (perhaps extern struct or packed struct depending on the context).

tmccombs commented 3 months ago

In my experience, it is fairly common for string and array types, most notably []u8 in zig.

How would you even go and define such a distinct type?

The same way you would for primitive types.

Do you have an example where a different language implemented this for essentially arbitrary type