Closed ghost closed 3 months ago
Question: How will peer type resolution work on these types?
There have been several languages where I have wanted this feature (nonexplixit conversion-blocked typealiases), including: C, Julia, elixir/erlang.
Question: How will peer type resolution work on these types?
I wish my understanding of the zig compiler was better, but is this example relevant to your question: const x = if(cond) @as(Typedef1,value) else @as(Typedef2,value);
?
If both Typedef1
and Typedef2
allow coercion down to the base type (depending on the coercion rule set associated with the typedef config used in their declaration), the result type is the typedef base shared by Typedef1
and Typedef2
.
const x : Base = value;
If Typedef1 and Typedef2 both wrap a Typedef3, they might coerce down to Typedef3.
const x = @as(Typedef3,value);
Or you'd get a compile error. Typedef, just like a normal type, must always be comptime known.
Note: Feel free to add zig typedef pseudocode in the issue comments. That'll make it easier for me to answer.
There have been several languages where I have wanted this feature (nonexplixit conversion-blocked typealiases), including: C, Julia, elixir/erlang.
@ityonemo Are you able to present some short examples/use cases?
Just want to specify something:
const Td = typedef(u32, .Alias); // does NOT create a new ZigType for IR.
test ".." {
const x : u32 = 5;
const y = @as(Td,x); // y has ZigType u32, but the ZigValue has a typedef id != 0
}
Typedef from the perspective of Zig IR is just an id field attached to ZigValue
that is used to separate between between otherwise equal ZigType
s during compile time, and where the id points to metadata in a table (one row for each typedef declaration) that can influence compiler decisions.
My assumption (which I hope is sufficiently accurate) is that:
ZigValue
sTypedef is just a way to add structured metadata to ZigValue
s in addition to the existing types that can parametrize the compiler "decision process" in useful ways.
If Typedef1 and Typedef2 both wrap a Typedef3, they might coerce down to Typedef3.
If the base type is not a primitive number type then it seems no problem. Doesn't it conflict with the current behavior when the base type is a number type? say u8
will coerce to u32
, now what happen for alias(u32) + u32
, alias(u32) + u8
, alias(u8) + u32
, alias(u8) + alias(u32)
, alias(u8) + alias(alias(u8)),
alias(u8) + alias(alias(u32)), ...?
Other questions (just curious about the details and complexity):
Does peer type resolution only happen between aliases? There are so many settings so it is confusing.
Can peer type resolution happen between distinct type with the same base?
Coercion between distinct types looks like an unsafe variant of C++'s dynamic_cast
(no matter how many levels of coercion is allowed). It seems only useful when you know how the internal work. So I guess this is not designed for public API and users must not do such coercion by themselves?
It seems most of the behaviors are determined by how the reflection on typedef configs works? What is the best default?
If Typedef1 and Typedef2 both wrap a Typedef3, they might coerce down to Typedef3.
If the base type is not a primitive number type then it seems no problem. Doesn't it conflict with the current behavior when the base type is a number type? say
u8
will coerce tou32
, now what happen foralias(u32) + u32
,alias(u32) + u8
,alias(u8) + u32
,alias(u8) + alias(u32)
,alias(u8) + alias(alias(u8)),
alias(u8) + alias(alias(u32)), ...?
The intended semantics are that the "coercion" of typedefs and coercion of regular types are parallel processes as far as possible.
For an operation (assignment, binary op, cast, ..) with input ZigValue
(s) that may be or may not be typedef:
ZigValue
has yet to be determined.ZigValue
has yet to be determined.Behavior for const x = t1 + t2;
where @TypeOf(t1)==T1 and @TypeOf(t2)==T2
CASE: T1 == typedef(u32,.Distinct) , T2 == typedef(u8,.Distinct)
:
x
becomes u32
as per existing peer type resolution, but the typedef ID of the output ZigValue
has yet to be determined.T1
and T2
are base types or typedef primitives. Distinct typedefs do not mark their base type as a typedef primitive, like typedef([2]f64, .Complex)
would, so this means the add operation is unsupported for .Distinct
typedefs directly, but .Distinct
allow downward casts in a single step, and to avoid having to raise an error, auto typedef coercion is attempted first.T1
and T2
, which are (pure) primitive types and implicitly supported since the instruction was already accepted in the step prior. (Note, it doesn't matter that base of T1
and T2
are different to from the typedef perspective) x
has the plain output type u32
, similar to how 42.4 + 2
coerces "down" to a float.CASE: T1 == typedef(u32,.Distinct) , T2 == typedef(typedef(u8,.Distinct), .Distinct)
:
T1
and T2
, the operands would still not have types supported for the given operation. This is intentional, but it could also be possible to make a variant of typedef distincts that will attempt to coerce downwards multiple times, or a distinct "primitive" typedef (that has to wrap float or int) that are supported by operators. This will hopefully be explored more in the "Units of Measure" issue, with the dedicated Measure
and MeasureGroup
configs.CASE: T1 == typedef(u32,.Distinct) , T2 == u8
:
T2 = typedef(u8, .Distinct)
. T1
will have to coerce down to the base type before the operation is valid, and this determines the output as well.Behavior for const x :T3 = t1 + t2
:
const x_tmp = t1 + t2; const x : T3 = x;
For (pure) base types the latter statement works if T3 == @TypeOf(x_tmp)
. For typedefs, the latter works if the base of T3
is equal to @TypeOf(x_tmp)
, and if the coercion rules associated with T3
allow
coercion from other typedefs/types. * Coercion between distinct types looks like an unsafe variant of C++'s `dynamic_cast` (no matter how many levels of coercion is allowed). It seems only useful when you know how the internal work. So I guess this is not designed for public API and users must not do such coercion by themselves?
The idea is that if you have a base type T
that is used in a lot of functions, then it's convenient that typedef(T, .Distinct)
can also be used in those functions.
Still, if you you make a function fn(x: MyTypedefDistinct) ...
then the x
parameter to this function cannot be the base of MyTypedefDistinct
, so it's "convenient" in one direction, and "safe" in the other. direction.
Different coercion behavior could be defined as well. I've been thinking there could be resource handle typedefs (config = .Resourcehandle
) that wrap integers, but do not coerce to the base type, so you couldn't add two const GlInt = typedef(u32, .ResourceHandle)
for example.
* It seems most of the behaviors are determined by how the reflection on typedef configs works? What is the best default?
Indeed. The idea is that typedef is a broad (but fixed) framework on the compiler side (typedef configs being a closed set), while for the user typedef gives lots of configuration options for different use cases.
Yes, let me give a examples of what I'm talking about. Note that both Julia and Elixir have different relationships with their type systems than most other languages (including Zig)
In Julia, one thing that I implemented was a Galois Field type. Note that Julia's relationship with types is that they are used to square aggressive compile-time optimization with user abstraction and common mathematical libraries (I don't necessarily think this is something that Zig should shoot for). As a specific example, to do reed-solomon decoding in Julia, I could use my minimally implemented GF type and leverage the Julia standard library's matrix solver without having to rewrite LRU decomposition for GF arithmetic. At the same time, someone else easily subbed in patches to julia's matrix solver so that it would run normal FP incredibly efficiently on the awkward Intel Xeon Phi architecture, and, in theory you could have run that code side by side with my code and Julia would select the GF solver or the Xeon Phi solver based on the type of the data provided to the function call.
This Galois Field type was a direct GF(2^n) implementation backed by an unsigned integer of the appropriate size (usually 8). It would have been nice to be able to specify that "this was indeed an integer" by subtyping, allowing it to transparently inherit certain operations (like << or >>, bitwise &, |, etc) while overloading specific operations (+, -, *). I could get around this by explicitly doing conversions back and forth but I was never sure what was compiled down to a no-op or not (without checking the assembly code, which luckily, is very easy to do in julia).
BTW, I suspect zig as being potentially extremely useful for crypto purposes because by my naive understanding it's easy to accidentally write code in C that has non-constant execution, whereas zig doesn't hide function calls and perform as many sneaky tricks with code rewriting, so GF arithmetic might be a useful library down the line. Here's the sort of things that I would want to see for such a feature (let's call our hypothetical type g8):
<some code of typedeffing g8 to u8>
const my_g8a:g8 = @as(g8, 0xFF); #okay to not have direct coercion
const my_g8b:g8 = 0xFF; #comptime coercion could be a nice-to-have
var my_u8v:u8 = 0xFF;
var my_g8v:g8 = my_u8v; # <-- forbidden! should be an error.
var my_g8x:g8 = my_g8a ^ my_g8b; # allowed, by inheritance from u8
var my_g8p:g8 = my_g8a + my_g8b; # allowed, even if stupid.
# or it could specifically opt in to certain operations via some sort of mechanism, as discussed above
var nope:g8 = my_g8a ^ my_u8v # always disallowed no matter what.
var yes:g8 = my_g8a ^ @as(g8, my_u8v)
In elixir, the language is strongly but dynamically typed. There is, however, static typechecking. One can make a typealias, but effectively these are solely used for documentation.
so for example, in the stdlib, the type String.t
(https://github.com/elixir-lang/elixir/blob/v1.10.2/lib/elixir/lib/string.ex#L277) is strictly equivalent to the primitive type binary
. You're supposed to use String.t
to indicate that everything in the string is 1) printable and 2) unicode-encoded, so it's a strict subset of what constitutes binary
(contiguous range of memory of size 8 bits * n). Unlike Zig, I don't expect much out of my erlang typesystem in terms of compile-time guarantees, but it would at least be nice for the static typechecking to warn me if I try to pass something annotated as binary
into something annotated as String.t
. But the erlang typesystem doesn't have this typedef notion, which would at least give me some sort of static check on the programming intent.
I don't know how that usecase translates into Zig, but it does seem vaguely similar to the proposal of being able to do units of measure.
@ityonemo
Note that Julia's relationship with types is that they are used to square aggressive compile-time optimization with user abstraction and common mathematical libraries (I don't necessarily think this is something that Zig should shoot for).
Zig typedef wouldn't measure up to Julia in this regard as zig won't have user configurable operator overloading, but it seems like zig typedef could get you parts of the way there.
With typedef, if Library-A and Library-B both use [2]f64
to represent a complex number, then it's a bit easier to use both libraries interchangeably. It's slightly easier to cast between typedefs that are runtime equivalent than converting between user types, although all it takes is writing a function.
It would have been nice to be able to specify that "this was indeed an integer" by subtyping, allowing it to transparently inherit certain operations (like << or >>, bitwise &, |, etc) while overloading specific operations (+, -, *). I could get around this by explicitly doing conversions back and forth but I was never sure what was compiled down to a no-op or not (without checking the assembly code, which luckily, is very easy to do in julia).
To stick with the premise of not having user configurable operators in zig, I went with these principles for the typedef proposal:
For your Galois Field type, below is an example using the .Distinct
typedef config. It provides some of the behavior you want, but you would need to wrap operator expressions in casts. Functions and methods could have the correct output type directly, and casts to/from a typedef would by definition be a no-op at runtime.
const g8_distinct = typedef(u8,.Distinct);
const my_g8a : g8_distinct = @as(g8_distinct, 0xFF); // OK
const my_g8b : g8_distinct = 0xFF; // Fails. cannot coerce upwards to distinct
var my_u8v : u8 = 0xFF;
var my_g8v : g8_distinct = my_u8v; // Fails. cannot coerce "upwards" to distinct
var my_g8x : g8_distinct = my_g8a ^ my_g8b; // fails. expression coerces down to u8. wrap in cast
var my_g8p : g8_distinct = my_g8a + my_g8b; // fails. expression coerces down to u8. wrap in cast
var my_g8o : g8_distinct = @as(g8_distinct, my_g8a + my_g8b); // Ok
var nope : g8_distinct = my_g8a ^ my_u8v // fails. expression coerces down to u8
var nope2 : u8 = my_g8a ^ my_u8v; // ok. expression coerces down to u8
var yes : g8_distinct = my_g8a ^ @as(g8, my_u8v) // fails. expression coerces down to u8
.MeasureGroup
and .Measure
might work, but with downsides that you'd need to specify the type of all the identifiers you introduce, and that another "measure" could be added to the same group at will. Could imagine something like this though:
const g8_new = typedef(typedef(u8, .MeasureGroup), .ClosedMeasure); // this
const g8_new = typedef(u8, .DistinctConfigurable{.mode=.Strict); // or this
"Closed measures" or "strict distincts" could have the following semantics:
g8_new
, then the other operand must also be g8_new
(in terms of typedef ID)g8_new
, the result will also be g8_new
.so for example, in the stdlib, the type String.t (https://github.com/elixir-lang/elixir/blob/v1.10.2/lib/elixir/lib/string.ex#L277) is strictly equivalent to the primitive type binary. You're supposed to use String.t to indicate that everything in the string is 1) printable and 2) unicode-encoded, so it's a strict subset of what constitutes binary (contiguous range of memory of size 8 bits * n). Unlike Zig, I don't expect much out of my erlang typesystem in terms of compile-time guarantees, but it would at least be nice for the static typechecking to warn me if I try to pass something annotated as binary into something annotated as String.t. But the erlang typesystem doesn't have this typedef notion, which would at least give me some sort of static check on the programming intent.
I actually think that typedef would work nicely here as well.
const Utf8Str = typedef( []const u8, .DistinctConfigurable{.mode=SelfInit}){ // attached namespace
fn init(str: []const u8) Utf8Str {
// do validation with std.debug.assert that all codepoints are valid
return @as(Utf8Str,str); // cast is allowed in "self" scope
}
}
test "typedef safe init" {
const str = "hello";
const str2: Utf8Str = str; // (typedef) compile error. coercion not accepted
const str3: Utf8Str = @as(Utf8Str,str); // typedef compile error. cast to distinct type outside self scope is not supported
const str4 : Utf8Str = Utf8Str.init(str); // accepted. function in typedef scope does validation. (assertion, but could also possibly return an error
const rawstr : []const u8 = str4; // coerce one step downwards is ok, as for other distinct typedefs
const rawstr2 = @as([]const u8, str4); // also ok. ALWAYS allow casts from typedef to its basetype.
}
``
So much complicated with my respect. We dont want zig to turn into c++.
- Do you have a realistic examples, which are simplifies by this proposal?
While I can't comment on a lot of other stuff, here an example of what would be simplified by this proposal for users, although it doesn't need a lot of it, just the "distinct types" part:
fn createUser(uid: std.posix.uid_t, gid: std.posix.gid_t) !void;
std.posix.uid_t
and std.posix.gid_t
are under the hood just alternative names for u16
. So as far as a computer (and the compiler) is concerned, swapping the parameters around while calling the function is not a problem, but logically that doesn't make any sense.
If you could make these types distinct at comptime, this could be a compile error, preventing quite a few bugs from happening.
The convention is to use non-exhaustive enums for such use cases, e.g. const uid_t = enum(u16) { _ }
-- see also https://github.com/ziglang/zig/issues/1595#issuecomment-2251554904. I think this proposal should probably be closed, being essentially a more complex version of #1595.
The convention is to use non-exhaustive enums for such use cases
What if the underlying type is itself a complex struct that can't be used as an enum type?
The convention is to use non-exhaustive enums for such use cases
What if the underlying type is itself a complex struct that can't be used as an enum type?
How would you even go and define such a distinct type? Do you have an example where a different language implemented this for essentially arbitrary types because I fail to imagine how one could implement this (no matter if it's wanted or not).
What if the underlying type is itself a complex struct that can't be used as an enum type?
It's very rare that you need to do this -- distinct types are pretty much always for integer handles. If you do need it somewhere, this use case is served by a wrapper struct
(perhaps extern struct
or packed struct
depending on the context).
In my experience, it is fairly common for string and array types, most notably []u8
in zig.
How would you even go and define such a distinct type?
The same way you would for primitive types.
Do you have an example where a different language implemented this for essentially arbitrary type
There is a related proposal for distinct types, but it doesn't explore all the opportunities that a more generic form of distinct types could provide. This proposal is an attempt to do just that.
It is also an attempt at finding a new core language abstraction in zig which hopefully is both simple (conceptually) and expressive enough to be worthwhile.
'typedef' in zig could become a common platform for any additional or improved feature that is "comptime only" in nature. This in contrast to implementing each such comptime feature in separation and from scratch.
Being a "comptime only" feature, typedef wouldn't add much (if any) complexity to the codegen stage, and rather require changes to the earlier compilation stages instead.
That there is a similar (though more limited) typedef feature in C will hopefully help make typedef in zig discoverable for newcomers.
Overall, I'd say this proposal is a relatively logical and natural addition to zig, as there are no "new" concepts introduced. It's rather a merger of existing ones:
Use cases:
Can be implemented on top of typedef:
To avoid making this text even longer than it is, I will mostly refer back from comments in other issues to cover the use cases that way, instead of writing in detail about all the use cases here. Other times, a feature might map so trivially to typedef that it is natural to include it in the summary section below.
The focus of this post is instead on the principles and framework that the implementations for those use cases would be based on.
Usage perspective:
Introducing the typedef syntax:
Then a more elaborate (but very generic) example that is explained in detail below:
Read the above as:
Td
is a (runtime) type alias of the typeT
. They have the exact same runtime representation.@as(,)
fromTd
toT
is always possible, while casting fromT
toTd
may be restricted if desirable. The default is that casting is always possible in both directions.T
is the base (type) ofTd
Td
is comptime distinct fromT
. They can be separated from each other at compile time.T
can be any type. Primitive, struct, enum, array... doesn't matter.Td
has an (optionally) attached scope for top level declarations (no fields!)Td
was omitted:Td
andT
are the sametd.method()
points to function inT
Td.member
does not point toT.member
. The namespaces ofTd
andT
are separatetd.method()
points to a function inTd
T
still work, e.gT.origMethod(td,param)
td.field
,td.len
,td[2]
are still accessible, (as ift
was substituted fortd
), no matter whetherTd
had an attached namespace or not.td
is used in an expression, the compiler will try to auto coercetd
to the type the expression expects.T
happens to bef64
, thentd + td
might coerce tof64 + f64
and the expression returnsf64
.Td
is configured with an instance of a zig built-in tagged union, here.Tdconf
.void
is sufficient, and you can write.Tdconf
(instead of.Tdconf2{.param = value}
)The different typedef configs allow for user specified comptime behavior regarding coercion, unified call syntax, member access, available syntax, restricted syntax, custom validation, etc.
Creating a standardized way to get more control over these compile time aspects is what allows other features to be built on top of typedef.
One typedef config will usually only affect a small subset of comptime behavior at a time. E.g
typedef(u32, .Distinct)
would be orthogonal totypedef(MyStruct, .Tag{.txt="mytag"})
with regards to which compilation behavior they would parametrize or modify, even though they are both ultimately based on typedef.Continuing the list:
.Tdconf
might affectTd
and/ortd
in the following ways:T
in the typedef declaration. E.g some configs might demandT
to be a numeric type.td
to other typedefs also wrappingT
td
toT
directlyT
(wrapped in a typedef or otherwise) toTd
td
that is enabledtd
in expressionsT
and/orTd
( at the declaration time ofTd
)And finally:
Tdconf
supports comptime reflection, so user code can inspect the data contained in the config instance for a particular typedef type or instance. Td-configs are just union members after all.@typeInfo
exposes the typedef data, or a new builtin could be introduced:@typedefInfo
Implementation perspective:
Representation of typedef:
Let a
typedef
"type" be represented by the shorthandTD(T,N,B,S,C)
.T : type
= base zig typeN : u32
= typedef id numberB : u32
= base typedef id number.S : u32/pointer
= attached scope, pointer/idC : u32/pointer
= builtin tagged union config pointer/idWhere
TD(T,0,0,0,0)
is "reserved" as being equivalent to the plain zig typeT
T
is always a plain zig type, whether the typedef declaration "wrapped" a normal zig type or another typedef.N
is a unique (nonzero) ID that is assigned per declaration of a typedef.N
may be zero only when the shorthand represents a plain zig type.B
is included to enable nesting of typedefs, withB
equal to the id (N
) of the base typedef, or withB = 0
if the base is a plain zig type.S
is an ID or pointer corresponding to the attached namespace of a typedef declaration.S == 0
(null) if there was no namespace attached.C
stores an ID or pointer corresponding to a typedef config (created per declaration, but there exists only one instance per unique config) whereC == 0
(null) only when the shorthand represents a plain zig type, not a typedef.Example:
Zig compiler and typedef:
I don't know as much about the zig compiler as others here, so I have to go by assumptions and hope they are valid or at least not completely off.
My assumptions are that:
TD(T,N,B,S,C)
is a sufficient representation of typedef declarations for a powerful typedef feature in zig:N
andB
are used for separation of typedefs wrapping the same base type, and for controlling coercion mechanics.S
adds flexibility and control with regard to namespaces and member access.C
adds comptime logic parametrization that is standardized in the form of a union, allowing ..TD(T,N,B,S,C)
ZigValue
has a fieldtdId
for the typedef idN
.ZigValue
is a plain zig type (not a typedef),tdId
is equal to zero.tdId
field of theZigValue
.tdId
to zero in the result value.Another look at typedef declarations
The following are the results from the typedef declarations and casts above, assuming
T
is a base type:T
is aZigValue
of type "type" or "meta", with its typedef ID equal to zeroTd
is aZigValue
exactly the same asT
, except that the typedef IDn
is non-zero and uniquely generated for this particular declarationTdTd
is aZigValue
exactly the same asT
, but has a non-zero typedef idm
, withm != n
..Tdconf
once to the collection, and mark it with id or pointertc
. Added once only due to memoization of equivalent payloads forTd
andTdTd
.const FOO = 100
to the collection, and mark it with id or pointers
Td
, TD(T,N,B,S,C) => TD(T
,n
,0,s
,tc
)TdTd
, TD(T,N,B,S,C) => TD(T
,m
,n
,0,tc
)Casting and coercing between typedefs essentially involve changing the typedef id associated with a
ZigValue
according to rules that are parametrized by the typedef configs.Handling of assignments, typedef perspective only.
t
is aZigValue
of typeT
, with typedef id equal to 0:myTd1
:Td
to see if it's non-zero, which is the case here.0 -> n
in terms of the required change in typedef id, depending on the coercion rules defined by the typedef config.tc
in the table using typedef idn
, and can make a decision.myTd1
will be aZigValue
of typeT
, with a typedef id equal ton
.myTd2
Td
to determine the correct action.myTdTd
n -> m
. In this case, it can be taken into account that the base-typedef id ofTdTd
is non-zero and equal ton
, so the hierarchical relation betweenTdTd
andTd
can be determined by comparing these numbers. The decision might involve querying typedef configs of bothTd
andTdTd
.expr
T
is not necessarily a struct, andTd
was declared with an attached scope, overriding any scope the base type might define.Td
is just aZigValue
of typeT
with non-zero typedef ID. The declarationTd.FOO
is not directly associated withTd
. The compiler must replaceTd.FOO
with something akin togetTypedefStructDecl(s,"FOO")
, by looking up the attached scopes
in the "typedef table".Mapping typedef to use cases:
The capabilities of typedefs come in form of
As specified before, all typedef configs are members of a built-in union. Below is a subset of possible typedef configs that would solve some existing github issues, while spanning enough of the "typedef framework" to get decent initial test coverage of it.
Only if these four typedef configs are possible to include in an architecturally sound manner should additional features (more typedef configs) be considered.
Core typedef configs:
.Alias : void
(1)const T2 = T1
, which creates a "dumb alias" whereT1
andT2
automatically coerce to each other without needing any explicit casting. Still, a useful config to have when using typedef solely for the purpose of defining methods..Distinct : void
(2).Tag: []const u8
(3).Complex : void
(4)+
and*
, that under the hood is a typedef wrapping a numeric array of length 2. Configs like these only work together with a specific subset of types in typedef declarations.Additional typedef configs: Extensive list of possible or theoretically viable typedef configs, where not all of them might be worth it or possible to include.
.Matrix : [2]usize (Dimensions)
(4).Complex
..ResourceHandle : void
(2).Measure : void
(2, 4).MeasureGroup : void
(2, 4).ConfigurableDistinct : (DistinctConfig : enum)
(2).Distinct
provides.Encapsulated : (EncapsulationConfig : enum)
(1).FixedPointInt : usize (Amount of fractional bits)
(4).NamedArray : []const u8
(1, 4)const Color = typedef([4]u8, .NamedVector{.fields="r,g,b,a"})
const Pos3d = typedef([3]f64, .NamedVector{.fields="x,y,z"})
.Embedding : type
(4, 5)1: Attached namespace and "syntax views" of a type
Currently, primitive types cannot have methods, even though the only difference between methods and functions are the syntax used for calling them. This becomes an artificial restriction on primitive types.
Also, you cannot add methods to 3rd party user defined types without modifying the 3rd party source code.
With typedef, given a type
T
you want to define a method on, whereT
happens to be either a primitive or a 3rd party struct type: Simply cast or coerceT
to a typedefTd
wrappingT
, where the attached scope ofTd
contains the method.Typedef provides a form of "method extensions" (#1170), but without introducing any confusion to where a method is defined. It's defined either in the base type, or in the typewrap. Case closed. If methods from the base types are to be included in the typedef, they have to be included by explicit delegation.
It is logical for the "syntax view" of a type to mirror the type (in terms of structure) as closely as possible, and especially for a systems language like zig, but there are times when a different syntax view could be significantly more ergonomic or safe.
Typedef could be a way to offer alternative "syntax views" in zig, either by disabling syntax or enabling syntax sugar depending on the application.
2: Coercion control
Looking at typedef coercion through the lens of the shorthand
TD(T,N,B,S,C)
, withn,m,l != 0
, none ofn,m,l
being equal , andx
reads as "not considered".As can be seen above, the hierarchical relationship between typedefs can be determined by looking at the entries
N
andB
of the shorthand, and then typedef coercions can be categorized in various ways:From that, you can have various distinct type behavior based on typedef. For instances
tdd
of typetypedef(T,.Distinct)
: Whenever a coercion to or fromtdd
is requested, accept or reject (compile error) depending on the "category" of the given coercion.A coercion rule set that maps almost exactly to this comment in the distinct types issue #1595, is:
Different rule sets can of course be applied to different use cases.
2: Reflection on typedef configs
Just like normal types support comptime reflection through
@TypeInfo
, so could typedef support comptime reflection on the config payload.This would allow comptime tags to be implemented:
To add tags to a typedef as opposed to normal type, one would have to resort to typedef nesting.
4: Additional zig built-in types leveraging typedef
There are many builtin primitive types in zig that get special treatment with respect to syntax:
.len
for arrays and slicesSome primitive types (primitive in the syntax support sense) are arguably even generic, like
[N]SomeType
, or?SomeType
.The idea to be presented here is that a typedef that wraps a primitive base type will be very predictable for the compiler to deal with. The base type is a primitive, and the config payload is standardized because it is a builtin union member.
Definition of a typedef primitive
Tp
:T
wrapped byTp
must be a primitive zig type, like bool or a numeric array.Tp
adds enough context toT
for the compiler to treatTp
as a "new" type.Tp
is treated as a zig primitive in the sense that syntax can be unavailable for normal user defined types, but still be available for instances ofTp
.Tp
is "memoized". Two typedef declarations with the same base type and config payload will not be distinguishable at comptime.typedefprimitive
, which starts of as an alias fortypedef
, but differs in the following ways:typedef
, and others only withtypedefprimitive
.Mathematical types can often be modeled as numeric arrays, thus defining additional mathematical "primitive" types in zig could in many cases be done by wrapping a numeric array in a typedef and attach metadata (the typedef config).
Example, complex numbers:
Having arithmetic operators working for a complex number "typedef" is IMO perfectly feasible, as per the reasoning given in this issue comment by @skyfex. The functions or procedures necessary for complex number arithmetic does not need any control flow or to return any errors, could be defined as built-in functions, and cannot really be compared to operator overloading where everyone is able to customize how arithmetic operations behave in a multitude of ways.
The compiler would have enough information available to handle the cases where one or both of the operands in an arithmetic expression is a complex (typedef) primitive, and insert instructions corresponding to the appropriate builtin function.
This principle extends to fixed point types and matrices also, even despite the fact that arithmetic operation on these (mathematical) types may return a result with a different type than the type of either or both of the operands.
Matrix4x4 * Matrix4x3
yieldsMatrix4x2
. Still, this is unproblematic. The matrix dimensions are attached to the operands in form of the typedef config payload. Because of that, the compiler can ensure that the operands are compatible and determine the correct output "typedefprimitive".In the codegen stage, these mathematical types would just be arrays.
5: Comptime validation of the specified base
Typedef is a comptime feature, so whenever a typedef is declared some comptime code can be triggered to run. This could include various validation functions that could be defined as builtins, basing it all on the existing reflection capabilities in zig.
Examples could be checking that a struct is a superset of another (embeds), or that the fields in a struct follow the convention set by a typedef config. Example:
When the typedef config is
.Bitflag
, the compiler may enforce the existence of top level declarations with type equal to the base type, that each must have a unique value, that together must fully "cover" all the bits of the base type (which obviously has to be a uint), and where only one of the declarations can be something else than a power of two (the spare).Additionally, as a bitflag could fit the definition of a typedef primitive established earlier, syntax sugar could be enabled, so that for example
widgetAnchors.LEFT
would map towidgetAnchors & WidgetAnchorFlags != 0
. With primitive types (whether typedef based or not), syntax can be adjusted to improve usability.Discussion and questions for the zig community:
@Vector
(and maybe the proposed@Matrix
) into typedefs wrapping numeric arrays while still having them map down to LLVM in codegen?