Open mlugg opened 1 year ago
it seems fine to me for fn (comptime T: type) T and fn (comptime T: type) id(T) to be considered distinct
Consider this case:
const std = @import("std");
const ArrayList = std.ArrayList;
fn foo(comptime T: type) ArrayList(T) { ... }
fn bar(comptime T: type) std.ArrayList(T) { ... }
The functions foo
and bar
would now have different types, because ArrayList
is referred to by two different names.
This pattern of renaming types for convenience is common enough that I believe it's important to address.
This becomes even worse when you consider the inverse case, where same name refers to different things in different scopes:
const x = struct {
const G = std.ArrayList;
fn foo(comptime T: type) G(T) { ... }
};
const y = struct {
const G = std.StringHashMap;
fn bar(comptime T: type) G(T) { ... }
};
x.foo
and y.bar
obviously have different types, yet they would be considered identical under a naive IR comparison.
Another complaint I have with the proposed solution is that fn (comptime T: type) T
doesn't ever declare T, so how can it be in scope for the return value?
In a function definition, T
is declared at the point where the argument is declared, but in a type it would have to exist only within that type.
This is more of a nitpick than anything else, but the semantics feel a little bit strange.
The functions
foo
andbar
would now have different types...
Hm, yeah, I hadn't considered these issues with a naive IR comparison, so I guess that complicates things. The issue here just boils down to function aliasing - trivially aliased functions and types should be considered equivalent, but an IR comparison wouldn't do that.
What we ideally want here is some kind of method to partially evaluate the ZIR so that such aliases are resolved before comparison. To that end, I need to look into the specific structure of the ZIR to determine if this is easily possible.
`x.foo` and `y.bar` obviously have different types, yet they would be considered identical under a naive IR comparison.
I'm not familiar with the exact structure of ZIR, but... would they? I thought the ZIR would refer to the constant by a different index here (i.e. this reference is resolved at the AstGen
stage), but I could be wrong.
Either way, I can't answer these two points properly until I've had a more in-depth look at the ZIR structure, so I'll stop talking for now and respond again once I've looked more deeply at how it could work.
Another complaint I have with the proposed solution is that
fn (comptime T: type) T
doesn't ever declare T, so how can it be in scope for the return value?
Why can't this be considered a declaration of T
for the scope of that expression, like how it works for function definitions but without the associated block? That doesn't sound too unreasonable to me. (If #1717 ever gets in, this bringing-into-scope will apply to function definition expressions, and I don't think there have been any complaints about the proposal to that end.)
Comparisons between generic function types should use the internal representation and not the lexical expressions, obviously. That should eliminate any issues with namespacing and trivial renamings.
More problematic are comparisons involving nontrivial type aliases like ArrayList(T)
, which expands to ArrayListAligned(T, null)
. Unfortunately, the compiler has no way of verifying that this is true for all T
. It can only do so for concrete instantiations. This is essentially the same problem that prevents broadening the scope of inference in #9260. I don't think it invalidates this proposal, though. It just means that type checks involving nontrivial aliases can only be performed lazily.
BTW, xref #6997, which proposed a more limited version of this.
Comparisons between generic function types should use the internal representation and not the lexical expressions, obviously. That should eliminate any issues with namespacing and trivial renamings.
Why would that solve the issue? ArrayList(u8)
would still refer in the ZIR to the local ArrayList
decl rather than std.ArrayList
. This becomes more obvious if you think of a slightly more complex declaration such as const ArrayList = if (true) std.ArrayList else undefined;
.
@mlugg
Why would that solve the issue? ArrayList(u8) would still refer in the ZIR to the local ArrayList decl rather than std.ArrayList.
That only means that type analysis needs to be performed at some appropriate later stage. But that wouldn't be any different from ordinary type checking. E.g., if you have a function like this
fn square(x: i32) Int {
return x*x;
}
test "foo" {
const five: i32 = 5;
try std.testing.expect(square(five) == 25);
}
you cannot tell whether it type-checks without looking at the actual definition of Int
, which may be imported or locally defined, renamed, or even conditionally defined somewhere. If we can do this for concrete functions, why not for generic ones as well?
This becomes more obvious if you think of a slightly more complex declaration such as const ArrayList = if (true) std.ArrayList else undefined;
That's another example of a non-trivial type alias. This particular case has a trivial comptime condition and will probably be resolved into std.ArrayList
, but in general such type checks can only be performed lazily. But once again, this is true for both concrete and generic types.
Edit: Fix mistake in code snippet.
Having thought about this some more, I don't believe there's a good way for my original design to work. So I'm going to retroactively adjust this issue to focus on my alternative solution.
Even aside from the fact that it's generally inconsistent (#5893), the term anytype
is hugely misleading in the context of generic function types. In most cases, it doesn't actually represent "any type", but instead a generic type, known only to a particular instantiation of the function. anytype
(in the sense of a function definition, i.e. literally "any type") is a subset of that behavior, but in general it's a very different thing.
Therefore, the term anytype
in this context should be replaced with generic
. In addition, this should become a real keyword in the language for writing function types. Note that this only applies to function types, not definitions: those still use standard anytype
and backwards references to define parameter and return types. In a function type, generic
more clearly indicates to the user that the parameter type is unknown, and makes a clear distinction from anytype
in the function definition sense.
Note: this proposal combined with #9260 would eliminate the term anytype
from the language altogether.
Therefore, the term
anytype
in this context should be replaced withgeneric
.
How about the keyword dependent
instead? I think it signifies much more clearly the fact that the unknown-ness of the type is because it's most likely dependent on previous arguments' types.
True, the name is not perfect for the times where the anytype
argument is caused by the definition using anytype
, and not an expression referring to a previous argument - but even in these cases, often times there is some intended relation (i.e. dependency) between the types expected to be passed in. For example:
std.testing.expectEqual
declares two anytype
parameters, but expects that they coerce into a common type.std.fmt.format
declares args
as anytype
, but expects it to be an aggregate type structured in a certain form, depending on fmt
.
EDIT: I believe my original proposal here was infeasible, so have adjusted it to a smaller (and fairly simple) change.
Updated Proposal
Tangentially related: #9260
Problem
Currently, generic functions have very strange-looking types when you introspect them.
The type being printed here is nonsensical -
anytype
isn't a valid return type! You cannot write the type of this function in standard Zig without doing something like@TypeOf(foo)
. And besides,foo
can't return "any type" - it's limited to the type of its input. This is especially weird when you realise that it makes function types which are definitely different technically identical:Because both of these functions are generic, they're considered as having the same type, despite definitely being different.
This gets even weirder when we look at functions where parameter types depend on other parameters:
This one is extra misleading, because the type printed is entirely syntactically valid, but it's not the same thing as
bar
's real type in practical terms. That means you can pass functions in contexts where they don't actually make sense, like this:The summary here is that Zig treats any generic type as
anytype
in a function type, even if that type is much more constrained. This has several negative consequences:fn (comptime T: type) T
isn't considered a valid type outside of a function definition, andanytype
can't be written as the return type. You have to resort to dirty hacks like@TypeOf
on a dummy function.anytype, anytype
case above.Proposed Solution
Make
fn (comptime T: type) T
et al real types.The compiler clearly already tracks this information somewhere (otherwise, the last example above would have much more disastrous consequences). Exposing it as a part of the type makes the type system both more intuitive and more practically useful. Function types would coerce to any valid full or partial instantiation; e.g.
fn (anytype, anytype) void
coerces fine tofn (u32, anytype)
, orfn (anytype, u32)
, orfn (u32, u32)
.This wouldn't necessarily require function types to have specific names associated with them. Instead, this could be represented internally with some kind of index or pointer, and the required parameters could just be given generic names (
x
,y
, etc) if the type is printed via@compileLog
.Comparisons between generic function types do have one strange property: they require equality comparison between expressions. For instance,
fn (comptime T: type) std.ArrayList(T)
isn't the same thing asfn (comptime T: type) std.StringHashMap(T)
, and we can only detect this equality by actually comparing the return type expressions. This is admittedly a bit strange, but it doesn't raise any issues I can immediately think of - we don't need to do any evaluation (it seems fine to me forfn (comptime T: type) T
andfn (comptime T: type) id(T)
to be considered distinct). I assume this would basically consist of checking equality between some blocks of ZIR, which doesn't sound too complex, but I could be missing something.The main sticking point in my mind in terms of language usage is how
@typeInfo
and@Type
(related: #10710) operate under this system. The latter is a non-issue; construction of generic function types atcomptime
is already disallowed, and I don't really see a reason to change this (it's a niche within a niche, and probably indicates overly complex comptime logic). In terms of@typeInfo
, any solution I can think of to expose the info feels overcomplicated and not very practically useful (sophisticated in-language checking of generic function types is, like, a niche within a niche within a niche), so I don't really think we need to do that - the current structure (where we just hide the types of generic parameters and return types) seems okay to me. I did think about having something like anInstantiate: fn (args: []const *const anyopaque) type
for generic function types, but that's equivalent to a@TypeOf(func(params))
, so seems pretty pointless. (One question: why isstd.builtin.Type.Fn.Param.is_generic
a thing whenarg_type
is already optional? Is there not a one-to-one correspondence there?)Other Options
In my eyes, the main practical issue with the status quo is the inability to write functions with generic return types. We could just make
anytype
a valid return type when writing function types (although this feels weird if #9260 gets in (:crossed_fingers:), as that proposal otherwise eliminatesanytype
from the language). This is certainly a simpler solution implementation-wise, but it's still quite unintuitive for users.We could further improve that by instead introducing a new keyword - maybe
generic
- that indicates an unknown generic type which isn'tanytype
. That would make the signatures much more intuitive to read, and slightly more specific - for instance, the type offn id(x: anytype) @TypeOf(x)
becomesfn (anytype) generic
.