Open marler8997 opened 3 years ago
> The syntax `[_]u8` was chosen in case we come up with a use case for a more general `[_]T` syntax.
There is an obvious use case for such syntax: representing arbitrarily-sized comptime arrays of arbitrary types. There is no reason to limit ourselves to just `[_]u8`, since the same logic would apply to e.g. `[_]u32` or `[_]comptime_int`.
@zigazeljko yes, that's why I chose the `[_]u8` syntax, in case we want to extend it. However, I haven't thought of an actual use case for this yet. Do you know of any?
To bring up a use case for arbitrary `T`s in the proposed `[_]T` syntax: there is a spot in one project, vulkan-zig, where I see it being beneficial. The generated `{}Wrapper` type functions take as a parameter a comptime slice of an enum with a matching name, `{}Command`. Because of the characteristics of comptime described in the proposal, this instantiates the type multiple times for slices with equivalent content. With a change like the one described, the situation there would be greatly improved.
I propose that Zig add support for accepting "comptime variable-length string literal values" with the syntax `comptime s: [_]u8`. This would be a "comptime array", as opposed to the current convention, which is to use "comptime slices" (i.e. `comptime s: []const u8`).
There are important semantic differences between "comptime slices" and "comptime arrays". Comptime slices carry extra information with them, namely the "memory region" they point to. Two "comptime slices" that contain the same content but come from different memory regions are not the same. One reason for this is that code can access memory outside the bounds of a slice so long as it stays within its containing "memory region".
The "extra information" that comes with a "comptime slice" can be problematic because of the nature of comptime. Unlike a runtime function, which is only instantiated once, a comptime function must be re-instantiated for every unique set of parameters it is passed. This means that the memory-region information attached to a "comptime slice" causes a new function to be instantiated even if that information is never used. This caused an infinite recursive instantiation loop in `std.fmt` (see https://github.com/ziglang/zig/issues/7948).
There is a proposal to mitigate this problem by "de-duplicating" const comptime slices: https://github.com/ziglang/zig/issues/7948#issuecomment-844635939
However, it does not solve the cases where the slices actually do come from unique memory regions or where they are mutable. That solution must handle odd corner cases and puts some complicated constraints on the language, such as ensuring that all unique string literals have their own distinct memory region. IMO, it's a complicated solution that is hard to justify given that it still doesn't solve the problem in many cases.
I believe the simplest solution is clear when we consider what the developer's original intent is. In most cases, the intent is for the function to only be instantiated once for each unique string based on its content, not the memory region it comes from. Zig already has a way to represent this intention, namely, with "arrays". The problem is the ergonomics of accepting arrays.
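
To make the "keyed on content" point concrete, here is a minimal sketch rather than anything taken from the PRs. `Tagged`, `first` and `second` are hypothetical names invented for illustration, and the sketch assumes a reasonably recent Zig compiler. It only demonstrates the array side of the argument: two separately declared arrays with equal contents are the same comptime value, so a type function receives the same argument both times.

```zig
const std = @import("std");

// Hypothetical type function, generic over a comptime value.
// Not part of std or of the proposal; it exists only for this illustration.
fn Tagged(comptime name: anytype) type {
    return struct {
        pub const tag = name;
    };
}

// Two separately declared arrays with identical length and contents.
const first = [_]u8{ 'a', 'b' };
const second = [_]u8{ 'a', 'b' };

test "equal arrays yield the same instantiation" {
    // Arrays are passed by value, so both calls see the same comptime
    // argument and return the same type.
    try std.testing.expect(Tagged(first) == Tagged(second));
}
```

This can be checked with `zig test` on the file. The sketch makes no claim about what the same comparison yields for slices, since that is exactly the behavior under discussion here.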
One set of functions that falls into this category is the functions in `std.fmt`. I have created 2 alternative PRs that modify the `formatType` function to take the `fmt` "comptime slice" and convert it to a "comptime array" before analyzing the rest of the function: https://github.com/ziglang/zig/pull/8839 and https://github.com/ziglang/zig/pull/8846
In the first PR, I create a wrapper function that just takes the comptime slice and forwards it to the real function as a comptime array.
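
The wrapper pattern can be sketched roughly as follows. This is not the code from the PR: `countPlaceholders` and `countPlaceholdersImpl` are hypothetical stand-ins for `formatType` and its real implementation, and the explicit `len` parameter is just one way to spell an array parameter of caller-chosen length in today's Zig.

```zig
const std = @import("std");

// Hypothetical "real" function. It receives the string by value as an array,
// so it is instantiated per distinct (length, content) pair.
fn countPlaceholdersImpl(comptime len: usize, comptime fmt: [len]u8) usize {
    var count: usize = 0;
    for (fmt) |c| {
        if (c == '{') count += 1;
    }
    return count;
}

// Hypothetical wrapper. It keeps the familiar slice-based signature and
// forwards the content as an array: slicing with comptime-known bounds gives
// a pointer to an array, and `.*` copies that array out by value.
pub fn countPlaceholders(comptime fmt: []const u8) usize {
    return countPlaceholdersImpl(fmt.len, fmt[0..fmt.len].*);
}

test "wrapper forwards the slice content as an array" {
    try std.testing.expectEqual(@as(usize, 2), countPlaceholders("{} and {}"));
}
```

The cost is visible even in this small sketch: every function that wants this behavior needs its own pair of wrapper and implementation.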
And in the second, `formatType` accepts `anytype` and detects whether it got a slice, and if so calls itself recursively after converting that slice to an array.
Each solution has its pros and cons. The first solution requires that every function create a wrapper function around its real function. This violates the principle that we want to make it "easy to write the correct code and hard to write the incorrect code". Since it's easier not to create a wrapper function, and the code still works in some cases without one, it's likely it won't be done correctly a lot of the time. The second solution is smaller but falls victim to the same problem and introduces an additional drawback: the `fmt` argument uses the underspecified `anytype` in its signature instead of an explicit comptime string type.
With the proposed feature, we can avoid both of these problems. The correct code is now easy to write and we can still specify what type we are expecting in our signature, e.g. `comptime fmt: [_]u8`.
TypeInfo
For now the `[_]u8` type will behave like `anytype` when it comes to `TypeInfo`. Its arg type will be `null` unless we find reason to enhance `TypeInfo` to represent it.
P.S. The syntax `[_]u8` was chosen in case we come up with a use case for a more general `[_]T` syntax. If we determine that such a general case is unwanted, then something like `comptime_string` would also be fine. This also leaves open the possibility of `[_:0]u8`.