ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

Proposal: std.math.{min,max}Int return T #20574

Open mlugg opened 1 month ago

mlugg commented 1 month ago

std.math.minInt and std.math.maxInt are useful functions for finding the bounds of an integer. They have a few use cases, with the following being the most common:

- Getting the smallest or largest value of an integer type, for instance to use as a "special" marker value.
- Checking whether a value is in bounds for an integer type, for instance before an @intCast.

Looking through uses of these functions in std, most appear to align with one of the two use cases above. The first use is incredibly common, especially for maxInt; it is used in std.Progress, std.zig.Zir, std.c, and so on. Unfortunately, particularly for this use case, the signature of maxInt can sometimes be awkward. Suppose you are trying to @bitCast the result of maxInt to a packed struct(u32) to act as a special value (I have personally tried this a few times in the compiler codebase). Then, since maxInt returns a comptime_int (rather than a u32), you can't do so without an @as coercion, i.e. @bitCast(@as(u32, std.math.maxInt(u32)))!
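For example (a minimal sketch of the pattern described above; the Packed type is hypothetical):

const std = @import("std");

// Hypothetical example type; any packed struct(u32) runs into the same issue.
const Packed = packed struct(u32) {
    index: u31,
    flag: bool,
};

// Status quo: maxInt returns a comptime_int, which has no in-memory
// representation, so @bitCast requires an explicit @as coercion first.
const special: Packed = @bitCast(@as(u32, std.math.maxInt(u32)));

// Under this proposal, the coercion would become unnecessary:
// const special: Packed = @bitCast(std.math.maxInt(u32));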

Proposal

Change the signatures of minInt and maxInt to pub inline fn minInt(comptime T: type) T (and likewise for maxInt). The inline annotation preserves the behavior that the result is comptime-known, so this change is mostly transparent to use sites.
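For instance, the new definitions could look roughly like this (a sketch based on the status-quo implementations; the exact @typeInfo field names depend on the Zig version):

pub inline fn minInt(comptime T: type) T {
    const info = @typeInfo(T).Int;
    // u0/i0 and all unsigned types have a minimum of 0.
    if (info.signedness == .unsigned or info.bits == 0) return 0;
    return -(1 << (info.bits - 1));
}

pub inline fn maxInt(comptime T: type) T {
    const info = @typeInfo(T).Int;
    if (info.bits == 0) return 0;
    return (1 << (info.bits - @intFromBool(info.signedness == .signed))) - 1;
}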

From a quick-ish look through all uses of maxInt (which is used far more frequently than minInt) in the standard library, I don't think this would break a single use case -- for instance, it doesn't impact the "in-bounds" checks mentioned previously. Meanwhile, it would improve the aforementioned use case of passing the value to @bitCast. This change also has the minor advantage of acting as implicit documentation that the result is within the integer's range; i.e. that maxInt(u32) is (1 << 32) - 1 rather than 1 << 32.

This seems more-or-less like a no-brainer to me.

InKryption commented 1 month ago

Just a thought: if #3806 were to be implemented, would it make more sense to change the functions to essentially look like this?

pub fn minInt(comptime T: type) @Int(@typeInfo(T).Int.min, @typeInfo(T).Int.min + 1) {
    return @typeInfo(T).Int.min;
}

pub fn maxInt(comptime T: type) @Int(@typeInfo(T).Int.max_exclusive - 1, @typeInfo(T).Int.max_exclusive) {
    return @typeInfo(T).Int.max_exclusive - 1;
}

(inline intentionally not added, since the return types have only one possible value (OPV) and thus no runtime bits)

rohlem commented 1 month ago

> Since maxInt returns a comptime_int (rather than a [T]), you can't do [a @bitCast] without an @as coercion.

What would be the harm in simply allowing a @bitCast from comptime_int to an integer result type? I don't currently see any drawbacks to that; I can't imagine a scenario where you would expect a different result without the @as.

> The inline annotation preserves the behavior that the result is comptime-known, so this change is mostly transparent to use sites.

I'm currently working with comptime- and reflection-heavy integer code, and I doubt this would work for my use case. comptime_int is a type that guarantees the value has no runtime bits; integer types with more than 0 bits do have runtime bits. You can e.g. have a function signature fn foo(x: anytype) if (@TypeOf(x) == comptime_int) R(x) else RT(@TypeOf(x)) { ... }, and the compiler allows it in status-quo. If the value is coerced into a type with runtime bits, the mechanism no longer applies. A workaround would be a builtin @guaranteedComptime(x) with special semantics around inline values, but IMO the value of this proposal justifies neither such complexity nor limiting the flexibility of status-quo in this regard.

(Vaguely related side note: from experience, an inline fn hasNoRuntimeBits(T: type) bool { return @bitSizeOf(T) == 0; } did not allow comptime branching in function return type expressions in status-quo, so the compiler errored. Maybe that is to be considered a bug. EDIT: Could no longer reproduce, my bad.)

All that being said, implementing minInt and maxInt in userland is rather trivial, so even though I think the change would be detrimental, I wouldn't personally be affected.

mlugg commented 1 month ago

> What would be the harm in simply allowing a @bitCast from comptime_int to an integer result type?

This phrasing doesn't make sense, as it talks about bitcasting to an integer, i.e. comptime_int to u32 but not to packed struct(u32). I'll assume you meant "coerce it to an appropriately-sized integer before the bitcast". In that case, trivial counterpoint: @bitCast(@as(f32, 123)) is very different to @bitCast(@as(u32, 123))! Ignoring the float case, such a coercion would make types a lot less explicit, and could lead to bugs. More generally, allowing this would just not be logical. comptime_int doesn't have a size in bits or an in-memory representation; let's not pretend it does.
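Concretely (bit patterns noted for illustration):

test "same value, different source type, different bits" {
    const from_int: u32 = @bitCast(@as(u32, 123)); // 0x0000007B
    const from_float: u32 = @bitCast(@as(f32, 123)); // 0x42F60000, the IEEE 754 encoding of 123.0
    try @import("std").testing.expect(from_int != from_float);
}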

> I'm currently working with comptime- and reflection-heavy integer code, and I doubt this would work for my use case.

Could you elaborate on your use case? I'm struggling to understand these two paragraphs. I don't understand what your hypothetical builtin @guaranteedComptime is supposed to do. To be 100% clear, marking a function as inline and ensuring its result is comptime-known -- such as by wrapping the entire function body in a comptime block -- is always supposed to act like the call-site was marked comptime. If the result ever appears to be runtime-known, that is a compiler bug. I don't quite get what you were saying about hasNoRuntimeBits, but it sounds like you might have experienced such a bug there?
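For example, something like the following is expected to compile in status quo, because the comptime-known result of the inline call propagates to the call site:

inline fn one() u32 {
    return comptime 1; // the entire result is comptime-known
}

test "inline call result is comptime-known at the call site" {
    const x = one(); // not a comptime call, yet x is comptime-known
    comptime if (x != 1) unreachable; // would be a compile error if x were runtime-known
}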

rohlem commented 1 month ago

> I'll assume you meant "coerce it to an appropriately-sized integer before the bitcast". In that case, trivial counterpoint: @bitCast(@as(f32, 123)) is very different to @bitCast(@as(u32, 123))!

Right, I didn't mean to include floats as result or intermediate types (I forgot comptime_int is allowed to coerce to them; I would have used @floatFromInt, tbh) -- only integers and integer-backed result types like packed struct and packed union. (Maybe enum too, if @bitCast to one were allowed; IIRC it isn't.)

> Such a coercion would make types a lot less explicit, and could lead to bugs.

I can't think of a case where it would lead to a bug.

> comptime_int doesn't have a size in bits or an in-memory representation; let's not pretend it does.

No, but it trivially coerces to integer types which do. In my mind those integer types are the canonical bit representation for a comptime_int value, because I can't think of a different representation that would make sense. (Again, I didn't think about floats.)


> [M]arking a function as inline and ensuring its result is comptime-known [...] is always supposed to act like the call-site was marked comptime. If the result ever appears to be runtime-known, that is a compiler bug.

Thank you for the clarification; I'll open a separate issue then. EDIT: Can no longer reproduce. I either misinterpreted compile errors back when I encountered it, or it has been fixed since then; sorry for the misreport.

> I don't understand what your hypothetical builtin @guaranteedComptime is supposed to do.

It would return a bool indicating whether a given value is currently comptime-available (e.g. from being inlined into a comptime call site) or not. As the proposed new return type has runtime bits, I think it would otherwise be impossible to choose the return type based on the comptime-ness of the argument.
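Purely as illustration (again, @guaranteedComptime does not exist; OPV and RuntimeValue are the helpers from the demonstration below), wrap could then look like:

// Hypothetical -- relies on the non-existent @guaranteedComptime builtin.
fn wrap(x: anytype) if (@guaranteedComptime(x)) OPV(x) else RuntimeValue(@TypeOf(x)) {
    if (@guaranteedComptime(x)) return .{};
    return .{ .data = x };
}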

> Could you elaborate on your use case?

Here's a demonstration of the difference (even though I'm pretty sure we generally don't like using Zig types to encode comptime-ness this way, so the real solution will instead be #3806):

/// returned type has no runtime bits
fn OPV(x: anytype) type {
    return struct {
        pub fn get(_: @This()) @TypeOf(x) {
            return x;
        }
    };
}
fn RuntimeValue(X: type) type {
    return struct {
        data: X,
        pub fn get(self: @This()) X {
            return self.data;
        }
    };
}
/// returns an instance of OPV if x is comptime-known, otherwise an instance of RuntimeValue
fn wrap(x: anytype) if (@TypeOf(x) == comptime_int or @TypeOf(x) == u0 or @TypeOf(x) == i0) OPV(x) else RuntimeValue(@TypeOf(x)) {
    if (@TypeOf(x) == comptime_int or @TypeOf(x) == u0 or @TypeOf(x) == i0) return .{};
    return .{ .data = x };
}

fn runtimeTest(v: anytype, expected: comptime_int) void {
    if (v.get() != expected) unreachable;
}
fn comptimeTest(v: anytype, expected: comptime_int) void {
    comptime if (v.get() != expected) unreachable; //.get() is comptime-available for OPV, but not for RuntimeValue
}
test {
    const maxInt = @import("std").math.maxInt;
    const max0 = maxInt(u0);
    comptime if (wrap(max0).get() != 0) unreachable;
    const w0 = wrap(@as(u0, max0)); //wrap can return OPV because its argument is of a type without runtime bits
    runtimeTest(w0, 0);
    comptimeTest(w0, 0);

    const max8 = maxInt(u8);
    comptime if (wrap(max8).get() != 255) unreachable; //allowed - wrap returns an OPV; it can tell its argument is comptime-known by its type
    const w8 = wrap(@as(u8, max8)); //wrap has no way to return OPV instead of RuntimeValue for comptime-known argument value
    runtimeTest(w8, 255); //checking at runtime is allowed
    //comptimeTest(w8, 255); //not allowed, re-enable for compile error
}