Closed AndreaOrru closed 5 years ago
Previously, when @andrewrk and I considered this syntax very early in Zig's development, one problematic point was whether the upper bound is inclusive or exclusive. With an expression like i < n, there's no question that n is an exclusive upper bound, but with a...b, does that include b?
Surely the question can have an answer, and it should probably be exclusive, but the point is that the syntax doesn't clearly say that it's exclusive. And just to confuse things, I've seen languages (coco, for example) use two different syntaxes, a..b for inclusive and a...b for exclusive.
I don't think users will have a very good success rate for guessing whether the upper bound is inclusive or exclusive, which makes me dislike this proposal. That being said, iterating over numbers 0 to n (exclusive) is a very common pattern, even outside the use case of array indexes, and I think this deserves more discussion.
I wouldn't be against the two different tokens, but intuitively I'd say a..b is b exclusive and a...b is b inclusive. I think Rust and Wren do this too...
If we only keep ..., I'd say keep it exclusive and consistent with how slices are made, i.e. array[0...1] has a len of 1. Python's range() function has this property.
We do have the [a...b] slicing syntax, which is exclusive. So it should be clear from this.
On the other hand, an unsolved problem is that in switch statements, ... is inclusive, for example:
switch (c) {
'a'...'z' => {}, // inclusive
}
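To make the mismatch concrete, here is a minimal sketch. It assumes a later Zig release in which slicing is spelled .. rather than ..., and isLower and firstN are hypothetical helper names used only for illustration.

```zig
const std = @import("std");

// Inclusive: the upper bound 'z' itself matches the range prong.
fn isLower(c: u8) bool {
    return switch (c) {
        'a'...'z' => true,
        else => false,
    };
}

// Exclusive: the slice stops just before index `end`.
fn firstN(items: []const u8, end: usize) []const u8 {
    return items[0..end];
}

test "switch ranges include the upper bound, slices exclude it" {
    try std.testing.expect(isLower('z'));
    try std.testing.expect(!isLower('{')); // '{' is the byte right after 'z'
    try std.testing.expectEqual(@as(usize, 1), firstN("abc", 1).len);
}
```

The same two bounds mean different things depending on the construct, which is the inconsistency being discussed.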
I have 2 ideas to solve this problem:
1. Change ... syntax for slicing to .. and keep it exclusive. Keep ... syntax for switch statements, and keep it inclusive.
2. Change ... syntax for slicing to : (like Python). Keep ... syntax for switch statements, and keep it inclusive.
Change ... syntax for slicing to .. and keep it exclusive. Keep ... syntax for switch statements, and keep it inclusive.
I'm not sure how I feel about using : for slicing. It always has something to do with types. .. and ... are more intuitively associated with ranges.
I posted this in #359; just adding the relevant part here. It suggests keeping the two range operators .. and ... and adding a third, :, that reflects the number of elements in the range as opposed to its start and finish.
for (0 .. 2 ) | x, i | { } // Exclusive -> 0, 1
for (0 ... 2) | x, i | { } // Inclusive -> 0, 1, 2
for (2 : 2) | x, i | { } // Range -> 2, 3
I haven't seen mention of it yet, so I'll just do so myself:
is there a way to count backwards? Will for (2 ... 0) | x, i | { } count 2, 1, 0?
It was mentioned before, the expected behaviour for that statement would be it would actually loop 0 times - ie, the block would not execute.
To count backwards, you'd do for(a...b) | x, i | { print(b - i) }
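With the status-quo while loop, counting down can be sketched as below. This is only an illustration: countDown is a hypothetical helper, it assumes an exclusive upper bound b, and it assumes the caller passes an out slice with room for at least b - a values.

```zig
const std = @import("std");

/// Writes b - 1, b - 2, ..., a into `out` and returns how many values were written.
/// Writes nothing and returns 0 when a >= b.
fn countDown(a: usize, b: usize, out: []usize) usize {
    var n: usize = 0;
    var i: usize = b;
    while (i > a) {
        i -= 1; // decrement first, so the unsigned counter never drops below a
        out[n] = i;
        n += 1;
    }
    return n;
}

test "count down from b - 1 to a" {
    var buf: [4]usize = undefined;
    const n = countDown(0, 3, &buf);
    try std.testing.expectEqualSlices(usize, &[_]usize{ 2, 1, 0 }, buf[0..n]);
}
```

Decrementing before the body avoids the unsigned underflow that a naive i >= a test would run into when a is 0.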
I dislike this proposal; it is difficult to know its meaning without reading the document. A function call may be better.
Care to elaborate? Which document? What function call? Can we see an example?
from @Hejsil
not the most elegant because you have to type |_,i| instead of |i| but still I'd say usable
question is if the same can be hacked together for ranges not starting at 0
I dislike this proposal:
for (a...b) |x, index| {
...
}
What do a and b mean? Is a included? Is b included? What does x mean? What does index mean? If the language is designed like this, I have to read the document to understand what it means.
I am looking for a more understandable syntax, maybe something like this:
for (var i = range0ToNotInclude(100)) {
}
this gets you most of the way there:
test "" {
var j: usize = 0;
for (times(10)) |_, i| {
@import("std").debug.assert(i == j);
j += 1;
}
}
fn times(n: usize) []const void {
return ([*]void)(undefined)[0..n];
}
We already have the concept of ranges in slices and in switch statements - for this proposal to be reasonable, we just need to be consistent throughout the language. I know we're optimizing for readability, but we should be able to expect that a person reading Zig code has at least looked through the Zig docs.
That times function returning a []const void is a reasonable workaround.
In @bronze1man's proposal, an assignment would have to be an expression that returns whatever was assigned. It might also be unclear that the function only runs once - a reader might expect each iteration of the loop to declare a new variable i set to the result of a new call.
Zig had a "null-unwrap-and-declare" operator ?= in ifs that was replaced precisely to keep identifier declaration consistent: control (predicate) | body_names | { body } prong | prong_names | { body }
// So we could do this:
if (maybe_foo()) | foo | { foo.bar() };
// Instead of
if (var foo ?= maybe_foo()) { foo.bar() }
I think there are a number of things that need to be thought about here:
If for loops work on slices, arrays and ranges, it's starting to get confusing. What's the pattern? Could it be generalized in a meaningful way?
I gotta say I'm a big fan of languages where the for loop accepts some kind of "iterator", rather than a few special types. It makes it much more clear and explicit what's going on, and easier to read in my opinion. So if you have an array items, maybe it should be something like:
for (keys(items)) |key| { ... }
for (values(items)) |value| { ... }
for (pairs(items)) |key, value| { ... }
And then for ranges you could do:
for (range(1..3)) |num| {...}
for (rangePairs(1..3)) |index, num| { ... }
But how you make those iterators work is a huge proposal on its own.
I gotta say I'm a big fan of languages where the for loop accepts some kind of "iterator", rather than a few special types.
if only there were something called interfaces or the like 🥇
Iterators can be done with while instead of for.
https://github.com/ziglang/zig/blob/02713e8d8aa9641616bd85e77dda784009c96113/build.zig#L155
compare
var i: usize = 0;
var it = get_some_it(thing); // this code may actually look different every time so you never know what you read
while (it.next()) |arg| {
i += 1;
}
to
for(thing) |arg, i| {
}
I do not think the first one enhances clarity.
And iterating is one of the most common things so there is that ...
with interfaces it's possible to have concise iterators
Iterators can be done with while
while can be done with recursion https://softwareengineering.stackexchange.com/a/279006
sure, but what's the point? (apart from the fact that recursion is currently not working and should be avoided)
@skyfex
[1,3) and [1,2]
I can't even remember those two (and they look very similar as well) while I find the ruby syntax intuitive so it really depends on the person...
In the end you just have to remember some syntax so I do not think this is actually such a big deal.
The issue is a very restricted for loop and not a . vs a (.
👍 for : without a variant form since the intent is very clear and unambiguous among a number of languages. Might be nice to add optional stride too ;)
I'm not sure it's a good idea to couple syntax with the names that are defined in a struct - saying you can use for on types that have iterate() and next() is basically the same thing as operator overloading.
Iterators can be done with a while
Yep. The while iterator pattern is both explicit and concise - if you end up having to change how the iterator has to be initialized or continued, the function calls aren't hidden behind the syntax.
If the value of this proposal is good, and the only concerns are regarding syntax, we could accomplish this with built in functions and call it as for(@range(u8, 'a', 'z')) | c, i | { }
Example names and signatures. If it's hard to work out what they mean, it's a sign we need better functions.
@times(a: var) []const void;
@range(comptime T: type, a: T, b: T) []const T;
@sequence(comptime T: type, a: T, b: T, stride: T) []const T;
@linearSpace(comptime T: type, a: T, b: T, num: int) []const T;
Can built in functions be async/generators?
@generateRange(comptime T: type, a: T, b: T) yield T;
@raulgrell I think if those built-ins end up being needed, it's a failure of language design. Built-ins should be functionality that cannot in any way be implemented with library code. There are many ways to design the language such that these can be implemented as plain code rather than magic built-ins, and I'm sure one of them can keep Zig conceptually simple and explicit.
I agree that there shouldn't be special function calls generated by the for syntax though.
I don't really like having to use while-loops to use an iterator pattern, but when I think about it, it's probably the correct choice for Zig.
Is there a proposal for generators (or observable or whatever it should be called)? It would make a lot of sense to extend the async support to allow an async function to yield multiple values. Then it would also make sense to extend for-loops to support those.
Yay for generators. Then you do not need special .. syntax.
How is a generator any different from an iterator?
You can super easily make an iterator (with next) that generates a range of numbers for you.
The only thing is that you need to type more than just writing it directly into the loop yourself. So... 🤷🏻♂️
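As an illustration of the point about iterators, here is a minimal sketch of a hand-written range iterator driven by while. RangeIterator and sumRange are hypothetical names, and the exclusive end bound is an assumption made for this example.

```zig
const std = @import("std");

const RangeIterator = struct {
    current: usize,
    end: usize, // exclusive upper bound

    /// Returns the next value, or null once the range is exhausted.
    fn next(self: *RangeIterator) ?usize {
        if (self.current >= self.end) return null;
        const value = self.current;
        self.current += 1;
        return value;
    }
};

fn sumRange(start: usize, end: usize) usize {
    var it = RangeIterator{ .current = start, .end = end };
    var total: usize = 0;
    // The optional returned by next() ends the loop when it is null.
    while (it.next()) |i| {
        total += i;
    }
    return total;
}

test "iterate a half-open range with while" {
    try std.testing.expectEqual(@as(usize, 6), sumRange(0, 4)); // 0 + 1 + 2 + 3
}
```

The state lives in a struct the caller initializes and owns, which is the extra typing the comment refers to.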
@BarabasGitHub Generators, as I meant it, would be asynchronous. Iterators are not.
While iterators operate on some state object that you manually initialize and keep track of, generators would just be a single function and the state would be its stack.
It doesn't make sense to add support for generators just to support for on a range. But if support for generators is added to support more advanced async programming, then making the for loop support generators and defining range as a generator would make sense.
I found that @andrewrk has commented on supporting generators here: https://github.com/ziglang/zig/issues/1194
The problem with this, is that if the use of generators in a for loop isn't heavily optimized by the compiler, the use of generators for simple things like range would be extremely inefficient.
That could be OK though, you can always use while for efficiency.
This proposal is about making it more syntactically convenient to loop over a range of numbers than:
var i = a;
while (i < b) : (i += stride) {
}
or
// times() is implemented in an above comment
for (times(10)) |_, i| {
}
Can someone link to some existing code that would benefit from this convenience, even code in a different language that uses syntax like what is being proposed here? I'm questioning how common it is for the start a and the stride to be something other than 0 and 1 respectively. I know iterating over an array/slice backwards is useful sometimes, and the OP mentions a range of chars. I think it would help this discussion to look at some real use cases.
If the usecase is not sufficiently common, then you can always just use the above code, and we don't need a special language construct.
Another simpler option that bypasses the range syntax issue but still offers some convenience: iterating n times with for (n) | i | { }. I'd argue the automatic scoping of i and removing the line declaring it is good for readability by reducing noise and preventing accidental reuse of the variable. But then again, this is a much weaker language feature that might violate the "one obvious way to do things" rule with very little expressive benefit.
Whether the range/number syntax is "communicating intent precisely" is probably a bit subjective - i think it communicates a counter better than a while loop where we just recognize the pattern of checking a number and updating it. But @andrewrk and @thejoshwolfe have been pretty clear about hidden allocations and hidden control flow being the devil - so there is also a strong argument for really making you allocate that variable on the stack and do all the comparisons and assignments explicitly, where you communicate intent with a precise implementation.
But then I wonder if it really is more precise or if the optimizer is just going to change the implementation anyway, and whether it would choose a better implementation than us if we give it more information. But I don't know enough about compiler internals to really comment on this.
EDIT: Removed stuff about generators, hadn't seen #1194
You had a very good point in the part of the comment you removed @raulgrell.
while communicates that the expression inside it is evaluated/called multiple times.
for communicates that the expression is evaluated once, but it yields multiple values, either as a slice, an array or maybe one day a generator.
In that way, maybe for (a..b) is a bad idea. The range a..b itself doesn't yield or contain multiple values. It's just two numbers.
When I think about what other things in the language for could be used on, I'd say that for on structs would be pretty neat for introspection. But then you'd need built-in support for boxed values (some kind of Any type).
I removed it since generators are a bit off-topic and as @thejoshwolfe mentioned, this issue is more about the a..b syntax. Also, @andrewrk provided a pretty solid solution for generators in #1194 which makes my point less of an issue.
This is what I removed from my post with some explanation as to why:
Regarding generators, I understand them as functions that can be called several times and return different things depending on its internal state. It makes more sense to use them in a while, otherwise we'd need to have a special case for generator functions being called, or make the syntax for calling generator functions different to calling regular functions.
while (genRange(2, 5)) | i | {
// genRange gets called every iteration, yielding one value into i each time
}
for (genRange(2, 5)) | n, i | {
// suggests genRange gets called once
}
Using his syntax, you 'initialize' the generator and then use it. If we were to generalize the for loop so that you can use it with generators, the syntax wouldn't have the issue I was pointing out. Imagine something like below that makes the @coroFrame and resume implicit:
test "while generator" {
const items = try async<std.debug.global_allocator> range(0, 10);
defer cancel items;
// Instead of
while (@coroFrame(items).*) |n| : (resume items) {
std.debug.warn("n={}\n", n);
}
// Do this
for (items) | n | {
std.debug.warn("n={}\n", n);
}
}
Since items isn't a function call, it's more like iterating over the values in an array/slice. But instead of a block of memory, it's pointing at wherever values are being generated. But none of this is real =)
Not sure if this is helpful, but you could implement ranges as a lib that outputs a "almost free at runtime" comptime array on the stack like so:
const std = @import("std");
const assert = std.debug.assert;
const warn = std.debug.warn;
fn abs(comptime x: comptime_int) comptime_int {
return if (x < 0) -x else x;
}
pub fn range(comptime inclusive: bool, comptime a: comptime_int, comptime b: comptime_int, comptime step: comptime_int) [@divTrunc(abs(b - a), abs(step)) + if (inclusive) 1 else 0](if (a < 0 or b < 0) isize else usize) {
comptime assert(abs(a) != abs(b));
comptime assert(abs(a + step) != abs(b));
comptime var res: [@divTrunc(abs(b - a), abs(step)) + if (inclusive) 1 else 0](if (a < 0 or b < 0) isize else usize) = undefined;
comptime var idx = 0;
inline while (idx < res.len) {
res[idx] = a + step * idx;
idx += 1;
}
return res;
}
test "static ranges" {
warn("\n");
for (comptime range(false, 10000, 60000, 10000)) |val| {
warn("{}, ", val);
}
warn("\n");
for (comptime range(false, -1, 7, 2)) |val| {
warn("{}, ", val);
}
warn("\n");
for (comptime range(false, -120, 77, 80)) |val| {
warn("{}, ", val);
}
warn("\n");
for (comptime range(true, -9, 13, 1)) |val| {
warn("{}, ", val);
}
warn("\n");
for (comptime range(true, 9, -12, -3)) |val| {
warn("{}, ", val);
}
}
(wrote this ~6 months ago, probably has some issues)
I propose this syntax:
for (a ->+ b) |x, index| {
// inclusive, another variant: a ~~+ b
}
for (a -> b) |x, index| {
// exclusive, another variant: a ~~ b
}
Not going to do this, when this is perfectly fine:
var i: usize = a;
while (i < b) : (i += 1) {}
I don't think lack of this range feature is causing any actual problems. However pay attention to the coroutine rewrite issue (#2377) because Zig may gain generators out of it.
It would be cool if we could have the i in the local scope of the for/while. Just like in C.
for(int i=0; i < n; i++)
I know you can do:
{var i: usize = 0; while (i < n) : (i += 1) {
}}
but it looks so verbose and ugly IMHO
Maybe it's just that I'm so used to the C syntax. But I think, the way C does it, is very simple, and easy to understand; no need for range syntax
It should be said that some of the discussion around this topic seems to be continued in #3110
Regarding @tuket's comment.. I've taught C/C++ at university, and I didn't find for loops very intuitive to teach. The syntax is frankly speaking quite dumb. There's nothing else in the language that works anything like for loops do, and the name tells you nothing about what the statement is actually doing. I am also completely used to how for loops work, and find it nicer than what Zig currently has... but it's not worth adding any special syntax just for "while"... it should be something universal
Thanks for the link!
IDK, maybe I've been using it for so long that I've forgotten that it was hard to understand once. The C for loop can be used for so many things that it's hard to give it a name that describes what it does.
Status quo hack:
for (([number]void)(undefined)) |_| {
//
}
From https://github.com/ziglang/zig/pull/3585#discussion_r343812610
Not going to do this, when this is perfectly fine:
var i: usize = a; while (i < b) : (i += 1) {}
I don't think lack of this range feature is causing any actual problems. However pay attention to the coroutine rewrite issue (#2377) because Zig may gain generators out of it.
I am looking into Zig as a potential language for HPC because of its relative simplicity while providing modern metaprogramming, and having poor ergonomics on a range for is a dealbreaker: nested loops over many dimensions, where each dimension might also be tiled, are extremely common.
Maybe facilities like multi-variable for sugar, or custom operators, should be provided by third-party lang extensions, while keeping the core language lean? I keep seeing people suggesting abstract source transforms for math and HPC, but in my eyes a major selling point for Zig is that the core language is lean, explicit, and mostly readable (which is a huge plus for a systems team lead.)
I can certainly see it from both sides, and I think that a good compromise is the ability to extend a lean core language with source transforms or similar source-level metaprogramming capabilities (like in Racket.) Highly expressive syntax is imperative for some users, and a deal-breaker for others, and "sort of expressive" is a bad compromise.
Failing that, I think most Zig users are coming at it primarily from a "Better C" angle, where they need explicit, detailed, readable source code, and that is also my preference.
Maybe facilities like multi-variable for sugar, or custom operators, should be provided by third-party lang extensions, while keeping the core language lean?
Look, I'm not asking for multi-variable for loops, that's just an example. The issue is that Zig doesn't have a decent single-variable for loop atm.
Something should be done about this because {var i: usize = 0; while (i < n) : (i += 1) { … }} is both hard to read and hard to write.
A workaround right now that supports runtime numbers is
pub fn range(max: usize) []const void {
return @as([]const void, &[_]void{}).ptr[0..max];
}
…
for(range(25)) |_, i| {
print(i);
}
If #6965 gets accepted, a function could be made to do this with no extra runtime cost and no extra language features:
range(0, 25, |i| {
print(i);
});
inline fn range(start: usize, end: usize, body: macro(i: usize) void) void {
var i: usize = start;
while(i < end) : (i += 1) {
inline body(i);
}
}
Whenever iteration over a range is desired, there is typically some data structure to act on, so a structural loop would be better -- this may not be immediately obvious, so I think it's actually a good thing that ranged looping is clunky, because then the author is forced to think twice whether that's what they actually want. This functionality is already possible, and we shouldn't make it easier. As noted, a #6965-based solution will be clean, and I don't think we should have tighter language integration than that.
there is typically some data structure to act on, so a structural loop would be better
Not true for my field of programming. There are many use cases where you want to iterate a number of times but you don't have a data structure. For example raytracing.
for(int i = 0; i < numSamples; i++) {
ray = randomRay(i);
// ...
}
I don't think it's a good idea to make the syntax clunky deliberately just because we think it's not a good way to do it. Different programmers work on different problems, and we don't know what is right for them.
Huge +1 for the for(a..b) syntax. Zig desperately needs this, and:
var i: usize = a;
while (i < b) : (i += 1) {}
Is most definitely not "perfectly fine" if you're doing a lot of iteration. It's also not nice to read. I don't see what's so bad about OP's proposal. Zig aims to be conservative in the wrong ways sometimes. It has an abundance of features it doesn't shy from, but not basic iteration primitives that exist in nearly every other imperative language?
I really don't believe so many people here (especially C programmers) have not used for loops to count or do something a number of times without having a data structure for it.
@andrewrk regarding https://github.com/ziglang/zig/issues/358#issuecomment-491004876
var i: usize = a; while (i < b) : (i += 1) {}
A significant issue here is that the current while construct makes it easy to introduce subtle bugs due to the larger scope of referenced variables.
I think a good middle-ground could be to add an optional initializer-clause to while:
while (var i: usize = a; i < b) : (i += 1) {}
which is backwards-compatible syntax sugar for an additional scope/block, i.e.
{
var i: usize = a;
while (i < b) : (i += 1) {}
}
Doing this manually for nested loops is not something I think most people are going to do.
The point is to avoid polluting the outer scope, and to restrict the scope of i, thus avoiding subtle bugs. It's why C99 allowed declarations to go into the initializer clause of the for loop.
Adding braces could be allowed to put an arbitrary amount of statements in the initializer.
Of course, the above syntax kind of begs for the continue-expression to be moved into the while as well:
while (var i: usize = a; i < b; i += 1)
leaving us with a for-loop called while 😅 But I would be happy to see the initializer-clause only.
I know this proposal is closed already, but one thing to think about is this:
for(0..@as(u8, 255)) |index| {
...
}
is way more readable and less error prone than
var index: u8 = 0;
while(true) {
...
if(@addWithOverflow(u8, index, 1, &index))
break;
}
So my call would be: if done, the range needs to be inclusive to allow such loops to be implemented efficiently and without the requirement of @intCast or @truncate.
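The same inclusive loop can also be written with a plain comparison instead of an overflow check. A minimal sketch, with countAllBytes as a hypothetical name: an exclusive bound of 256 does not fit in a u8, so the loop tests for the maximum value before incrementing.

```zig
const std = @import("std");

/// Counts how many u8 values an inclusive 0 to 255 loop visits.
fn countAllBytes() usize {
    var visited: usize = 0;
    var index: u8 = 0;
    while (true) {
        visited += 1; // the loop body runs for index = 0, 1, ..., 255
        if (index == std.math.maxInt(u8)) break; // stop before index + 1 would overflow
        index += 1;
    }
    return visited;
}

test "an inclusive loop reaches 255 with a u8 counter" {
    try std.testing.expectEqual(@as(usize, 256), countAllBytes());
}
```

Either way the exit test has to sit after the body, which is what an inclusive range syntax would hide.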
FWIW you can always increment a variable with defer if you don't like : (i += 1)
while(...) {
defer i = i + 1;
}
FWIW you can always increment a variable with defer if you don't like : (i += 1)
while(...) { defer i = i + 1; }
This does not have the same semantics as the : (i += 1) continue expression, as the defer will be executed on breaking from the loop while the continue expression will not.
FWIW you can always increment a variable with defer if you don't like : (i += 1)
while(...) { defer i = i + 1; }
This does not have the same semantics as the : (i += 1) continue expression, as the defer will be executed on breaking from the loop while the continue expression will not.
Consider the following snippet:
const std = @import("std");
const warn = std.debug.warn;
pub fn main() void {
var i: u8 = 0;
while (i < 10) {
defer i += 1;
warn("{}\n", .{i});
}
}
The output is:
0
1
2
3
4
5
6
7
8
9
It's easy to overlook but each iteration of a loop has its own frame/scope. In golang you'd be right but defer applies to the immediate scope in zig.
@nodefish:
const std = @import("std");
const print = std.debug.print;
pub fn main() void {
print("loop 1:\n", .{});
var i: u8 = 0;
while (i < 10) {
defer i += 1;
print("{}\n", .{i});
if (i == 5) break;
}
print("{}\n", .{i});
print("loop 2:\n", .{});
i = 0;
while (i < 10) : (i += 1) {
print("{}\n", .{i});
if (i == 5) break;
}
print("{}\n", .{i});
}
loop 1:
0
1
2
3
4
5
6
loop 2:
0
1
2
3
4
5
5
@nodefish Consider the following snippet:
const std = @import("std");
const info = std.log.info;
pub fn main() void {
var i: u8 = 0;
while (i < 10) {
defer i += 1;
break;
}
info("{}", .{i});
}
It prints 1. Using the : (i += 1) continue expression would have resulted in printing 0. That's what Isaac was saying. Dude's been here for years, please don't be condescending.
@MasterQ32 See, I would have expected that to be an exclusive range, as that's how .. is currently used, and that's also the most sensible way to do it for for (so that we can choose to iterate 0 times). Also, is there a case in real code where you'd actually want to do this, and you're not iterating over an array/slice?
@cryptocode No good -- ; means sequence, and Zig's grammar is intentionally very simple, so if that were allowed at all both clauses would be executed every time.
The fundamental issue is: almost every single time you want to iterate over a range in real code, it's actually to index into a data structure. The sensible way to structure your code is to iterate over the structure directly, and the lack of ranged for is a subtle nudge in this direction. If structural for really doesn't work, while is still available, but the relative awkwardness means you don't use it unless it's the most sensible way. There are cases where structural for is awkward where it really shouldn't be, and #7257 exists to remedy that, but we shouldn't make it any more capable than we absolutely have to.
Really, the problem is that for is poorly named. It evokes ranged iteration. I'd be in favour of renaming it, but that's a separate issue.
@EleanorNB I agree with the range point of view, I'm just addressing the scoping issue. I don't see people adding additional {} scopes in complex/nested loops, leaving the door open to subtle bugs. I see how the ; sequence means a different syntax is needed, but that's orthogonal.
Where a and b can be chars, integers, anything that can define a range. This is also better syntax IMHO than: