Closed paoda closed 2 years ago
Changing the implementation of indexOfScalarPos
from:
pub fn indexOfScalarPos(comptime T: type, slice: []const T, start_index: usize, value: T) ?usize {
var i: usize = start_index;
while (i < slice.len) : (i += 1) {
if (slice[i] == value) return i;
}
return null;
}
to:
pub fn indexOfScalarPos(comptime T: type, slice: []const T, start_index: usize, value: T) ?usize {
var i: usize = start_index;
while (i < slice.len) : (i += 1) {
if (meta.eql(slice[i], value)) return i;
}
return null;
}
resolves this issue and the program in my original comment will then proceed to compile. However, I can't comment on the performance implications of such a change, or whether this would even be the right way of solving this.
I can't comment on the performance of that fix either; but whether the solution is that, or passing in a custom equality function (perhaps via a context like HashMap uses), I believe there are clear use cases for which ==
is insufficient.
Even basic algorithms like Dijkstra will require the association of a value to its priority; and as far as I'm aware, this requires the use of a struct as the priority queue type, something to the effect of:
pub const DijkstraNode {
node_val: T,
priority: u32
};
Namely because, there isn't a good way (that I'm aware of) to implement the required comparison function for a priority queue unless the value and the priority are contained within the same struct. There are obviously some current limitations to closures/lambda capturing in Zig, as discussed particularly near the bottom of #1048, so if you had a priority queue of T
values (even assuming T
supports ==
), I don't see how you could implement that comparison function without a struct like this, even if you maintained the mapping of values to priority in a separate structure.
In Dijkstra's specifically, it is of note that it wouldn't require this association; you could just insert duplicate elements; but I would think that there are still cases where this behavior could be problematic.
I think is a missuse of std.mem.indexOfScalar
the name suggest it shoud be used only on scalar values that are numerical (or so I understand), so really priorityQueue should use std.mem.indexOfPos
since that one uses internally std.mem.eql and would also fix it, think is, that std.mem.indexOfPos
need more than one value, so maybe a better proposal would be adding a function that takes 1 element and makes it an slice and pass it to indexOfPos?
Ran into this today while doing Djikstra too. I propose that the update function shouldn't do any searching in the queue for you. Instead, it should take in an index of the item to update, and then just update it with the new value. Users can write their own version of std::find
customized to whatever they want.
Here's a code sample of what I'm thinking of. This is code from AoC day 15 I was working on, when I wanted to implement addOrUpdate to the priority queue. Bear in mind that this is fantasy code so it might not actually be correct 😅
fn addOrUpdate(queue: *DjikstraPQ, item: QueueDatum) !void {
var foundIdx: usize = 0;
// std::find
var it = queue.iterator();
while (it.next()) |elt|: (foundIdx += 1) {
// Users should change this to implement their own find function
if (elt.pos == item.pos) break;
}
if (found != queue.count()) {
queue.update(foundIdx, item) catch unreachable; // in this case, update takes an index instead
} else {
try queue.add(item);
}
}
As @Chris3606 mentions, adding duplicate elements into the queue is definitely a workaround, but is not a good long term solution.
I took a look at what a Rust crate does, looks like they take the element in but they also have a requirement that the elements inserted must be hashable, and there is an internal hash table that keeps track of which element is in which index. I don't think that is a good solution in general, but is interesting to note.
Closed in https://github.com/ziglang/zig/pull/10342?
Closed in #10342?
I don't believe so. The update
function in that implementation is still using indexOfScalar
: see here, and that function uses ==
which is what was causing the original issue.
Additionally, I wanted to add a bit to the discussion on the API from above: I agree with @thomastay insofar as an update
function that does a linear search doesn't necessarily support all use cases very well. In fact, in an optimized Dijkstra or particularly an A* implementation, for example, although I haven't formally tested it, I believe that even if the update
function did work when T
is a struct, I'd still prefer creating duplicate elements over the current implementation of update
for the sake of performance.
Particularly since in theory (if the API allowed a user to cache nodes themselves and the data types were cached in a way accessible in constant time) it should be possible to have a constant-time lookup of the node then simply the O(log(n))
update operation, I think there would be significant performance benefits in some cases to being able to use an O(log(n))
update
function over adding duplicate nodes, since it would be the same operation but without the additional memory usage and items to process in the queue.
I have tried my Aoc 2021 solutions with #12154: day15a: https://nest.pijul.com/voroskoi/aoc2021-zig:main/37QWQTZRMBJPE.EMAAA day15b: https://nest.pijul.com/voroskoi/aoc2021-zig:main/QBPUYYX77BA2W.VIAAA
In ReleaseFast first part 1100 us -> 2200 us, the second part 33.000us -> 105.000 us.
PriorityQueue use a binary heap under the hood, so we can not use std.sort.binarySearch()
for finding elements. I have no better idea, than the rust solution mentioned before: using a HashMap for storing the array positions to make lookup faster. On the other hand the HashMap just creates overhead when You do not use update()
...
Zig Version
0.9.0-dev.1275+ac52e0056
Steps to Reproduce
Given the
main.zig
described below:run
zig build
in order to reproduce the bug.Expected Behavior
I expected the zig program to compile.
Actual Behavior
I received the following from
stderr
:Which suggests to me that either this is an oversight or it should be made clear that there is an equality operator constraint on std.PriorityQueue.update or std.PriorityQueue as a whole.