ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.64k stars 2.46k forks source link

Rework `std.sort` #20830

Closed VisenDev closed 3 weeks ago

VisenDev commented 1 month ago

I propose reworking std.sort to use an interface in a similiar way to how the Reader/Writer interface was reworked.

Proposed Changes

To complete this proposal, an interface type similar to the following would be added to std.sort

pub const Sortable = struct {
   ptr: *anyopaque,
   vtable: *const VTable,

   pub const VTable = struct {
      lessThan: *const fn(ctx: *const anyopaque, a: usize, b: usize) bool,
      len: *const fn(ctx: *const anyopaque) usize,
      swap: *const fn(ctx: *anyopaque, a: usize, b: usize) void,
   };

   pub fn lessThan(self: *@This(), a: usize, b: usize) bool {
      return self.vtable.lessThan(self.ptr, a, b);
   }

   pub fn greatThanOrEqual(self: *@This(), a: usize, b: usize) bool {
      return !self.lessThan(a, b);
   }

   pub fn len(self: *const @This()) usize {
      return self.vtable.len(self.ptr);
   }

   pub fn swap(self: *@This(), a: usize, b: usize) void {
      return self.vtable.swap(self.ptr, a, b);
   }

   pub fn isSorted(self: *const @This()) bool {
      var i: usize = 1;
      const len = self.len();
      while (i < len) : (i += 1) {
         if (self.lessThan(i, i - 1)) {
            return false;
         }
      }

      return true;
    }

   // ... and so on
};

Sorting functions now take a std.sort.Sortable as a parameter and perform the sorting operations that way. That greatly simplifies the usage of these functions.

For example, the heap sort function signature goes from this

pub fn heap( comptime T: type, items: []T, context: anytype, comptime lessThanFn: fn (@TypeOf(context), lhs: T, rhs: T) bool, ) void

to this

 pub fn heap(sortable: Sortable) void

The stdlib will also be able to provide several implementations of std.sort.Sortable for common use cases, for example

pub fn ascendingString(allocator: std.mem.Allocator, strings: [][]const u8) !std.sort.Sortable {
    var result = try allocator.create(std.sort.Sortable);
    result.vtable = try allocator.create(std.sort.Sortable.VTable);

    result.ptr = @ptrCast(strings.ptr);
    result.vtable.len = &(struct {
         pub fn len(ptr: *anyopaque) usize {
             const casted:[][]const u8 = @ptrCast(ptr);
             return ptr.len;
        }
    }.len);

    //and so on

    return result;
}

Motivation

The motivation for this issue was a general dislike of the usage of anytype in the standard lib as a replacement for what really should be an interface. This is not user friendly and makes it difficult to understand what a function's parameters need to be just from the function's type signature.

For example, consider this function in std.sort

pub fn heapContext(a: usize, b: usize, context: anytype) void {
    assert(a <= b);
    // build the heap in linear time.
    var i = a + (b - a) / 2;
    while (i > a) {
        i -= 1;
        siftDown(a, i, b, context);
    }

    // pop maximal elements from the heap.
    i = b;
    while (i > a) {
        i -= 1;
        context.swap(a, i);
        siftDown(a, a, i, context);
    }
}

Without looking at the source code directly, its not really possible for the user to know that the context parameter needs to be a struct with the swap method defined.

However, if this function took a std.sort.Sortable, the type signature would be well defined and easy to understand.

=====================

Note: The exact details of the interface are up for change, this is just the basics for one possible implementation

VisenDev commented 1 month ago

Other advantages:

matklad commented 1 month ago

The big difference here is that for allocators&IO, virtualized functions are very cold, and the cost of indirect call (or more specifically, the cost of missing inline), is genreally dominated by syscall (in the IO case) or the work to, eg, initialize the allocated memory (for allocators).

For sorting, comparision operator is the opposite, it is very hot.

Without running any benchmarks, my baseline hypothesis is that this'll prohibitively reduce performance for some workloads.

Another way to look at it is that read, write, and alloc are "vector" operations, they operate on many elements at a time, and a virtual call in a "control plane" is essentially free (in fact, it can be negative-cost, due to code size savings).

In contrast, for sorting what we want to virtualize are "scalar", "data plane" operators, which are called once for every element of the input, rather than just once for an entire batch of inputs.

TheHonestHare commented 4 weeks ago

The motivation for this issue was a general dislike of the usage of anytype in the standard lib as a replacement for what really should be an interface. This is not user friendly and makes it difficult to understand what a function's parameters need to be just from the function's type signature.

I agree that the status quo way of implementing interfaces via anytype, and relying on devs abilities to write good doc comments isn't exactly the best. A better solution is needed; however that's a separate proposal. But for starters the various sort implementations would benefit from having a verifyContext function, akin to std.HashMaps function of the same name, called at the top of their bodies. This would serve as good documentation for what the context params actually expect.

However, moving the interface to a runtime interface is not a good idea, for several reasons:

  1. Function pointers inhibit several optimizations the compiler can make, such as aggressive inlining
  2. Sort functions are often measured by how many comparisons and swaps they must do, and these numbers grow big. Needing to call both these operations by function pointer I imagine would be unacceptable for performance
  3. Implementing your own Sortable would be slightly harder and less type safe, as @ptrCast is needed to cast the context type as opposed to just relying on the type system and compile errors. Passing in a custom lessThanFn is a common use case, it should not be made more verbose to do this

The implementation of this interface would also likely reduce the amount of comptime function instantiation - which should help reduce both compile times and binary size bloat

If this were to ever actually be implemented, I would desperately want there to be benchmarks and measurements made for code slowdown and bloat decrease respectively. I highly doubt that std.sort is a major source of bloat in projects.

Standard library data structures like Arraylist could expose functions that automatically derive std.sort.Sortable from themselves

Standard library could similarly expose a .sortContext method that returns a struct that satisfies std.sort.verifyContext. Unrelated, but ArrayList can already trivially be used with the heap helper function:

const T = std.meta.Child(list.items);
heap(T, list.items, {}, std.sort.asc(T));
andrewrk commented 3 weeks ago

Sorting should be compile-time functions for sure.