Inline slices - Githubissues

Jarred-Sumner commented 1 month ago

In Zig, the following code is a use-after-stack:

const Wrapper = struct {
  single_value: u32,
  pub fn slice(self: @This()) []const u32 {
    return &.{self.single_value};
  }
};

pub fn doThing(wrapper: Wrapper) []const u32 {
  return wrapper.slice();
}

This is extremely easy to do.

The easiest way to fix it is something like this:

const Wrapper = struct {
  single_value: u32,
  pub fn slice(self: @This(), allocator: std.mem.Allocator) ![]u32 {
    const result = try allocator.alloc(u32, 1);
    result[0] = self.single_value;
    return result;
  }
};

pub fn doThing(wrapper: Wrapper, allocator: std.mem.Allocator) ![]const u32 {
  // allocate a new slice
  return try wrapper.slice(allocator);
}

But, now we're doing a heap allocation for a single 4 byte value that occupies less space than a slice (16 bytes on x64). This is not a good use of memory.

The 2nd-easiest way to fix it is something like this:

const Wrapper = struct {
  single_value: u32,
  pub fn slice(self: @This(), buf: []u32) usize {
    buf[0] = self.single_value;
    return 1;
  }
};

pub fn doThing(wrapper: Wrapper, buf: []u32) usize {
  // return how many were written to the slice
  return wrapper.slice(buf);
}

We can make it the caller's job to own the memory. And often times, we should do it that way.

However, sometimes we can't or shouldn't. Sometimes, passing fixed-length buffers ahead of time makes the API significantly more difficult to reason about.

Instead, what if we could do it that first way, but without a heap allocation?

const Wrapper = struct {
  single_value: u32,
  pub fn slice(self: @This()) []inline u32 {
    // return a slice with the contents passed by value to the caller 
    // if it fits in the amount of stack space allocated for the slice
    return &.{self.single_value};
  }
};

The compiler knows this is a slice of a single value. The slice itself occupies 16 bytes. Only 4 of those bytes are truly in use.

Slices do not have a defined memory layout (they're not extern or packed).

Therefore, the unused bytes of slices could be used for other things.

You could store at least 3 u32s in this inline slice, and have space for a length field to answer how many values, or if the []inline T is represented as a []T (a non-inline slice)

Then, there's no risk of use-after-stack (since it would be copied by value), and the code is as simple to reason about as a single value.

Why not just an array?

Arrays don't store a length. You can make up a convention for each usage, but slices in Zig are nice.

What about using slices and inline slices in the same code?

The []inline T syntax could also be used to point to a regular []T slice. The compiler can decide how to represent the slice in a way to disambiguate which kind it is.

InKryption commented 1 month ago

This seems like a complication of the language that offers little in comparison to simply using std.BoundedArray. It also doesn't actually address the safety concern of UAF, since it is an opt-in feature that would require the programmer to know they're returning a stack pointer in the first place, which they wouldn't if it's by mistake.

mlugg commented 1 month ago

This is a wholly unreasonable language complication over just using std.BoundedArray. As InKryption points out, it fails to actually address the UAF concern. In fact, it makes the problem much worse: syntax which is today clearly incorrect (return &.{runtime_val};) becomes ambiguous under this proposal, depending on type information, and hence masking UAF bugs. This proposal as written also has some obvious holes:

What is @sizeOf([]inline T)? Equivalently, how do use sites know how much stack memory to allocate for them?
Are they compatible with other pointer attributes? Why or why not?
Why overload the inline keyword here?
Why do these use "slice-esque" syntax, despite being arrays?
You haven't explained the semantics of []inline T in any way other than that they can be returned from a function. Can you coerce between them and normal slices? Do they support slicing syntax, and if so, what does it return? etc.

This proposal seems pretty clearly DOA.

silversquirl commented 1 month ago

There's an easier way to do this that requires no changes to the language:

pub fn slice(self: *const @This()) *const [1]u32 {
    return &self.single_value;
}

Single-pointers can be coerced to single-pointers to 1-element arrays, which of course can then be coerced to slices wherever you need them. If you want, you can also obviously use @as to keep the []const u8 return type.

castholm commented 1 month ago

You also have the option of making the function that returns the slice an inline fn:

const std = @import("std");

pub const Wrapper = struct {
    value1: u32,
    value2: u32,
    pub inline fn slice(self: @This()) []const u32 {
        return &.{ self.value1, self.value2 };
    }
};

pub fn main() void {
    const w: Wrapper = .{ .value1 = 1, .value2 = 2 };
    const slice = w.slice();
    std.debug.print("{any}\n", .{slice});
}

{ 1, 2 }

Edit: This appears to work currently, but it might not be intended to be legal, see @InKryption's comment below.

N00byEdge commented 1 month ago

I don't think this is needed nor desired, we can do the following:

pub const Wrapper = extern union {
    as_values: extern struct {
      value1: u32,
      value2: u32,
    },
    as_array: [2]u32,
    pub inline fn slice(self: *const @This()) []const u32 {
        return &self.as_array;
    }
};

or even

pub const Wrapper = extern struct {
    value1: u32,
    value2: u32,
    pub inline fn slice(self: *const @This()) []const u32 {
        return std.mem.bytesAsSlice(u32, std.mem.asBytes(self));
    }
};

InKryption commented 1 month ago

@castholm

You also have the option of making the function that returns the slice an inline fn:

Zig is lexically scoped; the fact that an inline function will share a stack frame with its caller doesn't mean returning a pointer to its locals is or should be valid.

ziglang / zig

Inline slices #21399

Why not just an array?

What about using slices and inline slices in the same code?