getty-zig / getty

A (de)serialization framework for Zig
https://getty.so
MIT License

Properly slice sentinel-terminated strings #99

Closed ibokuri closed 1 year ago

ibokuri commented 1 year ago

Problem

Sentinel-terminated buffers are often preallocated with extra memory to minimize allocations. For example, a buffer may have space for 10 elements while only the first 3 are actually filled (the 4th element is the sentinel and the rest are undefined).

Currently, when such a buffer is serialized as a string (e.g., a [:0]u8), it is passed in its entirety to the serializer via serializer.serializeString. As a result, serializers have to handle these preallocated strings explicitly, which is annoying to do. It also makes no sense to serialize past a sentinel character, so serializers shouldn't have to deal with any of this in the first place.
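To make the mismatch concrete, here is a minimal sketch that is independent of Getty. It assumes a stack buffer (the name storage is only for illustration) standing in for the preallocated memory, and uses std.mem.sliceTo to show where the logical string ends:

```zig
const std = @import("std");

test "a preallocated sentinel-terminated buffer is longer than its contents" {
    // 10 usable bytes plus one byte for the trailing sentinel, all zeroed.
    var storage = [_]u8{0} ** 11;
    storage[0] = 'a';
    storage[1] = 'b';
    storage[2] = 'c';

    // View the first 10 bytes as a sentinel-terminated slice.
    const buf: [:0]u8 = storage[0..10 :0];

    // The slice reports its full capacity...
    try std.testing.expectEqual(@as(usize, 10), buf.len);
    // ...while the string a reader would expect ends at the first sentinel.
    try std.testing.expectEqualStrings("abc", std.mem.sliceTo(buf, 0));
}
```

A serializer that receives buf as-is sees all 10 bytes, including the 0 at index 3 and everything after it.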

Proposal

In the default string SB (serialization block), we should be able to add a call to std.mem.indexOfSentinel to find the sentinel position, if one exists, for incoming strings. Then we simply slice the string up to (but not including) the sentinel and pass it along to serializer.serializeString as usual.

Non-sentinel-terminated strings will just be processed like they are now.
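The slicing step could look roughly like the sketch below. This is not Getty's actual SB code: untilSentinel is a hypothetical helper, and it assumes the incoming value has already been identified as a [:0]const u8. Only std.mem.indexOfSentinel is taken from the proposal.

```zig
const std = @import("std");

/// Hypothetical helper: returns the part of `value` before its first 0 byte.
/// A real SB would hand the result to serializer.serializeString.
fn untilSentinel(value: [:0]const u8) []const u8 {
    // indexOfSentinel scans from the start and stops at the first sentinel,
    // which is at index value.len at the latest.
    const len = std.mem.indexOfSentinel(u8, 0, value.ptr);
    return value[0..len];
}

test "slice stops at the first sentinel" {
    // 10-byte buffer (plus terminator) holding only "abc".
    var storage = [_]u8{0} ** 11;
    storage[0] = 'a';
    storage[1] = 'b';
    storage[2] = 'c';
    const value: [:0]const u8 = storage[0..10 :0];

    try std.testing.expectEqualStrings("abc", untilSentinel(value));
}
```

If the buffer is completely filled, the first sentinel is the terminator itself, so the helper returns the whole string unchanged.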

Additional Context

Thanks to @Namek for pointing out this issue and for providing a repro!

const std = @import("std");
const json = @import("json");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    const allocator = gpa.allocator();
    defer _ = gpa.deinit();

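    // 10-byte sentinel-terminated buffer that holds only "abc".
    const value = allocCString(allocator, "abc", 10);
    defer allocator.free(value);

    // All 10 bytes are handed to the serializer, not just "abc".
    const slice = try json.toSlice(allocator, value); // error.Syntax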
    defer allocator.free(slice);

    std.debug.print("val: {any}\n", .{value});
    std.debug.print("str: {s}\n", .{slice});
}

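// Allocates a sentinel-terminated buffer of bufSize bytes and writes text,
// followed by a 0 terminator, at its start.
fn allocCString(allocator: std.mem.Allocator, comptime text: []const u8, bufSize: u32) [:0]u8 {
    const buf = allocator.allocSentinel(u8, bufSize, 0) catch unreachable;
    _ = std.fmt.bufPrintZ(buf, text, .{}) catch unreachable;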

    return buf;
}