ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
34.88k stars 2.55k forks source link

Zig's null pointer safety checks are less safe than C++ (`-fsanitize=null`) #21495

Open Jarred-Sumner opened 1 month ago

Jarred-Sumner commented 1 month ago

Zig Version

0.13.0

Steps to Reproduce and Observed Behavior

The following zig code which passes a null pointer to a non-nullable pointer does not panic in debug builds:

const Something = opaque {
    pub fn thing(this: *Something) *Something {
        return this;
    }
};

export fn something_ptr() ?*anyopaque {
    return null;
}

const SomethingPtr = *const fn () callconv(.C) *Something;

test "get something" {
    const something = @extern(SomethingPtr, .{ .name = "something_ptr" });

    var some: *Something = something();

    const std = @import("std");
    std.debug.print("*p = {}\n", .{some.thing()});
}

The following C++ code passes a null pointer to a member function that returns a reference (non-nullable pointer):

#include <iostream>

class Something {
public:
  Something &thing() { return *this; }
};

extern "C" Something *something_ptr() { return nullptr; }

typedef Something *(*SomethingPtr)();

int main() {
  SomethingPtr something = something_ptr;

  Something *some = something();

  std::cout << "*p = " << &some->thing() << std::endl;
  return 0;
}

Running the C++ code produces this (3 runtime errors):

$ clang++ -fsanitize=null repro.cpp
$ ./a.out
repro.cpp:17:34: runtime error: member call on null pointer of type 'Something'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior repro.cpp:17:34 
repro.cpp:5:14: runtime error: member call on null pointer of type 'Something *'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior repro.cpp:5:14 
repro.cpp:5:31: runtime error: reference binding to null pointer of type 'Something'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior repro.cpp:5:31 
*p = 0x0

Running the Zig code produces this (0 runtime errors, test passed):

❯ zig test /Users/jarred/Code/bun/repro.zig
*p = repro.Something@0
All 1 tests passed.

Expected Behavior

Zig's safety checks should be better than C++'s.

190n commented 1 month ago

There are at least two places Zig could add a safety check that would have caught this:

IMO both these checks should be added: the first one because it catches issues like this as early as possible, and the second one because it'll never be possible to 100% prevent the creation of pointers which aren't supposed to be null but actually are -- so it's still valuable in safe builds to check whether one exists.

The other question for me is: should it even be possible to have non-optional non-allowzero pointers in extern return values or export function parameters? It may be valuable to require an explicit unwrap in cases where you can't actually prove that external code will never give you a null value.

mlugg commented 1 month ago

Note that if you write C[++] which is more analagous to the Zig code, you can bypass UBSan. However, the code does look significantly weirder, because you need Clang attributes to match the expressiveness of Zig's type system. At https://godbolt.org/z/escTqKjs4 I have a similar example in C which triggers a segfault with -fsanitize=undefined. It uses a few Clang attributes:

That's not to say we don't want a safety check here, but the examples in the original post aren't really comparable. You're using Zig's type system to express constraints which C[++] doesn't have a way to specify, and hence will not assume.

xdBronch commented 1 month ago

There are at least two places Zig could add a safety check that would have caught this:

how would you bypass/disable these checks with @setRuntimeSafety? being able to do this is important imo

190n commented 1 month ago

how would you bypass/disable these checks with @setRuntimeSafety?

For the first case I proposed, extern calls in a block with safety disabled could disable the check on the return value. For the second case, functions where the function scope has safety disabled could skip checks on incoming arguments.

being able to do this is important imo

Why? IMO if you expect to have pointers that are zero, you should either make them optional to indicate that null is a special value that needs to be handled separately, or make them allowzero to indicate that null should be treated as a valid pointer. I don't see what the legitimate use case is for creating or passing around pointers that shouldn't be null according to the type system, but actually are null. Some safety checks already try to prevent that behavior; for example @as(*T, @ptrFromInt(0)) fails in safe builds but it works with ?*T or *allowzero T. This is also why coercing **T to *?*T isn't allowed, because it could be used to write a null pointer into a place where Zig believes there shouldn't be a null pointer.

xdBronch commented 1 month ago

hmm, extern cant be used in blocks so youd need to do some messy stuff with a struct inside the block declaring the extern, maybe @extern could be used there i considered what you said for the latter but that just seems a little odd, if the safety check is checking the parameters my mental model is that it happens before the function in a sense

IMO if you expect to have pointers that are zero, you should either make them optional to indicate that null is a special value that needs to be handled separately, or make them allowzero to indicate that null should be treated as a valid pointer

no thats exactly the point, theyre not null which is exactly what im telling the compiler by using a non-null pointer. the problem is enforcing a safety check for it

190n commented 1 month ago

hmm, extern cant be used in blocks so youd need to do some messy stuff with a struct inside the block declaring the extern, maybe @extern could be used there

What I meant is if you call an extern function in a block with safety disabled, it doesn't check the arguments you pass to the extern function.

theyre not null which is exactly what im telling the compiler by using a non-null pointer. the problem is enforcing a safety check for it

Why is it a problem to enforce that something you don't think should be null is actually never null? The safety will ofc be turned off in unsafe builds, and it could probably even be optimized out a lot of the time in ReleaseSafe if the compiler can prove that the value is really never null (especially for zig<=>zig calls).

xdBronch commented 1 month ago

What I meant is if you call an extern function in a block with safety disabled

oh duh, oops

Why is it a problem to enforce that something you don't think should be null is actually never null

i just think its worthwhile to be able to let the user control this, are there any other safety checks that cant be disabled? to be clear i dont necessarily have a problem with the safety check itself, although it seems a tad strange, i just want to make sure it can be disabled

Jarred-Sumner commented 1 month ago

The original post aren't really comparable. You're using Zig's type system to express constraints which C[++] doesn't have a way to specify, and hence will not assume.

From the C++ 20 standard §9.3.2:

A reference shall be initialized to refer to a valid object or function. [Note: In particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by indirection through a null pointer, which causes undefined behavior.

I'm having a hard time seeing how the results are not comparable

xdBronch commented 1 month ago

in the zig code you're tricking the type system by casting a function returning a nullable pointer to one returning a non-null pointer but in the C++ code all the pointers are allowed to be null, you're just coercing it to a reference. these definitely don't feel equivalent. it seems like the c++ code is closer to @as(*Something, @ptrCast(an_optional_something)) which zig does catch (or maybe more accurately an_optional_something.?)