ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
35.19k stars 2.57k forks source link

Zig's static analysis for undefined memory access doesn't error where C++'s tooling does error #18440

Open Jarred-Sumner opened 11 months ago

Jarred-Sumner commented 11 months ago

Zig Version

0.12.0-dev.1830+779b8e259

Steps to Reproduce and Observed Behavior

The following Zig compiles successfully:

zig build-lib ./Hello.zig
const Garbagey = struct {
    garbage: u32 = undefined,
};

export fn use_garbage() bool {
    var garb = Garbagey{};

    if (@import("std").os.getenv("USE_GARBAGE") != null) {
        garb.garbage = 1;
    }

    if (garb.garbage == 0) {
        return true;
    }

    return false;
}

The following C++ also compiles successfully:

clang++ -c Hello.cpp
#include <cstdint>
#include <cstdlib>

class Garbagey {
public:
  uint32_t garbage;
};

extern bool use_garbage() {
  Garbagey garb;

  if (getenv("USE_GARBAGE")) {
    garb.garbage = 1;
  }

  if (garb.garbage == 0) {
    return false;
  }

  return true;
}

However, when run through clang-tidy, it produces the following warnings:

clang-tidy Hello.cpp --checks=clang-analyzer-*
Hello.cpp:16:20: warning: The left operand of '==' is a garbage value [clang-analyzer-core.UndefinedBinaryOperatorResult]
  if (garb.garbage == 0) {
      ~~~~~~~~~~~~ ^
Hello.cpp:12:7: note: Assuming the environment variable does not exist
  if (getenv("USE_GARBAGE")) {
      ^~~~~~~~~~~~~~~~~~~~~
Hello.cpp:12:3: note: Taking false branch
  if (getenv("USE_GARBAGE")) {
  ^
Hello.cpp:16:20: note: The left operand of '==' is a garbage value
  if (garb.garbage == 0) {

If we change the code to instead have no undefined memory accesses:

extern bool use_garbage() {
  Garbagey garb;

  if (getenv("USE_GARBAGE")) {
    garb.garbage = 1;
  } else {
    garb.garbage = 0;
  }

  if (garb.garbage == 0) {
    return false;
  }

  return true;
}

clang-tidy detects that undefined memory is unreachable and no longer warns.

Expected Behavior

Zig should have builtin safety checks that are at least as comprehensive as what C++ tooling provides, if not more.

xdBronch commented 11 months ago

although obviously not a solution to this problem, its vaguely related. i think undefined should not be allowed as a default value for fields, its a terrible terrible pattern

nektro commented 11 months ago

it would still be a valid issue if the field init wasnt there and the var init was var garb: Garbagey = undefined;

rohlem commented 11 months ago

i think undefined should not be allowed as a default value for fields, its a terrible terrible pattern

@xdBronch Why do you think so? IMO undefined perfectly signals that the field's value won't be used. If it's f.e. an internal field only used after being set by a method, it can make perfect sense not to burden every instantiation site with setting it to undefined. As another example, std.BoundedArray defaults its length to 0 and its elements to undefined, because that is the most correct formulation to make .{} coerce to its empty state. (Its elements logically don't exist, therefore their values shouldn't be read from as if they did exist.)

Setting a non-undefined default value in these situations is strictly worse, because it incorrectly signals that the field's value would be used when in fact it won't (meaning we use the opportunity for both safety and optimizations).

xdBronch commented 11 months ago

i just feel that it goes against some of the ideas that zig tries to push. the existence of undefined makes it so that youre less likely to have undefined memory problems because it always has to be explicitly done e.g. var foo: u32 = undefined;, just you typing this out makes it 10x safer than the C equivalent. when a field defaults to undefined its hidden from the user for imo very little reason which is more likely to cause the problems this issue aims to solve. BoundedArray is a fair enough use case but even then i think it would be better fit by making it part of an init function

matu3ba commented 11 months ago

Sounds like Usage of Uninitialized Memory (UUM) static analysis checker you are asking for as being part of point 6 in https://github.com/ziglang/zig/issues/2301#issuecomment-804496642.

Related could be moving AstGen to std.zig, which is pending "rework comptime mutation" and would be the first step to play with doing it out of tree with AstGen, but this would not be able to capture comptime semantics.

matu3ba commented 11 months ago

i just feel that it goes against some of the ideas that zig tries to push

Zig enables to write correct programs, which includes the absence of any value, if there would be no valid initial value. Default initialization would incur always an extra cost. Further more, default initialization can hide logic bugs. So you could have an incorrect program, which may be safer.