Closed notcancername closed 10 months ago
When
array_list.items.len
andarray_list.capacity
arestd.math.maxInt(usize)
,
note that this being true means the ArrayList has a length of 16 exabytes
edit: oh you're claiming this on 32bit which would be 4 gigabytes instead
I do not see why this is an issue. The code there is using non-wrapping addition, asserting that this will never overflow in valid API usage. As such, adding one to the ArrayList in this case where len == capacity == maxInt(usize)
is expected to be undefined behavior. A comment is all that is needed to indicate this fact as the code is already checking for this case with an assertion implicitly and defining the valid usage of the function.
Of course this was some more common case it'd make sense to change this to something stronger than an assertion, but I'd argue the code is fine as-is with asserting this edge case will not occur. Zig is mostly going to be used on 64 bit systems so this is a literal non-issue in such places, but even on 32 bit system it is not either. It is essentially impossible to allocate an ArrayList with 2^32 bytes of capacity as even the information to hold the ArrayList's capacity and slice to the data will already cause there to be not enough memory to hold a slice that large in the same address space. Adding code to check for essentially an impossible condition that can never be hit is just a waste of performance on something people will call often and binary size.
Zig is not meant to be a safe language across all cases (which is why UB exists to begin with in Zig to boost performance by not covering all cases in a defined way). Handling essentially pedantic cases like this which are impossible to hit in reality is not useful. The CVE you linked has to do more with things like signed int
or short
in C with much smaller and relatively achievable wraparound cases, but that is not the case with a usize
in the context of an allocation.
I do not see why this is an issue.
That says more about you than it says about this issue.
As such, adding one to the ArrayList in this case where len == capacity == maxInt(usize) is expected to be undefined behavior.
Causing illegal behavior when used as per the interface contract is unacceptable. No sane programmer would see
fn appendToken(array_list: *std.ArrayList(u8), byte: u8) !?[]u8 {
if(byte == ' ') return array_list.items;
try array_list.append(byte);
return null;
}
and think, "My goodness! Such illegal usage of std.ArrayList!".
Zig is mostly going to be used on 64 bit systems
It is an explicit goal of Zig to improve upon C. That also means being portable and not artificially limiting the reusability of code (also one of the core zig design goals). One can make the very same argument for OOM handling: "Zig is mostly going to be used on modern operating systems where there is enough RAM available and overcommit will allow a process to overallocate and be killed when attempting to use it". You are ignoring use cases that are not your own. There are projects using Zig in embedded applications where 32-bit systems are widespread. 16-bit support is also planned.
so this is a literal non-issue in such places,
In your particular use case. Just as OOM isn't an issue in it, either.
but even on 32 bit system it is not either. It is essentially impossible to allocate an ArrayList with 2^32 bytes of capacity as even the information to hold the ArrayList's capacity and slice to the data will already cause there to be not enough memory to hold a slice that large in the same address space. Adding code to check for essentially an impossible condition that can never be hit is just a waste of performance on something people will call often and binary size.
Once again, you are misled by your own use case, and even by my own oversimplified example. I assumed nobody would carelessly downplay the significance of an integer overflow in this case, but I'm sorry to say that I was wrong. Consider a more contrived, yet realistic use case:
// works with either 32-bit or 64-bit usize
fn readMessageChunk(wip_message: *std.ArrayList(u8), reader: anytype) !void {
const len = try reader.readInt(usize, .little);
try wip_message.ensureUnusedCapacity(len);
// ... read message
}
Again, a programmer could easily be misled into thinking this code were safe, when in fact, it isn't, because ensureUnusedCapacity
can cause an overflow, illegal behavior and worse, a security vulnerability. Assume wip_message.items.len
is 1
, perhaps because of a previous 1-length chunk, len
is std.math.maxInt(usize)
and wip_message.capacity
is 8
. The attacker has now triggered a heap-based, or in the worst case (FixedBufferAllocator
), stack-based buffer overflow from code that looks entirely safe. Nowhere is this indicated in the API contract of ensureUnusedCapacity
. To prevent such mistakes, the PR clearly and obviously declares all sources of illegal behavior in boldface and eradicates this particular issue altogether. This has been long overdue for such a core API as ArrayList
.
Zig is not meant to be a safe language across all cases (which is why UB exists to begin with in Zig to boost performance by not covering all cases in a defined way).
I agree, but in any case, the intent to be unsafe must always be obvious to the programmer. Communicate intent precisely. This is not the case here.
Handling essentially pedantic cases like this which are impossible to hit in reality is not useful.
I have demonstrated how such a case can easily be hit using the safe-looking code above.
The CVE you linked has to do more with things like signed int or short in C with much smaller and relatively achievable wraparound cases, but that is not the case with a usize in the context of an allocation.
You didn't even bother to look up any information on the CVE I linked. I wish I could say I expected better from you, but that would be a lie. This CVE doesn't exploit a 16-bit wraparound, or a 32-bit one, but a 64-bit one, as the researchers who discovered it explain in their talk. Congratulations, I have never seen anyone shoot themselves in the foot, pun fully intended, this hard.
Edit: altered it to be slightly less vitriolic.
You did not read my comment. You cannot allocate a ArrayList
with a size of maxInt(usize)
on any reasonable platform, it is physically impossible, so this case will never be hit, end of story.
As for strawmanning people by saying "No sane programmer would see and think, "My goodness! ... Such illegal usage of std.ArrayList!".", yes that is exactly what I'd expect them to think. Of course your example is deceptive, it looks like normal code and for every practical use case it is fine, but if you are talking about expecting the said ArrayList
to grow to the size of maxInt(usize)
, then yes every sane programmer I know would think that is illegal usage of the API, given the API implicitly forbids appending an item to an allocation that large by the assertion in the plus operator.
The CVE in question is also not relevant here as it has nothing to with the fact that to hit the case mentioned in the issue requires the pre-existing condition of an allocation of maxInt(usize)
, which again is a practically impossible case to begin with. I never said overflow bugs cannot happen with 64 bit numbers, but you are ignoring the context of preconditions to get to this state to begin with (and indeed, the comment about it not being a concern on 64 bit systems is because no one will ever have 16 EiB of RAM in their computer, thus triggering an OOM condition way before it gets to this point).
If you can prove this branch is ever hit on any operating system in existence, I will change my mind, otherwise you are adding dead code to a commonly called function in the standard library for an entirely theoretical case which is already implicitly disallowed as undefined behavior by the assertion of the non-wrapping addition operator usage.
Also I'd recommend you refresh yourself on what usize
represents. It does not matter if you are overcomitting memory and the OS is paging memory to disk or if you have more physical ram than what can be addressed. You are talking about trying to address more than is possible to address on a given platform, you cannot have an allocation this large so it is irrelevant how big the physical memory backing things is, so this case will never be hit. At the very edge case, the pointer is at address 0 (not possible fwiw because Zig's ArrayList
does not use allowzero
so you'd probably actually be limited to using address 1 in practice), and the last item of the array can be at address maxInt(usize)
, so the largest possible size the ArrayList
can hold is maxInt(usize)
and going beyond that is undefined behavior in the API right now.
It makes sense of course to say that should be an OOM condition in theory, but again the precondition that this can never be hit in the first place means it is fine to leave it as UB that does not affect runtime performance as not a single soul on Earth will ever benefit from this change. At most, it just needs a comment to say addOne
is UB to call if the ArrayList
is size maxInt(usize)
just to highlight that this is the assertion the code is making currently (as it may be a bit non-obvious), thus making sure if the caller truly cares about this impossibly exceptional case that they handle it themselves without bogging down everyone else's code with a useless few instructions.
You cannot allocate a ArrayList with a size of
maxInt(usize)
on any reasonable platform, it is physically impossible, so this case will never be hit, end of story.
It doesn't matter. You don't need the ArrayList
to be that large to exploit the overflow, which I clearly state in my post, most of which you straight up ignored.
Assume
wip_message.items.len
is1
, perhaps because of a previous1
-length chunk,len
isstd.math.maxInt(usize)
andwip_message.capacity
is8
.You did not read my comment.
More and more, I'm getting the impression that you're projecting here.
I never said overflow bugs cannot happen with 64 bit numbers [...]
Actually, that's exactly what you said:
The CVE you linked has to do more with things like signed int or short in C with much smaller and relatively achievable wraparound cases, but that is not the case with a usize in the context of an allocation.
(emphasis mine).
If you can prove this branch is ever hit on any operating system in existence, I will change my mind [...]
Gladly.
const std = @import("std");
fn readMessageChunk(wip_message: *std.ArrayList(u8), reader: std.io.AnyReader) !bool {
const len = try reader.readInt(usize, .little);
if(len == 0) return false;
try reader.readNoEof(try wip_message.addManyAsSlice(len));
return true;
}
pub fn main() !void {
var ally_s = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = ally_s.deinit();
const ally = ally_s.allocator();
var message = std.ArrayList(u8).init(ally);
defer message.deinit();
while(try readMessageChunk(&message, std.io.getStdIn().reader().any())) {}
std.debug.print("{any}\n", .{message.items});
}
zig build-exe -target x86_64-linux overflow_vuln.zig
$ printf '\x01\x00\x00\x00\x00\x00\x00\x00 \xff\xff\xff\xff\xff\xff\xff\xff' | ./overflow_vuln
printf '\x01\x00\x00\x00\x00\x00\x00\x00 \xff\xff\xff\xff\xff\xff\xff\xff' | ./overflow_vuln
thread 139143 panic: integer overflow
/home/user/Downloads/zig-linux-x86_64-0.12.0-dev.2059+42389cb9c/lib/std/array_list.zig:523:44: 0x102648e in addManyAsSlice (overflow_vuln)
try self.resize(self.items.len + n);
^
/home/user/src/random/overflow_vuln.zig:7:56: 0x102611f in readMessageChunk (overflow_vuln)
try reader.readNoEof(try wip_message.addManyAsSlice(len));
^
/home/user/src/random/overflow_vuln.zig:20:31: 0x102675b in main (overflow_vuln)
while(try readMessageChunk(&message, std.io.getStdIn().reader().any())) {}
^
/home/user/Downloads/zig-linux-x86_64-0.12.0-dev.2059+42389cb9c/lib/std/start.zig:585:37: 0x1025e95 in posixCallMainAndExit (overflow_vuln)
const result = root.main() catch |err| {
^
/home/user/Downloads/zig-linux-x86_64-0.12.0-dev.2059+42389cb9c/lib/std/start.zig:253:5: 0x1025981 in _start (overflow_vuln)
asm volatile (switch (native_arch) {
^
???:?:?: 0x0 in ??? (???)
Aborted
zig build-exe -target x86_64-linux overflow_vuln.zig -O ReleaseFast
$ { printf '\x01\x00\x00\x00\x00\x00\x00\x00 \xff\xff\xff\xff\xff\xff\xff\xff'; while :; do printf '\xAA' ;done } | valgrind ./overflow_vuln
==139188== Memcheck, a memory error detector
==139188== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==139188== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==139188== Command: ./overflow_vuln
==139188==
==139188== Syscall param read(buf) points to unaddressable byte(s)
==139188== at 0x100C0D6: getErrno (linux.zig:224)
==139188== by 0x100C0D6: read (os.zig:754)
==139188== by 0x100C0D6: read (File.zig:1020)
==139188== by 0x100C0D6: typeErasedReadFn (io.zig:331)
==139188== by 0x100C0D6: read (Reader.zig:10)
==139188== by 0x100C0D6: readAtLeast (Reader.zig:29)
==139188== by 0x100C0D6: readAll (Reader.zig:17)
==139188== by 0x100C0D6: readNoEof (Reader.zig:38)
==139188== by 0x100C0D6: readMessageChunk (overflow_vuln.zig:7)
==139188== by 0x100C0D6: main (overflow_vuln.zig:20)
==139188== by 0x100C0D6: callMain (start.zig:585)
==139188== by 0x100C0D6: initEventLoopAndCallMain (start.zig:519)
==139188== by 0x100C0D6: callMainWithArgs (start.zig:469)
==139188== by 0x100C0D6: start.posixCallMainAndExit (start.zig:425)
==139188== by 0x100BAC1: (below main) (start.zig:253)
==139188== Address 0x4803000 is not stack'd, malloc'd or (recently) free'd
==139188==
==139188== Invalid read of size 8
==139188== at 0x100D2A3: find (treap.zig:158)
==139188== by 0x100D2A3: getEntryFor (treap.zig:80)
==139188== by 0x100D2A3: searchBucket (general_purpose_allocator.zig:552)
==139188== by 0x100D2A3: heap.general_purpose_allocator.GeneralPurposeAllocator(.{.stack_trace_frames = 0, .enable_memory_limit = false, .safety = false, .thread_safe = true, .MutexType = null, .never_unmap = false, .retain_metadata = false, .verbose_log = false}).free (general_purpose_allocator.zig:837)
==139188== by 0x100C43A: rawFree (Allocator.zig:98)
==139188== by 0x100C43A: log2a (Allocator.zig:341)
==139188== by 0x100C43A: free__anon_4426 (Allocator.zig:314)
==139188== by 0x100C43A: deinit (array_list.zig:75)
==139188== by 0x100C43A: main (overflow_vuln.zig:18)
==139188== by 0x100C43A: callMain (start.zig:585)
==139188== by 0x100C43A: initEventLoopAndCallMain (start.zig:519)
==139188== by 0x100C43A: callMainWithArgs (start.zig:469)
==139188== by 0x100C43A: start.posixCallMainAndExit (start.zig:425)
==139188== by 0x100BAC1: (below main) (start.zig:253)
==139188== Address 0xaaaaaaaaaaaaaaaa is not stack'd, malloc'd or (recently) free'd
==139188==
==139188==
==139188== Process terminating with default action of signal 11 (SIGSEGV)
==139188== General Protection Fault
==139188== at 0x100D2A3: find (treap.zig:158)
==139188== by 0x100D2A3: getEntryFor (treap.zig:80)
==139188== by 0x100D2A3: searchBucket (general_purpose_allocator.zig:552)
==139188== by 0x100D2A3: heap.general_purpose_allocator.GeneralPurposeAllocator(.{.stack_trace_frames = 0, .enable_memory_limit = false, .safety = false, .thread_safe = true, .MutexType = null, .never_unmap = false, .retain_metadata = false, .verbose_log = false}).free (general_purpose_allocator.zig:837)
==139188== by 0x100C43A: rawFree (Allocator.zig:98)
==139188== by 0x100C43A: log2a (Allocator.zig:341)
==139188== by 0x100C43A: free__anon_4426 (Allocator.zig:314)
==139188== by 0x100C43A: deinit (array_list.zig:75)
==139188== by 0x100C43A: main (overflow_vuln.zig:18)
==139188== by 0x100C43A: callMain (start.zig:585)
==139188== by 0x100C43A: initEventLoopAndCallMain (start.zig:519)
==139188== by 0x100C43A: callMainWithArgs (start.zig:469)
==139188== by 0x100C43A: start.posixCallMainAndExit (start.zig:425)
==139188== by 0x100BAC1: (below main) (start.zig:253)
==139188==
==139188== HEAP SUMMARY:
==139188== in use at exit: 0 bytes in 0 blocks
==139188== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==139188==
==139188== All heap blocks were freed -- no leaks are possible
==139188==
==139188== For lists of detected and suppressed errors, rerun with: -s
==139188== ERROR SUMMARY: 3 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault
Edit: Accidentally posted old logs.
I didn't expect you to be so reasonable, yet here we are. I rest my case.
That example has nothing to do with addOne
(as that's the example this issue puts forwards and one of the changes it makes), the function you demonstrated there (addManyAsSlice
) is indeed one which may need to have a condition as it allows for a user-specified size which can overflow before the precondition of OOM like addOne
. Nice try though.
Btw I'd still say this is fine, it clearly is a bug on the caller side of things as there is no justifiable reason to allocate 2^64 bytes of memory. The assertion catches this bug and alerts the programmer to it, but it is not a case that'd ever be hit in practice with actual functional code (rather just indeed, the result of a math bug, not sanitizing user data, etc). Zig does not guard against such bugs as those are things the developer is responsible to prevent, and the assertions Zig adds to things like integer overflow is one of the tools you can use to locate such issues.
This is indeed one of those "should be OOM in theory but in practice should never be hit" things, hence why it is an assertion which is detectable in runtime safe builds rather than a a proper error condition that performance-impacting instructions would have to be added to handle even in things like ReleaseFast. Tldr: If your code hits this path your code is wrong and should be corrected, the library should not be changed to accommodate such things.
That example has nothing to do with addOne, the function you demonstrated there (addManyAsSlice) is indeed one which may need to have a condition as it allows for a user-specified size which can overflow before the precondition of OOM like addOne. Nice try though.
You never asked about addOne
. You asked about ensureUnusedCapacity
, whose functionality is performed by addManyAsSlice
.
I really thought you grew a spine for a second there. Should've known it was too good to be true, though.
I do have to say, telling me
You did not read my comment.
then admitting that you actually did not read my comment and came to a conclusion before ever reading it on Matrix
has got to be the funniest thing I've read in a while.
I guess I'm blocking you on GitHub as well. It's a blatant waste of everyone's time to pretend to be interested in an argument.
Don't we already have a solution for these kinds of vulnerabilities?
In ReleaseSafe
all integer overflows, including the ones in the user program, lead to a panic.
Should the performance of all ReleaseFast
applications be compromised to essentially get ReleaseSafe
behavior for standard library data structures?
The problem is, this API currently requires the caller to check overflow, or it will be a vulnerability in ReleaseFast
and ReleaseSmall
mode. If the caller doesn't check it, the program can never be used safely in ReleaseFast
/ReleaseSmall
mode, which makes ReleaseFast
/ReleaseSmall
useless.
IMO it's a better API design to check overflow internally, other than relying on the caller to check for overflow with every call.
@IntegratedQuantum, safe std.ArrayList
APIs should never ever be able to trigger illegal behavior, much less a security vulnerability.
Should the performance of all
ReleaseFast
applications be compromised to essentially getReleaseSafe
behavior for standard library data structures?
ReleaseSafe
protects against programmer error, and as per the API contract, there is no programmer error in either of the two examples I have shown:
/// Extends the list by 1 element. Allocates more memory as necessary.
/// Invalidates pointers if additional memory is needed.
for append
, and
/// Resize the array, adding `n` new elements, which have `undefined` values.
/// The return value is a slice pointing to the newly allocated elements.
/// The returned pointer becomes invalid when the list is resized.
/// Resizes list if `self.capacity` is not large enough.
for addManyAsSlice
.
You say this could harm performance in performance-sensitive applications, and that may well be the case, but you are ignoring that if the programmer wants to explicitly ignore the possibility of integer overflow, they can still do that and use ensureTotalCapacity
, perform the unsafe addition themselves (items.len + n
), and use the *AssumeCapacity
functions, whose illegal behavior I have clearly documented in the PR, so that it is obvious that an overflow can't occur. Alternatively, *Unsafe
variants could be added, whether it's worth for such a niche use case that the programmer can fulfill themselves is unclear to me.
I am in agreement with @GalaxySnail, especially
If the caller doesn't check it, the program can never be used safely in ReleaseFast/ReleaseSmall mode, which makes ReleaseFast/ReleaseSmall useless.
I think the solution here is to not use ReleaseFast if the developer cares about this not causing issues. the pros and cons is very well documented
It's unfortunate that this thread has little pokes and jabs at each other because it's otherwise quite an interesting and valuable technical debate. Let's try to transition to a more constructive, trusting tone where we respect each other even while disagreeing.
I do think in general functions that allocate should do any addition or multiplication with overflow and return error.OutOfMemory if the overflow bit is set. I think there is precedence for this in Allocator.alloc().
the pros and cons is very well documented
Well, no, prior to the PR, this surprising behavior (supposedly safe std.ArrayList
functions causing illegal behavior) wasn't documented at all.
https://ziglang.org/documentation/master/#Build-Mode https://ziglang.org/documentation/master/#ReleaseFast
Safety checks disabled
https://ziglang.org/documentation/master/#Undefined-Behavior
Most undefined behavior that cannot be detected at compile-time can be detected at runtime. In these cases, Zig has safety checks.
The ReleaseFast and ReleaseSmall build modes disable all safety checks in order to facilitate optimizations.
https://ziglang.org/documentation/master/#Integer-Overflow
https://ziglang.org/documentation/master/#Primitive-Types
usize
: unsigned pointer sized integer.
You're right, it is documented that integer overflow with +-*
is illegal behavior, but the functions themselves never declare that they can cause illegal behavior. It is a reasonable assumption that it is checked, since the opposite is nowhere stated, and there is precedent in C's calloc
(musl, glibc), where similar assumptions of "can't happen" have led to security vulnerabilities as well, and std.mem.Allocator
.
expecting every function to document exactly where its possible to run into safety checked behavior is impractical imo
expecting every function to document exactly where its possible to run into safety checked behavior is impractical imo
I disagree. I think it's a basic requirement of every package with a public API, including the standard library, to document explicitly in the doc comments of every function, every possible way that illegal behavior can occur via action or inaction of the API user.
Is there really a realistic example where this will be triggered?
The given example of how to trigger it shows an incorrect use of the interface (length should never be directly user controlled) as a motivation towards including this however I don't see any case where this would pop up in actual code outside the constructed example.
It's not realistic on a 64-bit system, but usize can sometimes be 32, 16, or even 8 bit (I think on arduino)
Maybe this check can be removed: https://github.com/ziglang/zig/blob/69461bcae4a78c835bdfe0aae85524342c0f8461/lib/std/array_list.zig#L1069
self.items.len
corresponds to allocated memory, an overflow would imply that at that point the arraylist holds the whole addressable memory.
I can't think of a possible runtime environment where the heap can occupy the complete address space, so we can always assume self.items.len < std.math.maxInt(usize)
.
@erikarvstedt great observation. I agree with your conclusion.
@andrewrk, shall I open a PR?
お願いします
Zig Version
0.12.0-dev.1753+a98d4a66e
Steps to Reproduce and Observed Behavior
Consider the following piece of correct code:
When
array_list.items.len
andarray_list.capacity
arestd.math.maxInt(usize)
, this code leads to an integer overflow instd.ArrayListAlignedUnmanaged.addOne
:Integer overflow is UB and, in ReleaseFast modes, wraps on x86 (
newlen = 0
). Integer overflows like this (CWE-190) have been exploited in the wild.Expected Behavior
Correct usage of a standard library API, like in this case, should never yield incorrect results. Integer overflow is a security risk and must be checked, for example with the functions in
std.math
or the*withOverflow
builtins. Note that this is difficult to trigger with completely correct code on 32-bit Linux, as mmap will refuse to allocate more than1 << 31 - 1
bytes. Will probably make a PR to do this at some point.