rust-lang / nomicon

The Dark Arts of Advanced and Unsafe Rust Programming
https://doc.rust-lang.org/nomicon/
Apache License 2.0

Can undefined behavior that is theoretically reachable, but not reached in practice, cause problems? #454

Open FeldrinH opened 3 weeks ago

FeldrinH commented 3 weeks ago

This is a question that is, in my opinion, important when dealing with the risk of undefined behavior, but is currently not clearly addressed in the nomicon (or any other materials I could find online):

If some part of a program contains undefined behavior that is reachable, but is then executed with inputs where that part of the program won't be reached, is the behavior of that specific program execution well defined or not? In other words, is the impact of undefined behavior limited to specific program executions where undefined behavior is invoked or can it affect all possible executions of the program?

For a more concrete example, say I have this program:

use std::{env, hint::unreachable_unchecked};

fn main() {
    let args: Vec<String> = env::args().collect();
    let value = args[1].parse::<i32>().unwrap();
    if value == 0 {
        // Something that causes undefined behavior here. unreachable_unchecked() is used as an example,
        // but it could be anything, e.g. dereferencing a dangling pointer or creating two mutable references to the same value.
        unsafe { unreachable_unchecked() };
    }
    println!("Value: {}", value);
}

If I run this program with an argument of 1, is there any risk of undefined behavior in that specific run of the program?

FeldrinH commented 3 weeks ago

PS: The question is motivated in part by this post about the scope of undefined behavior in C.

SOF3 commented 3 weeks ago

afaik unreachable_unchecked() is by definition only UB when it is reached, so this question sounds somewhat different from the following?

if value == 0 {
    unsafe { print(*ptr::dangling::<u8>()); }
}

FeldrinH commented 3 weeks ago

> afaik unreachable_unchecked() is by definition only UB when it is reached, so this question sounds somewhat different from the following?
>
>     if value == 0 {
>         unsafe { print(*ptr::dangling::<u8>()); }
>     }

Perhaps. I used unreachable_unchecked() in the example simply because it was the clearest and most explicit source of undefined behavior I could think of. To be clear, the question is intended to be about the general case, where unreachable_unchecked() might be replaced with an arbitrary operation that causes undefined behavior. I've edited the example to clarify.

scottmcm commented 2 weeks ago

"Theoretically" reachable wouldn't be a workable rule. Just change the example to if unsolved_math_problem(a, b) { hint::unreachable_unchecked() } and it's clear: it's not reasonable to have the correctness of a single execution depend on unknown -- or even unknowable! -- things.

The soundness of something can be unknown if we don't know enough to prove it one way or another, but whether a particular execution is UB only depends on things we can see in that execution.

An execution is UB as soon as it's certain that it will hit one UB action. And yes, as in that SO question it means that it can "time travel" to an extent. In fact, that's necessary for hint::unreachable_unchecked to ever be useful: the whole reason you might use it is to have the compiler optimize away any branches that lead to its block of code.
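As a rough illustration of that last point, here is a minimal sketch of the usual way hint::unreachable_unchecked is used as an optimization hint. The function name div_nonzero and its safety contract are hypothetical, made up for this example. Whether the optimizer actually removes the zero check depends on the compiler version and optimization level, so treat the comments about codegen as an expectation, not a guarantee.

```rust
use std::hint::unreachable_unchecked;

/// Divides `n` by `d` (hypothetical helper for illustration).
///
/// # Safety
/// The caller must guarantee `d != 0`. Calling this with `d == 0` is UB.
pub unsafe fn div_nonzero(n: u32, d: u32) -> u32 {
    if d == 0 {
        // SAFETY: the caller promised `d != 0`, so this branch is never executed.
        unsafe { unreachable_unchecked() }
    }
    // Because every execution that reaches this point has `d != 0`, the
    // compiler is allowed to assume so and may omit the division-by-zero
    // panic path that `n / d` would otherwise need.
    n / d
}

fn main() {
    // SAFETY: 4 is non-zero, so the contract of `div_nonzero` is upheld.
    let q = unsafe { div_nonzero(12, 4) };
    println!("{q}");
}
```

The calls in this sketch only ever pass a non-zero divisor, so these particular executions never reach the UB branch. A call with d == 0 would be UB, which is why the function is marked unsafe and documents its contract.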

But it can only time-travel as far as it can be proven that the execution would have hit UB anyway. So your program in the OP is not sound -- as it's possible to hit UB given a certain input -- but there are lots of inputs for which it doesn't trigger UB, as you could confirm with Miri.

TL;DR: "Soundness" is about all possible inputs. "Triggers UB" is about a particular execution.
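
To make the distinction concrete, the sketch below reworks the program from the original post into two functions. The names show_unsound and show_sound are hypothetical and chosen only for this illustration. The first is unsound because safe code can pass 0 and reach UB; the second handles 0 explicitly, so no input can reach UB.

```rust
use std::hint::unreachable_unchecked;

/// Unsound (hypothetical example): a safe function that a safe caller can
/// drive into UB by passing 0.
fn show_unsound(value: i32) -> String {
    if value == 0 {
        // UB if this line is ever executed.
        unsafe { unreachable_unchecked() };
    }
    format!("Value: {value}")
}

/// Sound (hypothetical example): 0 is rejected with an error, so no input
/// reaches UB.
fn show_sound(value: i32) -> Result<String, String> {
    if value == 0 {
        return Err("value must be non-zero".to_string());
    }
    Ok(format!("Value: {value}"))
}

fn main() {
    // This execution never enters the `value == 0` branch of `show_unsound`,
    // so it does not trigger UB even though the function itself is unsound.
    println!("{}", show_unsound(1));
    // The sound version can be called with any input, including 0.
    println!("{:?}", show_sound(0));
}
```

Running this under Miri (for example with cargo +nightly miri run, assuming the miri component is installed) should report no UB for this main, while changing the first call to show_unsound(0) should be flagged.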