cognitive-engineering-lab / rust-book

The Rust Programming Language: Experimental Edition
https://rust-book.cs.brown.edu
Other
509 stars 83 forks source link

Question 2 in 6.4 has incorrect / confusing answers #33

Closed CryZe closed 22 hours ago

CryZe commented 1 year ago

URL to the section(s) of the book with this problem: https://rust-book.cs.brown.edu/ch06-04-inventory.html#the-quiz

Description of the problem:

Every time you invoke the make_separator function with an empty string you invoke undefined behavior, even if you don't actually print / use the return value.

Playground

fn make_separator(user_str: &str) -> &str {
    if user_str == "" {
        let default = "=".repeat(10);
        unsafe { &*std::ptr::addr_of!(default) }
    } else {
        user_str
    }
}

fn main() {
    make_separator("");
}

Miri Output:

error: Undefined Behavior: constructing invalid value: encountered a dangling reference (use-after-free)
  --> src/main.rs:11:5
   |
11 |     make_separator("");
   |     ^^^^^^^^^^^^^^^^^^ constructing invalid value: encountered a dangling reference (use-after-free)
   |
   = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
   = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
   = note: BACKTRACE:
   = note: inside `main` at src/main.rs:11:5: 11:23

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

Suggested fix:

While the question is good for figuring out if the learner properly learned about dangling references to locals, it's distracting to differentiate between instant and late Undefined Behavior. It's probably best for these confusing answers to be removed.

Side notes:

Both this and question 5 have this passage:

Which (if any) of the following programs would (1) pass the compiler

which is also really confusing considering that the whole question is under the assumption that the:

compiler did NOT reject this function

Additionally I find it weird that we have to imagine an imaginary compiler where the code is not rejected, as it's really unclear how similar this imaginary compiler even is to the real compiler. Technically such an imaginary compiler could make this whole function not be Undefined Behavior / cause any memory violations. It honestly would be better to remove this whole confusing phrasing and just use the actual unsafe code that makes the code compile. Then no imaginary compiler is necessary, just the real compiler with all the behavior being in tact.

willcrichton commented 1 year ago

I think that's technically a false positive from Miri. If the output of make_separator("") is not used, then in theory the compiler could ignore the return value, and so a dangling pointer would never actually exist. I suspect it's that the MIR is assigning an unused temporary to the return value, and so Miri is observing that the unused temporary holds a dangling pointer. But that's not a useful distinction from a teaching perspective.

Additionally I find it weird that we have to imagine an imaginary compiler where the code is not rejected, as it's really unclear how similar this imaginary compiler even is to the real compiler.

The borrow checker does not affect the program's semantics (i.e. its output is not used to change the program in any way), so there is a straightforward interpretation of all borrow-check-failing programs. We could use unsafe code, but that concept isn't covered in detail in the early parts of the Rust book.

In general, my goal is to establish a clear mental model for how Rust would execute, borrow checker or not, in Chapter 4. Then people can answer these kinds of hypothetical what-if-it-compiled questions.

konnorandrews commented 10 months ago

I would argue that Miri is correct and this is real UB. My argument is that it is actually observable.

Here is an example of how an invalid borrow causes UB (difference in debug vs release) without it ever actually being dereferenced logically. https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=0b14c68c16e1da34a0fa7432f3f92f9d

Both the reference and the nomicon list invalid borrows existing as being UB by themselves.

Question 2 assumes this restriction on borrows doesn't exist, but it doesn't say that is an assumption. The question would be fine if somewhere it said something along the lines of "Assume borrows are just pointers". However, this assumption doesn't apply to Rust itself which the question seems to imply.

willcrichton commented 10 months ago

The point of this exercise is to get readers to understand the problematic relationship between invalidating something and using it. (I'm not sure the example really defeats the point, since it still uses the dangling pointer by passing it to a function, it just doesn't dereference the dangling pointer.)

The subtlety is it's not just that an invalid borrow exists. It's specifically that's returned from a function. For example, this is not UB:

let x: Box<i32> = Box::new(1);
let y: &i32 = &*x;
drop(x);

Even though a borrow is invalidated, it's not UB because y is never "produced" in the technical meaning used in the reference. Explaining the difference between this situation and the function return situation is far outside the scope of the book, so it's easier to give readers the simpler mental model "using invalidated memory is UB." Our goal is just to get people to understand the borrow checker, not to have them write complex unsafe code.

Nonetheless, it would be ideal to have an example where the book and the spec are in full agreement, so I'll give it some thought.