rust-lang / miri

An interpreter for Rust's mid-level intermediate representation
Apache License 2.0
4.34k stars 332 forks source link

Always use `-Z symbolic-alignment-checks` for *unleaked* allocations #1988

Open JakobDegen opened 2 years ago

JakobDegen commented 2 years ago

-Z symbolic-alignment-checks currently suffers from false positives; it is implemented by checking that both the allocation alignment and the offset within the allocation are sufficiently aligned, and then reporting UB if they are not. This fails to consider the possibility of code having explicitly checked the alignment via pointer arithmetic (such a failure is of course to be expected from symbolic checks).

However, as far as I can tell, this is not a problem for allocations for which the address has not been leaked. On such allocations, at the time of the potentially UB access, we can always choose the address to be one such that the access is unaligned if the symbolic check would be unsuccessful. Note that this does not require us to fix that choice though: We can "change our mind" later by just claiming that the earlier access was UB. In principle this might lead to worries about reporting UB too late (ie later making a decision that would have made the earlier access UB), but I think with more careful reasoning this is actually a non-issue.

@saethlin brought up on Discord that how much this actually helps is limited by the frequency of ptr to int casts in most code. This is a valid concern, but I'm hopeful that this can be addressed over time. After, https://github.com/rust-lang/rust/pull/92686 , most of the ptr to int casts in core::ptr seem to be in code that I assume is not frequently hit. Whether other code (in std or common libraries) does this remains to be seen, but in any case I don't think that should be a blocker - this seems like a strict improvement in at least those cases where it is applicable.

RalfJung commented 2 years ago

That's an interesting idea!

My main concern is that there is no clear single point in Miri where we know that an address has been "leaked" -- and in particular some of my recent refactorings made this even harder. That was a deliberate choice to simplify the interpreter. So there is a non-negligible maintenance overhead to ensuring that we properly detect all ways of "leaking" an address, and to ensure that as Miri evolves no gaps are introduced in this check. Combined with the fact that it is not clear how many actual bugs this approach would find (compared to simpler alternatives discussed in https://github.com/rust-lang/miri/pull/1990), I am not sure if that effort is worth the gains.

RalfJung commented 2 years ago

After, https://github.com/rust-lang/rust/pull/92686 , most of the ptr to int casts in core::ptr seem to be in code that I assume is not frequently hit.

I am confused by this comment, that PR does not seem to make much of a difference in terms of ptr to int casts?

saethlin commented 2 years ago

Don't we know that a pointer has been leaked if we have computed an address for it in ptrtoint.rs? I think the proposal is to stuff additional state in Pointer and turn that const that const eval branches on into a method.

RalfJung commented 2 years ago

No, we eagerly compute addresses for all allocations. We have to, because the representation of a Pointer (in Miri, not CTFE) these days is its absolute address, paired up with the AllocId it should point into. So we need a base address to represent pointers to an allocation, even if those pointers are never cast to integers. This refactoring has been a long time coming and I am quite happy we did it, see https://github.com/rust-lang/miri/issues/841 and https://github.com/rust-lang/rust/pull/87123 for details.