Closed RalfJung closed 1 year ago
We cannot (and do not want to) track integers, so we have to somehow mark memory as "integer-available" when casting to int, and then use that when casting back from an int.
When getting a raw pointer as argument it would be unknown if the raw pointer is "integer-available".
When getting a raw pointer as argument it would be unknown if the raw pointer is "integer-available".
Sure. But that's fine. This is something optimizations have to live with, but that is likewise true in C/C++.
LLVM IR has some bugs in this area, but obviously there's no way to design a semantics for a buggy optimizer. So the only real guidance we have is to allow as little as possible but as much as needed.
In #263 (now closed because it duplicates this), I mentioned that the current model is suprising to me, and at issue with the working model I have for lccc (which is incredibly desireable because of the optimizations it enables, a small example of which is in that issue). lccc tracks raw pointers exactly the same as it tracks references, because it treats all pointers (of which references and raw pointers are both subsets) the same.
As for int->pointer casts, that is fun, but the rule that C++ (and thus lccc) has is that if you (reinterpret_
)cast a pointer to an integer, and (reinterpret_
)cast that integer back to a pointer type, you get the same pointer, provenence and all (if multiple pointers were cast to the same integer, you get whichever one gives the code defined behaviour, if possible).
if multiple pointers were cast to the same integer, you get whichever one gives the code defined behaviour, if possible
@chorman0773 what happens if multiple pointers give the code defined behavior, is that not possible?
what happens if multiple pointers give the code defined behavior, is that not possible?
In C++, at least, that's not possible. The only way this would work is if you had multiple objects of the same type at that address, which is not possible. In rust, I also don't think it's possible, necessarily. However, the actual chosen one would be unspecified (but note that in all cases, this is somewhat oversimplified).
There isn't really any C++ rule on how provenance works. The C and C++ committees are progressing towards adopting one, but this is a recent development, and the current specs say very little about it, leaving it unclear whether integers have provenance and even whether provenance exists at all. Meanwhile, among major C++ compilers, at least GCC and Clang have open bugs regarding how exactly pointer<->integer conversions should be optimized. That's part of what makes coming up with rules for Rust so difficult, especially since Rust is somewhat bound to whatever memory model LLVM ends up with.
Would something along these lines make any sense? (Assuming no issues with e.g. LLVM)
size_of::<T>
bytes, does not point within the intersection of the memory region of provenance of the pointer and the union of: the memory regions of provenance of all pointers (except those with no region of provenance) and references accessible from that point in the code, the memory used to store each static
(and leaked boxes?) static mut
s or UnsafeCell
s?)size_of::<T>
bytes at/after the addressA few corollaries:
size_of::<T>
bytes at/after the addressNote: I'm not very familiar with Stacked Borrows, so I'm not sure of the details of how that would work with this.
ptr->int->ptr expanding provenance is one of the things I do not want. The code I showed in the linked issue was an exposition as to why that would be undesirable. I'd be fine with it destroying provenance, or allowing the previous provenance of the pointer to pop up elsewhere (provided the borrow item still exists), but allowing it to extend without bound negates every other rule in whatever list you may use.
FWIW, here is how I would analyze the does_something
function you mentioned in #263, as called by either do_interesting_things
, in a way that makes it UB:
pub unsafe extern "C" fn does_something(
ptr: *mut usize // memory region of provenance: array[0]
){
let ptr = std::transmute( // memory region of provenance: none
std::transmute<usize>( // memory region of provenance: N/A
ptr // memory region of provenance: array[0]
)
) as *mut Interesting;
// <see below>
(*ptr).exclusive_access = ptr as *mut ();
}
This write implies a pointer-to-reference conversion, which has the following preconditions:
ptr
argument, which has a provenance region of array[0]
.ptr
is not included since it has no provenance region. (I have clarified my previous reply to state this, sorry.)Interesting
struct at ptr
is not accessible, thereby triggering UB.ptr->int->ptr expanding provenance is one of the things I do not want
I think this is impossible to avoid. Specifically, the only way to avoid it is to give integers provenance, and that has all sorts of even-less-desirable consequences, or to give pointers no provenance, and that kills most of the alias analyses that LLVM, GCC and likely also lccc do.
I am not saying I like the fact that ptr->int->ptr is not a NOP. But this is an unfortunate consequence of a bunch of other properties that it would be even more problematic to lose.
Basically, provenance and integer-ptr-casts are mutually incompatible. Most languages only have one: Java, ML and other high-level languages have provenance but no integer-ptr-casts; assembly has integer-ptr-casts (well it doesn't even need a cast, there simply is no distinction) but it does not have provenance. If you want to have them both in the same language, there will be suffering and non-intuitive behavior. I believe this to be unavoidable, and I will soon (finally) publish the blog post laying down this argument in more detail.
I think this is impossible to avoid
Perhaps not. If converting back to a pointer is defined to assert provenance, I think there is a way to get that information. It can be pulled from the borrow stack. If there is an item for the location that has been converted to an integer, that item is used (do not pop items from the borrow stack in this search). Otherwise, the resulting pointer is invalid/dangling and cannot be dereferenced. Alternatively, it could be called a "deferred" pointer, which has similar semantics to untracked raw pointers, which, when dereferenced/reborrowed, it looks for a valid borrow item that is usable, and has been converted to an integer. In the former case, it wouldn't care about the actual size, in the latter it would. This may not be a perfect specification, but it may be enough to save provenance of pointers converted to integers.
Perhaps not. If converting back to a pointer is defined to assert provenance, I think there is a way to get that information. It can be pulled from the borrow stack.
There may be multiple items on the stack that could be used to get the provenance. Also you may do something like ((ptr as *const u8) as usize + 4) as *const u8
. The resulting pointer should have the same provenance as ptr
, but the way you propose, it may accidentally get the provenance of a pointer to a value 4 bytes after the value at ptr
.
The resulting pointer should have the same provenance as ptr, but the way you propose, it may accidentally get the provenance of a pointer to a value 4 bytes after the value at ptr.
Ah, yeah, that's true. Arithmetic into int->ptr casts are fun. Would this still be an issue with the deffered version, then?
that kills most of the alias analyses that LLVM, GCC and likely also lccc do
The problem is that by having the ptr->int->ptr round trip extend provenance, it also similarily kills such analysis, as I mentioned. After all, black-box code could, as I demonstrated in the linked issue, perform this round trip, and effectively "launder" the pointer, but without even the implications of C++'s std::launder
, which prevents this exact issue. This would mean that, at the very least, you can invent any pointer you want within the same allocation, merely by casting a pointer to an integer, and then back.
My proposed semantics would make it UB for a function (i.e. the black-box code) accepting just a pointer to use a pointer or create a reference with a wider provenance region, due to the "memory accessible from that point in the code" rule. (I have changed the rule for static
s to exclude static mut
s and UnsafeCell
s, which would otherwise be a loophole.)
Actually, that might work well then.
The problem is that by having the ptr->int->ptr round trip extend provenance, it also similarily kills such analysis, as I mentioned.
No, it doesn't kill it in any comparable way. In particular, Stacked Borrows still works. While we did those proofs in a semantics without int-ptr-casts, but we did deliberately make it so that reference -> raw-ptr -> reference "extends provenance", precisely to show this point. Raw pointers have no provenance, serving as a "model for integers".
So, we have a proof that very strong alias analysis is possible even when there are provenance-extending roundtrips in the language. We have not done the proof if the presence of int-ptr-casts, but I see no reason why they would make this any more difficult.
It does, however, mean that ptr-to-int casts cannot be dead-code eliminated without potentially introducing UB. If we take @chorman0773's original example and change it to a ptr-to-int cast rather than just reference-to-raw-ptr (since we assume raw pointers will have provenance in the future), we're talking about a stray
&mut ptr.exclusive_access as *mut _ as usize;
where the result of the ptr-to-int conversion just gets thrown out. This function containing it goes on to call another function, passing it a reference to a different field adjacent to exclusive_address
. The second function can then convert the reference to usize
and adjust the offset based on the struct layout, to end up with a value that must be equal to the usize
produced by that dead ptr-to-int cast. Then they can cast back to a pointer, and since the ptr-to-int cast made exclusive_access
"integer-available", they are now allowed to access exclusive_access
.
This is annoying. But I think C++ has the exact same problem, under the Cerberus PNVI model that was identified as by WG21 the most promising path forward (see my last comment). From a C++ perspective, the exclusive_access
test case does require making assumptions about the implementation-defined behavior of ptr-to-int casts (since the spec doesn't guarantee that e.g. adjacent addresses will convert to adjacent integers), but implementations that do make those guarantees (i.e. essentially all of them) have to uphold it in this case.
Note that if the second function were just guessing the address, the Cerberus paper's model of nondeterminism would justify making it UB even if the guess happened to be right. This is interesting, and probably worth discussing in a Rust context. However, that doesn't apply to @chorman0773's example because it has a way to compute the address.
From a C++ perspective, the exclusive_access test case does require making assumptions about the implementation-defined behavior of ptr-to-int casts
It would not, actually. The equivalent C++ code would have undefined behaviour (strict aliasing) by the round trip rule, then the final reinterpret_cast would yield the same pointer, to array[0], because pointer-interconvertibility is stopped by arrays (which was pointed out in my original, that an array is used to prevent reachability from leaking out in lccc), and usize is not similar to Interesting.
Without necessarily advocating for this. I don't entirely follow the argument for why giving integers provenance is a fatal problem.
A drawback of this model—fatal in practice—is that it blocks many integer optimizations, such as propagation of equalities as done by, e.g., global value numbering (GVN) or range analysis. For example, transforming “(a == b) ? a : b” into “b” is incorrect in this model
This is only true if all optimizations operate on the same model. Couldn't you have two optimizations passes: on the first pass, everything has provenance, and you are indeed unable to optimize integer (or pointer) expressions such as the example.
Then you erase all provenance and do a second set of lower-level optimizations. This would allow (a == b) ? a : b
to be optimized even if a
and b
are pointers, which is pretty desirable anyway...
We want to push many optimizations to MIR, but the provenance must still be preserved when lowering to LLVM-ir, so those optimizations would only be possible by LLVM in that case. Also a reloading pass after regalloc could benefit from provenance too I think. After regalloc it is way too late to perform GVN and similar optimizations.
From a C++ perspective, the exclusive_access test case does require making assumptions about the implementation-defined behavior of ptr-to-int casts
It would not, actually. The equivalent C++ code would have undefined behaviour (strict aliasing) by the round trip rule, then the final reinterpret_cast would yield the same pointer, to array[0], because pointer-interconvertibility is stopped by arrays (which was pointed out in my original, that an array is used to prevent reachability from leaking out in lccc), and usize is not similar to Interesting.
You might be missing that I modified your example to convert &mut ptr.exclusive_access
to an integer (under the assumption that Rust raw pointers will carry provenance in the future, so what matters is integers). does_something
can then convert &mut ptr.array[0]
to an integer and add a fixed offset with integer arithmetic, which, assuming implementation-defined guarantees about the pointer<->integer mapping, is guaranteed to produce an integer equal to what &mut ptr.exclusive_access
converted to. Assuming integers do not carry provenance, the fact that they're equal means they're equivalent, so converting that integer to a pointer is a round-trip that will recover &mut ptr.exclusive_access
.
You might be missing that I modified your example to convert &mut ptr.exclusive_access to an integer
The original code was a double transmute on ptr, which is equivalent to effectively reinterpret_cast<Interesting*>(std::bit_cast<Interesting*>(std::bit_cast<std::uintptr_t>(ptr)))
, which is a pointer to array[0]
(or it may be something else entirely, probably an invalid pointer, because bit_cast is under-defined in what transformations it guarantees). The content of do_interesting_things
is moot in C++, since this conversion is the most favourable one, and would not give the rest of the function defined behaviour.
Sorry, I neglected to include my version of does_something
(though I did describe it in prose) since the topic was what optimizations are possible in do_interesting_things
. To avoid further confusion, here is my modified example in full, converted to C++. The point of the example is that does_something
is well-defined only because the line marked // unused
is there. If it were not there, does_something
could not legally access exclusive_access
under any circumstances; therefore, a compiler (assumed to be optimizing do_interesting_things
without knowing the implementation of does_something
) could optimize away the line marked // store
. Therefore, removing the // unused
line as dead code without further consideration (which typical compilers do) and removing the store (which typical compilers don't do) are mutually exclusive optimizations.
#include <stddef.h>
#include <stdint.h>
struct Interesting {
size_t array[1];
void *exclusive_access;
};
__attribute__((noinline, noipa))
void does_something(size_t *array_ptr) {
uintptr_t array_ptr_i = reinterpret_cast<uintptr_t>(array_ptr);
uintptr_t exclusive_access_ptr_i = array_ptr_i
- offsetof(Interesting, array) // for clarity, though this is guaranteed to be 0
+ offsetof(Interesting, exclusive_access);
void **exclusive_access_ptr = reinterpret_cast<void **>(exclusive_access_ptr_i);
*exclusive_access_ptr = nullptr;
}
void do_interesting_things(Interesting *ptr) {
void *x = ptr->exclusive_access;
reinterpret_cast<uintptr_t>(&ptr->exclusive_access); // unused
does_something(&ptr->array[0]);
ptr->exclusive_access = x; // store
}
We want to push many optimizations to MIR, but the provenance must still be preserved when lowering to LLVM-ir, so those optimizations would only be possible by LLVM in that case. Also a reloading pass after regalloc could benefit from provenance too I think. After regalloc it is way too late to perform GVN and similar optimizations.
I can see that having two distinct phases would be limiting. You could avoid that by having both pointer representations in the IR: when you apply certain optimizations (such as propagation of equalities) you also convert the pointers to the form without provenance.
I can see that having two distinct phases would be limiting. You could avoid that by having both pointer representations in the IR: when you apply certain optimizations (such as propagation of equalities) you also convert the pointers to the form without provenance.
This strikes me as similar to arithmetic optimizations with undefined overflow. E.g. you have something like
int foo(int a, int b) {
return a ? a * (b + 1) : 0;
}
and optimize it to
return a * (b + 1);
but now foo(0, INT_MAX)
is undefined, so instead you have to optimize it to something like
return a * ((wrapping_int)b + 1);
with wrapping_int
or wrapping_add
or something in your IR.
Anyway I suspect this trick doesn't work as well with pointers, because the presence of a single unbounded provenance pointer can interfere with optimizations on all other pointers, since they might alias. But I'd be curious to work something like this out and see how much optimization you can keep.
Anyway I suspect this trick doesn't work as well with pointers, because the presence of a single unbounded provenance pointer can interfere with optimizations on all other pointers, since they might alias. But I'd be curious to work something like this out and see how much optimization you can keep.
I don't think this is necessarily the case; at least in stacked borrows you can definitely have &mut
references that do not alias with anything else, including raw pointers with wildcard provenance. You need a cooperation between both parties (the pointer needs wildcard provenance, and the reference has to have a raw borrow on the stack, as e.g. caused by a &mut -> *mut
conversion) for aliasing to be possible.
It does, however, mean that ptr-to-int casts cannot be dead-code eliminated without potentially introducing UB.
Yes. That is a consequence of any semantics where ptr-to-int casts have a "broadcast" side-effect, such as this early and simple model of ptr-int-casts.
FWIW, all the possible formalizations I know of for restrict
have the same problem.
This is not great, I admit. It's just the best I know how to do.
Note that if the second function were just guessing the address, the Cerberus paper's model of nondeterminism would justify making it UB even if the guess happened to be right.
The details of this are tricky because you have to prove that guessing is even required -- since memory is finite, one can construct a situation where the non-determinism is constrained to a single possibility and then there is no UB. But I digress.
This is only true if all optimizations operate on the same model. Couldn't you have two optimizations passes: on the first pass, everything has provenance, and you are indeed unable to optimize integer (or pointer) expressions such as the example.
Then you erase all provenance and do a second set of lower-level optimizations. This would allow (a == b) ? a : b to be optimized even if a and b are pointers, which is pretty desirable anyway...
This works in principle, yes -- we can have two IRs with (almost) the same syntax and different semantics. This also causes problems, though. On top of what @bjorn3 said, we also need to be really careful to never apply IR1 optimizations to IR2 programs.
I can see that having two distinct phases would be limiting. You could avoid that by having both pointer representations in the IR: when you apply certain optimizations (such as propagation of equalities) you also convert the pointers to the form without provenance.
That doesn't help, since when we optimize some function foo
that takes pointers as input, we have no way to tell in which representation they are.
The mere possibility of a ptr/int with provenance already breaks optimizations. This is similar to how the mere existence of undef
breaks some optimizations. The reason for this is that optimizations are "forall"-type statements -- "for all inputs to the function, the optimized function behaves the same as the unoptimized". When you add more "things" to your semantics, suddenly the "forall" quantifies over more things, and so some optimizations break.
That doesn't help, since when we optimize some function foo that takes pointers as input, we have no way to tell in which representation they are.
I see, yeah that would be a problem.
You could still avoid two IRs by modelling them as different types within the same IR. Initially all pointers have type PtrWithProvencance
, and then one of the lowering passes converts all of these to RawPtr
.
On top of what @bjorn3 said, we also need to be really careful to never apply IR1 optimizations to IR2 programs.
If there are separate pointer types, then you can have optimizations defined generically. eg. if my optimization depends on "equality" of values, then it will automatically work correctly both before and after the lowering step, because PtrWithProvenance
and RawPtr
have different definitions of equality.
You could still avoid two IRs by modelling them as different types within the same IR. Initially all pointers have type PtrWithProvencance, and then one of the lowering passes converts all of these to RawPtr.
And then you need to make it UB to load a ptr-with-provenance at type PtrWithoutProvenance
. But yes, that can work.
The thing is that you may still want to have pointer provenance after regalloc, which would likely run after your proposed lowering step, to implement reloading.
A few years ago, I wrote a garbage collector for the (abandoned) RustyScheme project. When writing it, I heavily relied on the results of int → ptr casts having unbounded provenance.
Specifically, there were many cases where the code knows that an object exists at some address X, but there is absolutely know way that the compiler could know this. Furthermore, for performance reasons, objects were represented as thin pointers, and their actual size was stored in the objects themselves. So being able to create a pointer from nowhere was critical.
Similarly, in OS development, it is common to need to cast seemingly-arbitrary numbers to pointers and then dereference them. MMIO is just one example.
When writing it, I heavily relied on the results of int → ptr casts having unbounded provenance.
I am not sure what exactly you mean by "unbounded", but in the most general interpretation, this is fundamentally incompatible with any optimizations based on aliasing guarantees established by references (and, likewise, it is fundamentally incompatible with C's restrict
). When someone uses a mutable reference, there must not be any way to legally alias with that reference, or else we might just not have any aliasing requirements at all.
A concrete example (ideally in code) of what you mean by "unbounded provenance" would help.
It wouldn't just be that. All optimizations based on provenance die in the face of an operation that can increase it without restriction. This is why C++17s std::launder has a reachability requirement, so as to not fundamentally undermine all optimizations relying on this. After all, provenance is not necessary to reason about local (analyzed) code, but is necessary to reason about other (unanalyzed) code.
On Tue, Dec 22, 2020 at 06:56 Ralf Jung notifications@github.com wrote:
When writing it, I heavily relied on the results of int → ptr casts having unbounded provenance.
I am not sure what exactly you mean by "unbounded", but in the most general interpretation, this is fundamentally incompatible with any optimizations based on aliasing guarantees established by references (and, likewise, it is fundamentally incompatible with C's restrict). When someone uses a mutable reference, there must not be any way to legally alias with that reference, or else we might just not have any aliasing requirements at all.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rust-lang/unsafe-code-guidelines/issues/248#issuecomment-749504667, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGLD24TIUIVYN4ZVI2FV3TSWCCQBANCNFSM4QL4MXLQ .
Presumably you could cast all pointers returned by your GC's allocation function to an integer before returning them, broadcasting the fact that it exists at that address.
When writing it, I heavily relied on the results of int → ptr casts having unbounded provenance.
I am not sure what exactly you mean by "unbounded", but in the most general interpretation, this is fundamentally incompatible with any optimizations based on aliasing guarantees established by references (and, likewise, it is fundamentally incompatible with C's
restrict
). When someone uses a mutable reference, there must not be any way to legally alias with that reference, or else we might just not have any aliasing requirements at all.A concrete example (ideally in code) of what you mean by "unbounded provenance" would help.
The garbage collected heap is a sea of aliasing, mutable pointers. Pointers are often tagged, and any object can (to a good approximation) point to any other object. Furthermore, some GC algorithms need to be able to go from one object to the object after it, even though there are no pointers from one to the other! That’s what I meant by “unbounded provenance”: I need to be able to take a usize
, mask off its tag bits, and cast it into a pointer that is valid for the entire heap. I can, however, promise that there are no references into the GC heap at all, except for transient ones when dereferencing raw pointers.
It’s been years since I last touched that code, but I believe the situation is something like:
/// A Scheme value
#[repr(C)]
struct Value {
ptr: Cell<usize>;
}
#[repr(C)]
struct ConsBody {
header: Cell<usize>,
head: Value,
tail: Value,
}
impl Value {
/// If `self` is a cons cell, returns it. Otherwise, returns `Err`.
///
/// # Safety
///
/// `self` must be rooted for the GC.
unsafe fn check_cons(&self) -> Result<&ConsBody, ()> {
if (self.ptr & 0x7) == CONS_TAG {
Ok(self.ptr as *const ConsBody as &ConsBody)
} else {
Err(())
}
}
/// Get the size of this object in the heap.
///
/// # Safety
///
/// `self` must be rooted for the GC.
unsafe fn size_of(&self) -> usize {
// low-level stuff here
}
/// Get the next object in the heap, or `None` if this is the last one.
///
/// # Safety
///
/// `self` must be rooted for the GC, and `heap_end` must be the actual end of the heap.
unsafe fn next_object(&self, heap_end: usize) -> Option<&Value> {
// more low-level stuff here
}
}
If this were C or C++, I would just give up and use -fno-strict-aliasing
, which is what the OCaml runtime actually does.
That’s what I meant by “unbounded provenance”: I need to be able to take a usize, mask off its tag bits, and cast it into a pointer that is valid for the entire heap.
The idea is that you should get a pointer that is "at least as valid" as all the ptrs that were before cast to an integer and yielded this particular integer. However, the details of this have not been properly worked out yet in the context of either Stacked Borrows or restrict
.
except for transient ones when dereferencing raw pointers.
That sounds like a potential problem, but I assume that could be fixed once https://github.com/rust-lang/rust/issues/73394 is stabilized?
Interesting case where I don't see a viable way to avoid losing provenance info:
A common way to implement an adaptive mutex is some kind of tagged pointer scheme such that the low bits contain the lock state, and the high bits contain a pointer to a waitset no if there are waiters (or something).
In general I think tagged pointers are tricky for this discussion but I think you can make it work (by using unintuitive code). Something like:
p.cast::<u8>().wrapping_add(flags).cast::<T>()
.p.cast::<u8>().wrapping_sub(p as usize & FLAGMASK).cast::<T>()
.(I'm assuming this does indeed keep provenance as it needs to be, and also seems like a good idea — neither of which I'm certain about)
Sadly, this is about as far as I can go with it. Unfortunately, I need to keep this state in an atomic variable. AtomicPtr doesn't have fetch_and/fetch_or/etc, AtomicUsize does. I could cast &AtomicPtr<T>
to &AtomicUsize
just to do the op, but this seems like it would lose the provenance right then and there, which defeats the point.
Giving these up isn't super viable always (altho for some algos it may be), so this probably makes the decision for me here, as the alternative is a cas loop, which is terrible for perf.
Anyway, this seems like a unconsidered problem case¹, so i thought it might be good to record.
¹ But to be clear not at all a rare one. Lock free code often has to do stuff like this, where pointers and integers are packed together, especially on 64 bit (but on 32 bit too, as 64 bit atomics are more costly on some 32 bit arches). Now that I hear miri has a sexy new data race detector, I guess it will be worth actually testing that part of the code under miri more aggressively.
This seems related to https://github.com/crossbeam-rs/crossbeam/pull/490.
Indeed while sequential code can implement "provenance-preserving pointer tagging" in various ways, doing the same atomically does not seem possible without a CAS loop. It is not hard to specify atomic fetch_or
and friends for pointers (with the mask being an integer, just preserve the provenance of the pointer operand), the question is whether that has anything to do with how LLVM thinks about these operations...
This is resolved now, Stacked Borrows tracks raw pointers with individual tags. Integer-pointer casts are nominally handled via angelic choice; this plan has some problems but it seems unlikely that we will want to fix those with the heavy hammer of untagged raw pointers.
Currently, Stacked Borrows teats all raw pointers as identical -- if any raw pointer is allowed to access some memory, then all are. This does not reflect LLVM's pointer treatment though, and it leads to confusing behavior (e.g. https://github.com/rust-lang/miri/issues/1526).
I have long planned to make tracking more precise, but there's a big open question in how to handle integer-pointer casts. We cannot (and do not want to) track integers, so we have to somehow mark memory as "integer-available" when casting to int, and then use that when casting back from an int.