unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org
Other
1.34k stars 174 forks source link

Fix Stacked Borrows violations in zerovec #3510

Open Manishearth opened 1 year ago

Manishearth commented 1 year ago

Stacked borrows is an experimental safety model for Rust, and miri runs it. There are still some rough edges so it's okay to fail a Stacked Borrows run, but it's worth starting to look at them now because there may be upstream fixes required. The model may also be replaced by Tree Borrows though I suspect it'll have similar implications for us.

In zerovec we have a ton of violations in the miri run (cargo +nightly miri test --all-features --no-fail-fast to see all the failures)

A bunch of them do not seem fixable (yet) and I plan to ask upstream. I'll document some of them here first.

cc @sffc

Manishearth commented 1 year ago

An interesting one is the first one (and you see this failure in a bunch of places)

```text test flexzerovec::owned::test::test_basic ... --- STDERR: zerovec flexzerovec::owned::test::test_basic --- error: Undefined Behavior: trying to retag from <193830> for SharedReadOnly permission at alloc76231[0x0], but that tag does not exist in the borrow stack for this location --> src/flexzerovec/slice.rs:141:9 | 141 | &*(remainder as *const [u8] as *const Self) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | | | trying to retag from <193830> for SharedReadOnly permission at alloc76231[0x0], but that tag does not exist in the borrow stack for this location | this error occurs as part of retag at alloc76231[0x0..0x1] | = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information help: <193830> would have been created here, but this is a zero-size retag ([0x0..0x0]) so the tag in question does not exist anywhere --> src/flexzerovec/slice.rs:141:12 | 141 | &*(remainder as *const [u8] as *const Self) | ^^^^^^^^^ = note: BACKTRACE (of the first span): ```

The problem is, here, we are trying to construct a pointer-to-DST (&FlexZeroSlice) from an &[u8] slice. The FlexZeroSlice pointer has a metadata value of one less than the slice it is constructed from (since it starts with a u8 header field). The only stable way we have to construct such a slice is to cast the pointer: on nightly we can use std::ptr::from_raw_parts(). However, to do so we end up going through an intermediary of an &[u8] type with the correct pointer but one less length, which for empty FZSes will be an empty slice. Apparently Stacked Borrows doesn't allow retagging empty slices that way.

Manishearth commented 1 year ago

I'm going to stash my progress in https://github.com/unicode-org/icu4x/pull/3513

Manishearth commented 1 year ago

With https://github.com/unicode-org/icu4x/pull/3515 and the MSRV WIP in #3513, we're down to 32 failures (we started at around 70)

There's a bunch of weird ones in the VZV mutation code, there's also this one in the hashmap code

``` test hashmap::serde::test::test_bincode_zhm ... error: Undefined Behavior: out-of-bounds pointer arithmetic: alloc409585 has size 4, so pointer to 4 bytes starting at offset -4 is out-of-bounds --> /home/manishearth/.cargo/registry/src/index.crates.io-6f17d22bba15001f/t1ha-0.1.0/src/bits.rs:106:25 | 106 | Self::fetch(p.sub(offset as usize)).wrapping_shr(shift as u32) | ^^^^^^^^^^^^^^^^^^^^^^ out-of-bounds pointer arithmetic: alloc409585 has size 4, so pointer to 4 bytes starting at offset -4 is out-of-bounds | = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information = note: BACKTRACE: = note: inside ` as t1ha::bits::MemoryModel>::tail::` at /home/manishearth/.cargo/registry/src/index.crates.io-6f17d22bba15001f/t1ha-0.1.0/src/bits.rs:106:25: 106:47 = note: inside `t1ha::t1ha1::t1h1_body::>` at /home/manishearth/.cargo/registry/src/index.crates.io-6f17d22bba15001f/t1ha-0.1.0/src/t1ha1.rs:107:38: 107:62 = note: inside `t1ha::t1ha1_le` at /home/manishearth/.cargo/registry/src/index.crates.io-6f17d22bba15001f/t1ha-0.1.0/src/t1ha1.rs:28:18: 28:69 = note: inside `t1ha::t1ha0` at /home/manishearth/.cargo/registry/src/index.crates.io-6f17d22bba15001f/t1ha-0.1.0/src/lib.rs:146:5: 146:29 = note: inside `::write` at /home/manishearth/.cargo/registry/src/index.crates.io-6f17d22bba15001f/t1ha-0.1.0/src/lib.rs:93:28: 93:48 = note: inside `::write_u32` at /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/hash/mod.rs:380:9: 380:37 = note: inside `std::hash::impls::::hash::` at /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/hash/mod.rs:812:21: 812:39 note: inside `hashmap::algorithms::compute_hash::` --> utils/zerovec/src/hashmap/algorithms.rs:33:5 | 33 | key.hash(&mut hasher); | ^^^^^^^^^^^^^^^^^^^^^ note: inside ` as std::iter::FromIterator<(u32, &str)>>::from_iter::>` ```
Manishearth commented 1 year ago

The hashmap one is in an external crate (t1ha) that hasn't been updated since 2019. It fails miri on its own, worth investigating separately. There's another VZV failure due to bad databake data, I'm just deleting the test for now (https://github.com/unicode-org/icu4x/pull/2147/files#r1227541004). Someone ought to regen that data correctly, and perhaps add another test that ensure the databaked data has the appropriate contents.

An interesting one is the failure in stress_test around VZV mutation. I believe this is because the from and to in the ptr::copy are overlapping, which is allowed, but potentially not allowed in the stacked borrows model.

``` error: Undefined Behavior: attempting a write access using <287716> at alloc119082[0x9], but that tag does not exist in the borrow stack for this location --> utils/zerovec/src/varzerovec/owned.rs:310:17 | 310 | ptr::copy(block.start, to, block.end.offset_from(block.start) as usize); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | | | attempting a write access using <287716> at alloc119082[0x9], but that tag does not exist in the borrow stack for this location | this error occurs as part of an access at alloc119082[0x8..0xb] | = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information help: <287716> was created by a SharedReadWrite retag at offsets [0x0..0x9] --> utils/zerovec/src/varzerovec/owned.rs:289:31 | 289 | let slice_range = self.entire_slice.as_mut_ptr_range(); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = note: BACKTRACE (of the first span): = note: inside `varzerovec::owned::VarZeroVecOwned::::shift::shift_bytes` at utils/zerovec/src/varzerovec/owned.rs:310:17: 310:88 ```
Manishearth commented 1 year ago

Most of what's left seems to be VZV DST stuff that I don't think can be handled until we have proper pointer metadata APIs.

There is one tricky one in VarULE::to_boxed():

```rust ---- src/map2d/map.rs - map2d::map::ZeroMap2d<'a,K0,K1,V>::remove (line 323) stdout ---- Test executable failed (exit status: 1). stderr: error: Undefined Behavior: trying to retag from <27203> for Unique permission at alloc10977[0x0], but that tag only grants SharedReadOnly permission for this location --> /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:995:9 | 995 | Box(unsafe { Unique::new_unchecked(raw) }, alloc) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | | | trying to retag from <27203> for Unique permission at alloc10977[0x0], but that tag only grants SharedReadOnly permission for this location | this error occurs as part of retag at alloc10977[0x0..0x3] | = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information help: <27203> was created by a SharedReadOnly retag at offsets [0x0..0x3] --> /home/manishearth/dev/icu4x/utils/zerovec/src/ule/mod.rs:364:17 | 364 | Self::from_byte_slice_unchecked(&bytesvec) as *const Self as *mut Self; | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = note: BACKTRACE (of the first span): = note: inside `std::boxed::Box::::from_raw_in` at /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:995:9: 995:58 = note: inside `std::boxed::Box::::from_raw` at /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:939:18: 939:48 note: inside `::to_boxed` --> /home/manishearth/dev/icu4x/utils/zerovec/src/ule/mod.rs:367:13 | 367 | Box::from_raw(ptr) ```

The reason it's tricky is that fn from_byte_slice_unchecked(bytes: &[u8]) -> &Self definitely does not produce a reference you can turn into a Box, but we're only really relying on it to do pointer arithmetic for us. We could replace it with from_byte_slice_unchecked(bytes: *const [u8]) -> *const Self, though I'm not sure if it counts as a retag to turn that into a *mut first. I'd rather not maintain parallel *const and *mut versions of the same API.

Manishearth commented 1 year ago

https://github.com/rust-lang/unsafe-code-guidelines/issues/256 is the upstream issue for a couple of the DST twiddling violations we have seen. Looks like it's fixed under Tree Borrows.

Manishearth commented 1 year ago

Current progress:

failures:
    utils/zerovec/src/lib.rs - (line 92)
    utils/zerovec/src/lib.rs - make_varule (line 448)
    utils/zerovec/src/map/map.rs - map::map::ZeroMap<'a,K,V>::insert_var_v (line 333)
    utils/zerovec/src/map/map.rs - map::map::ZeroMap<'a,K,V>::remove (line 228)
    utils/zerovec/src/map2d/map.rs - map2d::map::ZeroMap2d<'a,K0,K1,V>::remove (line 323)

(minus the hashmap issues from t1ha)

First failure ``` test map2d::map::test::stress_test ... error: Undefined Behavior: trying to retag from <3286187> for Unique permission at alloc1244882[0x0], but that tag only grants SharedReadOnly permission for this location --> /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:995:9 | 995 | Box(unsafe { Unique::new_unchecked(raw) }, alloc) | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | | | trying to retag from <3286187> for Unique permission at alloc1244882[0x0], but that tag only grants SharedReadOnly permission for this location | this error occurs as part of retag at alloc1244882[0x0..0x3] | = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information help: <3286187> was created by a SharedReadOnly retag at offsets [0x0..0x3] --> utils/zerovec/src/ule/mod.rs:364:17 | 364 | Self::from_byte_slice_unchecked(&bytesvec) as *const Self as *mut Self; | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ = note: BACKTRACE (of the first span): = note: inside `std::boxed::Box::::from_raw_in` at /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:995:9: 995:58 = note: inside `std::boxed::Box::::from_raw` at /home/manishearth/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:939:18: 939:48 note: inside `::to_boxed` --> utils/zerovec/src/ule/mod.rs:367:13 | 367 | Box::from_raw(ptr) | ^^^^^^^^^^^^^^^^^^ note: inside ` as map::vecs::MutableZeroVecLike<'_, str>>::zvl_remove` --> utils/zerovec/src/map/vecs.rs:470:19 | 470 | let old = vec.get(index).unwrap().to_boxed(); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `map2d::map::ZeroMap2d::<'_, u16, str, str>::remove` --> utils/zerovec/src/map2d/map.rs:349:9 | 349 | self.keys1.zvl_remove(index); | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ note: inside `map2d::map::test::stress_test` --> utils/zerovec/src/map2d/map.rs:816:22 | 816 | let result = zm2d.remove(&3, "ccc"); // first element | ^^^^^^^^^^^^^^^^^^^^^^ note: inside closure --> utils/zerovec/src/map2d/map.rs:709:22 | 708 | #[test] | ------- in this procedural macro expansion 709 | fn stress_test() { | ^ = note: this error originates in the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info) note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace ```