Open Fuuzetsu opened 4 months ago
A panic is usually preferable to a plain branch because all branches leading to a panic are marked as unlikely, and the panic handler is outlined so it's not on the hot path at all, not taking up space in icache etc.
I think this could be considered if benchmarks show that it is beneficial, but I very much doubt it will be possible to do better than a panic here for the aforementioned reasons. assert!
can create non-outlined code, but indexing panics should be fully outlined and have the equivalent of #[cold]
on them.
I think this could be considered if benchmarks show that it is beneficial, but I very much doubt it will be possible to do better than a panic here for the aforementioned reasons.
Can't we just do this then?
fn deref(&self) -> &Self::Target {
let len = self.len as usize;
let s = self.data.as_slice();
#[inline]
#[cold]
fn cold() {}
if len > s.len() {
cold();
s
} else {
&s[..len]
}
}
I guess I can make a change and check on some code I have that it makes any difference at all. But I expect that it does, as per the other ticket I made.
The code under benchmark is:
pub fn get(&self, dir: Dir, p: Price) -> Qty {
self.pqs[dir]
.iter()
.find_map(|PqPair { price, qty }| if p == *price { Some(*qty) } else { None })
.unwrap_or(0)
}
For details on this, see https://github.com/Lokathor/tinyvec/issues/193#issuecomment-2136493274 ; but if preferred/necessary, I can cook something similar that could maybe go into tinyvec
repository.
I attach llvm-mca
outputs before and after the change. It's clear that in a loop, it performs much better.
The only change between the two is this, in tinyvec
diff --git a/src/arrayvec.rs b/src/arrayvec.rs
index 52aeeb1..ab370e3 100644
--- a/src/arrayvec.rs
+++ b/src/arrayvec.rs
@@ -49,6 +49,11 @@ macro_rules! array_vec {
};
}
+#[inline]
+#[cold]
+/// Used to mark some braches as unlikely.
+fn cold() {}
+
/// An array-backed, vector-like data structure.
///
/// * `ArrayVec` has a fixed capacity, equal to the minimum of the array size
@@ -157,7 +162,14 @@ impl<A: Array> Deref for ArrayVec<A> {
#[inline(always)]
#[must_use]
fn deref(&self) -> &Self::Target {
- &self.data.as_slice()[..self.len as usize]
+ let len = self.len as usize;
+ let s = self.data.as_slice();
+ if len > s.len() {
+ cold();
+ s
+ } else {
+ &s[..len]
+ }
}
}
@@ -165,7 +177,14 @@ impl<A: Array> DerefMut for ArrayVec<A> {
#[inline(always)]
#[must_use]
fn deref_mut(&mut self) -> &mut Self::Target {
- &mut self.data.as_slice_mut()[..self.len as usize]
+ let len = self.len as usize;
+ let s = self.data.as_slice_mut();
+ if len > s.len() {
+ cold();
+ s
+ } else {
+ &mut s[..len]
+ }
}
}
For completeness, I attach benchmark results for same suite as I had in #193.
I'm not sure why it's mostly better except for the few ones. I guess I should probably put something together that others can replicate first.
Either way, it certainly doesn't look like nothing is happening, right?
There's a lot of code similar to this, on
ArrayVec
:We invoke something like this every single time we try to get a slice, directly or indirectly (such as via .iter()).
In world of
unsafe
implementations, one would normally maintain the invariant thatlen <= self.data.len()
and unsafely create the slice. This is against whattinyvec
tries to achieve. Sadly, the trade-off is that this will basically always insertpanic
into the generated code.In practice, this panic can only happen if there's a bug in
ArrayVec
(forgetting to update len properly, missed assumption) or user does something sketchy like unsafely modify the field.There is a different option however: change the function to look more like
This (presumably) compiles to code that doesn't have a panic and it behaves the same as usual in non-weird code. The only change in behaviour is that instead of crashing on ArrayVec bug or due to some unsafe user code (when
self.len
grows past capacity), it'll now give the full capacity. For all currently-working code it stays working, but probably faster (not slower, for sure...). Any currently-broken code is presumably already crashing, if such code even exists.We can recover slight bit of previous behaviour with debug asserts I suppose...