Open morrisonlevi opened 4 months ago
Edit: this comment is out-of-date with the current implementation, left because it's interesting.
I've figured out the rough equation to do it without a loop:
fn maximum_buckets_in(allocation_size: usize, table_layout: TableLayout) -> usize {
// Given an equation like:
// z >= x * y + x
// x can be maximized by doing:
// x = z / (y + 1)
// If you squint:
// z is the size of the allocation
// y is the table_layout.size
// x is the number of buckets
// But there are details like x needing to be a power of 2,
// and there are some extra bytes mixed in (a possible
// rounding up for table_layout.align, and Group::WIDTH).
/// todo: how do I factor in the ctrl_align?
let z = allocation_size - Group::WIDTH;
let y_plus_1 = table_layout.size + 1;
prev_pow2(z / y_plus_1)
}
I'm not quite sure about the table_layout.ctrl_align
. I need to think about that more. It seems like it can be ignored, but I haven't quite figured out why or the proof for it.
Edit: I tried to find a case where ignoring the ctrl_align
caused a problem programmatically:
type T = (bool, ());
let table_layout = TableLayout::new::<T>();
let begin = {
// there are never less than 4 buckets
let (layout, _) = table_layout.calculate_layout_for(4).unwrap();
layout.size()
};
use rayon::prelude::*;
(begin..=(1 << 47))
.into_par_iter()
.for_each(|allocation_size| {
let buckets = maximum_buckets_in(allocation_size, table_layout).unwrap();
let (layout, _) = table_layout.calculate_layout_for(buckets).unwrap();
let size = layout.size();
assert!(
size <= allocation_size,
"failed {size} <= {allocation_size}"
);
});
I ran it for quite some time with different TableLayout
s. No issues on any of them. I think it has to do with the relationship between rounding down to the previous power of 2, and the fact the rounding for align is always very small, in the range 0..ctrl_align.
I found an issue with the previous implementation with very small TableLayout
size
s with large ctrl_align
s. The optimization in #524 will correct the domain of maximum_buckets_in
because buckets * table_layout.size
(or x * y
above) will now be at least ctrl_align
, so we can safely ignore it in the equation.
Allocators are allowed to return a larger memory chunk than was asked for. If the amount extra is large enough, then the hash map can use the extra space. The Global allocator will not hit this path, because it won't over-size enough to matter, but custom allocators may. An example of an allocator which allocates full system pages is included in the test suite (Unix only because it uses
mmap
).This implements #489. This relies on PR #524 to increase the minimum number of buckets for certain small types, which in turn constrains the domain of
maximum_buckets_in
so that the alignment can be ignored.I haven't done any performance testing. Since this is on the slow path of making a new allocation, the feature should be doable without too much concern about overhead.
This is my first contribution to the project, and I am definitely not an expert in swiss tables. Feedback is very welcome, even nitpicking.