Closed JMMackenzie closed 2 years ago
Note that this is important for when we allow weighted queries. Currently we only allow i16 as our accumulator type, and this will lead to overflows. It's better to use u16 as the default, and perhaps u32 if necessary.
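To make the overflow concern concrete, here is a minimal sketch (not code from this repository). The `accumulate` helper, the impact values, and the weights are all hypothetical; it simply sums weighted impacts in a wide type and reports whether the total would fit in a given accumulator type.

```rust
/// Hypothetical helper: sums `impact * weight` over all impacts and returns
/// the total as `T`, or `None` if the total does not fit in `T`.
/// Because every contribution is non-negative, "the final sum fits" is
/// equivalent to "no intermediate sum overflowed".
fn accumulate<T>(impacts: &[u32], weight: u32) -> Option<T>
where
    T: TryFrom<u32>,
{
    let mut total: u32 = 0;
    for &impact in impacts {
        // Checked arithmetic so the sketch itself never wraps silently.
        total = total.checked_add(impact.checked_mul(weight)?)?;
    }
    T::try_from(total).ok()
}
```

With made-up impacts `[200, 250, 180]` and a weight of 60 the total is 37800, which exceeds `i16::MAX` (32767) but fits in `u16` (max 65535). Doubling the weight to 120 gives 75600, which needs `u32`.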
Unfortunately, some of the SIMD intrinsics seem to require signed 16-bit values; I'm not too sure if we can do the same thing with unsigned values.
@mpetri pointed out: https://doc.rust-lang.org/nightly/core/arch/x86_64/fn._mm256_max_epu16.html
So we should be able to port the current setup to u16 at least.
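As a rough illustration of the intrinsic linked above, the sketch below takes the element-wise maximum of two blocks of 16 unsigned 16-bit values. It is not the project's actual code: the function names `max_u16` and `max_u16_avx2` are hypothetical, and the scalar fallback is only there so the example runs on machines without AVX2.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// AVX2 path: element-wise unsigned max of two 16-lane u16 blocks.
/// Caller must ensure AVX2 is available.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn max_u16_avx2(a: &[u16; 16], b: &[u16; 16]) -> [u16; 16] {
    // Unaligned loads: a [u16; 16] is exactly 256 bits.
    let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);
    let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
    // Unsigned comparison, so lanes above i16::MAX are ordered correctly.
    let vmax = _mm256_max_epu16(va, vb);
    let mut out = [0u16; 16];
    _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, vmax);
    out
}

/// Hypothetical wrapper: uses AVX2 when detected at runtime,
/// otherwise falls back to a scalar loop.
fn max_u16(a: &[u16; 16], b: &[u16; 16]) -> [u16; 16] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safe to call: AVX2 support was just checked.
            return unsafe { max_u16_avx2(a, b) };
        }
    }
    let mut out = [0u16; 16];
    for i in 0..16 {
        out[i] = a[i].max(b[i]);
    }
    out
}
```

The case that matters is a lane holding a value above 32767: a signed 16-bit max would read 40000 as negative and pick the smaller value, whereas the unsigned variant keeps 40000.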
Need to check whether the fast SIMD blocking will work, or if we need a different method of collecting the top-k.