#![allow(unused_variables)]
#[allow(non_camel_case_types)]
#[derive(Clone, Copy)]
pub struct f64x2(f64,f64);
const ZEROS: [f64x2;4] = [f64x2(0.0,0.0); 4];
#[inline(always)]
fn mand8(init_r: &[f64x2;4], init_i : f64x2) -> u8 { 0 }
pub fn mand64(init_r: &[f64x2;32], init_i : f64x2, out : &mut [u8]) {
let mut tmp_init_r = ZEROS;
for i in 0..8 {
tmp_init_r.copy_from_slice(&init_r[4*i..4*i+4]);
out[i] = mand8(&tmp_init_r, init_i);
}
}
fn main() {
/*
...
let rows : Vec<_> = if width%64==0 {
(0..width).into_par_iter().map(|y|{
let mut tmp_r0 = [f64x2(0.0,0.0);32];
let mut row = vec![0 as u8; (width/8) as usize];
let init_i = f64x2(i0[y], i0[y]);
for x in 0..width/64{
tmp_r0.copy_from_slice(&r0[32*x..32*x+32]);
mand64(&tmp_r0, init_i, &mut row[8*x..8*x + 8]);
}
row
}).collect()
}
...
*/
}
If you take a look at the asm of mand64 you see a core::panicking::panic_bounds_check.
If you replace the slicing of the main &mut row[8*x..8*x + 8] with a slice::array_chunks_mut, the panic vanishes and the code gets optimized better (and the benchmark gets a little faster). So I suggest to detect situations like that &row[8*x..8*x + 8] and suggest to use array_chunks/array_chunks_mut instead.
As a bonus Clippy could detect such case even if it's not inside a loop, in that case it could just suggest a myslice[8*i..8*i + 8].try_into().unrap() at call point. But I suspect this lint is going to have more false positives compared to the array_chunks/array_chunks_mut cases.
In this benchmark: https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/mandelbrot-rust-6.html
There's code like:
If you take a look at the asm of
mand64
you see acore::panicking::panic_bounds_check
.If you replace the slicing of the main
&mut row[8*x..8*x + 8]
with aslice::array_chunks_mut
, the panic vanishes and the code gets optimized better (and the benchmark gets a little faster). So I suggest to detect situations like that&row[8*x..8*x + 8]
and suggest to usearray_chunks/array_chunks_mut
instead.As a bonus Clippy could detect such case even if it's not inside a loop, in that case it could just suggest a
myslice[8*i..8*i + 8].try_into().unrap()
at call point. But I suspect this lint is going to have more false positives compared to thearray_chunks/array_chunks_mut
cases.