-
@gbaraldi requested a writeup of the design I had for vectorized primitives.
I have no particular plans to implement this myself anytime soon, so this is
up for grabs.
# Background
A longstanding (…
-
The Performance section of the README contains the following statement:
"
_-O3 includes optimizations that are expensive in terms of compile time and memory usage. Including vectorization, loop unrolling, and predi…
-
Consider two ways of writing the fabs() math function in C:
```C
#include <stdint.h>
double fabs_pointer(double v)
{
uint64_t u = *(uint64_t*)&v & 0x7FFFFFFFFFFFFFFF;
return *(double*)&u;
}
dou…
-
Is it possible to have a GCC/LLVM-compatible compile option to specify LMUL for auto-vectorization?
For example, `-mriscv-vector-lmul` or `-mrvv-vector-lmul`?
Thanks.
-
| | |
|--------------------|----|
| Bugzilla Link | [PR24998](https://bugs.llvm.org/show_bug.cgi?id=24998) |
| Status | NEW |
| Importance | P normal |
|…
-
I am looking into vectorization for AVX512 in a case where the loop trip count isn't a multiple of the ideal VF of 16.
A simplified version of the problem looks like this[1]:
```
void foosum56(fl…
-
Numba is 0.58. llvmlite is 0.41.0. Ubuntu 22.04 on i7-12700K.
Consider the following script:
```python
from timeit import timeit
from numba import njit
import numpy as np
ARR = np.ones((1_…
-
I tried this code:
```rust
const MAX_STEP_DATA_SIZE: usize = 32;
#[inline]
fn check(byte: u8) -> bool {
return (byte bool {
let mut step_data = [false; MAX_STEP_DATA_SIZE / std:…
-
Basically, I want to use auto-vectorization on the T-Head C906 processor, and I am using Sipeed M1s Dock hardware. I have compiled the toolchain with the standard vector ISA enabled (`--with-arch=rv64gcv`) and…
-
# Summary
I observed that assigning or copying integer vector elements via an STL algorithm with a change in bit width does not engage vectorization, whereas a manually written index-based loop is vectoriz…