-
When using the code generator with arguments `l=1` and `d=1`, `simint_ostei_worksize(1,1)` returns `59840`,
while using the code generator with arguments `l=2` and `d=1`, `simint_ostei_worksize(1,1)` returns `…
-
We're not currently configuring Axom to use architecture-specific flags in our CPU builds.
Enabling them could improve performance for release builds by allowing SIMD and compiler intrinsics like `popCount`.
We s…
-
After the recent CG meeting (TODO: link notes once uploaded), we need to pin down how we're advancing the intertwined [Relaxed SIMD](https://github.com/WebAssembly/relaxed-simd) and [Profiles](https:/…
-
I'd love to merge the GVN branch from @ajvondrak, but I'm having a few test failures (focusing only on `resource:core` and `resource:basis` at the moment):
```
$ [gvn*] rlwrap ./factor
IN: scratchpad…
-
I tried this code:
```rust
#![feature(portable_simd)]
use std::simd::{num::SimdFloat, Simd};
#[inline(never)]
pub fn px(xx: [Simd; 8], ax: &[Simd], bx: &[Simd], cx: &[Simd]) {
for a in a…
-
Arm (aarch64) CPUs are becoming more popular and more powerful (e.g. AWS Graviton instances). Does Tantivy also specialize its SIMD code for Arm CPUs?
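
For reference, here is a minimal sketch of what an aarch64 specialization can look like in Rust. It is not Tantivy's code: `sum`, `sum_neon`, and `sum_scalar` are hypothetical names picked for illustration, and the only assumption is a toolchain recent enough to have the stable NEON intrinsics in `std::arch::aarch64`. It sums a `u32` slice with NEON when available and falls back to scalar code everywhere else.

```rust
/// Portable scalar fallback (hypothetical helper), used on every architecture.
pub fn sum_scalar(values: &[u32]) -> u32 {
    values.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

/// NEON path (hypothetical helper), compiled only for aarch64 targets.
/// Processes four lanes at a time, then handles the remaining tail.
#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "neon")]
unsafe fn sum_neon(values: &[u32]) -> u32 {
    use std::arch::aarch64::{vaddq_u32, vaddvq_u32, vdupq_n_u32, vld1q_u32};

    let chunks = values.chunks_exact(4);
    let tail = chunks.remainder();

    unsafe {
        // Four-lane accumulator, initialised to zero.
        let mut acc = vdupq_n_u32(0);
        for chunk in chunks {
            // Each chunk has exactly four elements, so the load is in bounds.
            acc = vaddq_u32(acc, vld1q_u32(chunk.as_ptr()));
        }
        // Horizontal add across the four lanes (wraps on overflow).
        let mut total = vaddvq_u32(acc);
        for &v in tail {
            total = total.wrapping_add(v);
        }
        total
    }
}

/// Entry point: picks the NEON path on aarch64 when the CPU reports it,
/// and the scalar path otherwise.
pub fn sum(values: &[u32]) -> u32 {
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            // Safe to call: the feature check above guarantees NEON support.
            return unsafe { sum_neon(values) };
        }
    }
    sum_scalar(values)
}
```

Callers only ever use `sum`, so the same call site compiles on x86_64 and aarch64; an x86 path could be added next to the NEON one behind `cfg(target_arch = "x86_64")` in the same way.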
-
```
libs/QPhiX/include/qphix/blas_new_c.h(893): warning #15552: loop was not vectorized with "simd"
libs/QPhiX/include/qphix/blas_new_c.h(893): warning #15552: loop was not vectorized with "simd"
l…
-
Hi, I have been trying to install BPCells unsuccessfully for some time now, and I am unable to figure out the exact root cause of the issue. The error seems to stem from some issues with the loaded hdf5 li…
-
We have observed an unexpected loss of vectorization by the compiler when compiling the code below. The target was an amd64 machine with AVX2 support.
```rust
type T = u32;
#[i…
-
As mentioned in #13, this would be a useful operation to have for (float) operations in cfavml itself. For subnormal or large numbers, computing it directly will lead to underflow or overflow. I'm in the …