-
Sometimes the compiler heuristic with `pragma ivdep` does not get us the desired vectorization. There should be some way to opt in to more aggressive vectorization via `pragma omp simd` for the `OpenM…
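
As a rough illustration of the difference, here is a minimal C sketch. The function name `axpy_omp_simd` is hypothetical and not taken from any project mentioned here. `#pragma omp simd` only takes effect when OpenMP SIMD support is enabled (for example `-fopenmp-simd` on GCC and Clang); otherwise the compiler ignores it and the loop falls back to the usual auto-vectorization heuristics.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * `#pragma omp simd` asserts that the loop iterations may be executed
 * in SIMD lanes, so the caller must guarantee that x and y do not
 * overlap in a way that creates a loop-carried dependence. */
void axpy_omp_simd(size_t n, float a, const float *x, float *y) {
#pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Whether this actually produces better code than an `ivdep` hint depends on the compiler and target, so it is worth checking the vectorization report or the generated assembly.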
-
If we don't set this option, we could miss some vectorization/SIMD optimizations in our Numba conversions.
-
Row reduction on numeric types (as opposed to symbolic `Expression`s) is fairly SIMD friendly. However, when I've profiled this, `RowReduce` doesn't seem to take a lot of time. The other big row reduc…
-
### 🐛 Describe the bug
new_perf_regression in 2024-02-11
| suite | name | batch_size_new | speed_up_new | inductor_new | eager_new | compilation_latency_new | batch_size_old | speed_up_old | induc…
-
### Search before asking
- [X] I have searched the [issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no similar feature requirement.
### Description
Vectorization has …
-
**Describe the bug**
I cannot understand why min and max have different vectorization macros (one has `vector_disabled`, while the other has `vector_simd`). This looks like a potential bug.
**…
-
Now that SIMD intrinsics for x86 have been stabilized, it might be worthwhile to add explicit SIMD to accelerate unmasking. For example, autobahn-python [uses](https://github.com/crossbario/autobahn-p…
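
For context, WebSocket unmasking XORs each payload byte with the 4-byte masking key, cycling through the key (RFC 6455, section 5.3). The sketch below is written in C for illustration and is not this project's code; the function names `unmask_scalar` and `unmask_wide` are hypothetical. It uses plain 64-bit words rather than explicit SIMD intrinsics, but the same idea (repeat the key across a wide register, XOR a block at a time, finish the tail byte by byte) carries over to wider vector registers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reference version: one byte at a time, key index cycles 0..3. */
void unmask_scalar(uint8_t *buf, size_t len, const uint8_t key[4]) {
    for (size_t i = 0; i < len; ++i) {
        buf[i] ^= key[i & 3];
    }
}

/* Wide version: XOR 8 bytes per iteration using the key repeated twice.
 * memcpy is used for the loads and stores so the code has no alignment
 * or aliasing problems; copying the key bytewise keeps it independent
 * of endianness. */
void unmask_wide(uint8_t *buf, size_t len, const uint8_t key[4]) {
    uint8_t key8[8];
    uint64_t k;
    size_t i = 0;

    memcpy(key8, key, 4);
    memcpy(key8 + 4, key, 4);
    memcpy(&k, key8, 8);

    /* Each block starts at a multiple of 8, so the key phase is 0. */
    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, buf + i, 8);
        w ^= k;
        memcpy(buf + i, &w, 8);
    }
    /* Remaining 0..7 bytes. */
    for (; i < len; ++i) {
        buf[i] ^= key[i & 3];
    }
}
```

Because XOR is its own inverse, the same routine both masks and unmasks a payload.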
-
Performance of the random ray solver in OpenMC is highly sensitive to the performance of the flux attenuation kernel that forms the inner loop of the simulation. This inner loop is responsible for per…
-
## Statement
I found a missed optimization, appearing after #84628, in a C++ code pattern that is widely used in [Verilator](https://github.com/verilator/verilator.git)-generated C++ code which consumes [CIRCT](htt…
-