-
For Livermore Kernel 7, clang does not vectorize the inner loop because its arithmetic is too complex. https://godbolt.org/z/hr7YGoEPq
```
void kernel_7(void)
{
for (long l=1 ; …
```
-
What version of the product are you using? On what operating system?
Aparapi on Ubuntu 12.04
Please provide any additional information below.
I am looking into the possibility of using the Aparap…
-
If the CPU supports vector instructions (for example, MMX), we should use them to perform operations faster.
-
It would be good to include an introduction to vectorization, especially with respect to R's random number generator functions. Right now we introduce it tangentially in discussing `rnorm()` but might…
-
Currently the vectorize raster function (`ST_DumpAsPolygon`) is taking around 20-30 seconds to process a 1.6MB raster.
It's all being done on 1 CPU. Possibly there's a way to cut this job into piec…
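
One way to picture cutting the job up is to split the raster's rows into near-equal bands that separate workers could process. The sketch below is only an illustration in C, not PostGIS code: `RowBand` and `row_band` are hypothetical names, and it assumes the raster can be processed by row range at all.

```c
#include <stddef.h>

/* Hypothetical type: a half-open band of raster rows [row_begin, row_end). */
typedef struct {
    size_t row_begin;
    size_t row_end;
} RowBand;

/* Return band i (0-based) when `rows` rows are split into `n_bands` bands.
 * Band sizes differ by at most one row; the first (rows % n_bands) bands
 * each take one extra row. Assumes n_bands > 0 and i < n_bands. */
RowBand row_band(size_t rows, size_t n_bands, size_t i)
{
    size_t base  = rows / n_bands;          /* minimum rows per band */
    size_t extra = rows % n_bands;          /* bands that get one more row */
    size_t begin = i * base + (i < extra ? i : extra);
    size_t len   = base + (i < extra ? 1 : 0);
    RowBand band = { begin, begin + len };
    return band;
}
```

A polygon that crosses a band boundary would come back split across two bands, so any scheme like this would still need a merge step afterwards.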
-
As suggested in this comment: https://github.com/NVIDIA/Fuser/pull/2105#discussion_r1592632950
Async loads can only be used for vectorization 4, 8, or 16. If either operand only supports vectorizati…
-
Since we are comparing the value to a "window" of previous values during compression, I believe we may benefit from vectorizing the code - compare the value to multiple values concurrently using vecto…
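
A minimal sketch of the idea in C, under assumptions that are not from the issue: the window holds 32-bit unsigned values, and we only need to know how many entries match. `count_window_matches` is a hypothetical name. The loop is written without branches or early exits so that an optimizing compiler has a chance to auto-vectorize it, comparing several window entries per instruction; whether it actually does depends on the compiler and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count how many of the n values in `window`
 * equal `value`. The body has no branches and no early exit, which
 * keeps it in a form auto-vectorizers can typically handle. */
size_t count_window_matches(const uint32_t *window, size_t n, uint32_t value)
{
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        /* The comparison yields 0 or 1, so this is a branch-free add. */
        matches += (window[i] == value);
    }
    return matches;
}
```

Finding the position of the first match instead would need an early exit, which is usually harder for a compiler to vectorize and may call for explicit SIMD intrinsics.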
-
Currently working on issue https://bugzilla.mozilla.org/show_bug.cgi?id=1887312. I discovered inefficient/high-cost (from WebAssembly compilation point of view) shuffles. These shuffles generate more…