-
cc @laurae2
Benchmark on a dual Xeon Gold 6154 vs MKL:
```
Warmup: 0.9943 s, result 224 (displayed to avoid compiler optimizing warmup away)
A matrix shape: (M: 2304, N: 2304)
B matrix sha…
```
-
## Is your feature request related to a problem? Please describe.
It would be great if scipy's sparse matrix multiplication were multithreaded. It is the bottleneck in many computations in [`scanpy`…
-
In scientific papers using matrices, you'll often see notation where block matrices consist of blocks such as **A** (a full matrix), but also **I** (an appropriately sized identity matrix), **0** and …
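To make the block notation concrete, here is a minimal pure-Python sketch that assembles a matrix from a grid of blocks. The helpers `identity`, `zeros`, and `block` are hypothetical names introduced only for this illustration; they are not part of any existing library API, and the block layout shown is just one example of a matrix built from **A**, **I**, and **0**.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows (hypothetical helper)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def zeros(rows, cols):
    """Return a rows x cols zero matrix as a list of rows (hypothetical helper)."""
    return [[0.0] * cols for _ in range(rows)]


def block(grid):
    """Assemble one matrix from a 2-D grid of blocks (hypothetical helper).

    Every block in the same block row must have the same height, and the
    assembled rows must all end up with the same width.
    """
    result = []
    for block_row in grid:
        height = len(block_row[0])
        if any(len(b) != height for b in block_row):
            raise ValueError("blocks in one block row must have equal height")
        for i in range(height):
            row = []
            for b in block_row:
                row.extend(b[i])  # concatenate row i of each block
            result.append(row)
    if len({len(r) for r in result}) > 1:
        raise ValueError("block columns do not line up")
    return result


A = [[1.0, 2.0],
     [3.0, 4.0]]

# The block matrix [[A, I], [0, A]] with 2 x 2 blocks.
M = block([[A, identity(2)],
           [zeros(2, 2), A]])

for row in M:
    print(row)
```

The identity and zero blocks are sized explicitly here; inferring "appropriately sized" blocks from their neighbours would need extra logic that this sketch leaves out.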
-
While studying autoencoder architecture, I discovered that the similar terms "transposed convolution" and "deconvolution" have caused some confusion. I would like to clarify their differences and expl…
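One way to see the distinction is to write a 1-D convolution as a matrix `C`. The sketch below is pure Python and makes several assumptions that are not taken from the text above: a "valid" cross-correlation with stride 1, a made-up kernel, and the hypothetical helper names `conv_matrix`, `matvec`, and `transpose`. A transposed convolution multiplies by the transpose of `C`, which restores the input's shape but generally not its values; a deconvolution in the signal-processing sense would have to invert the convolution.

```python
def conv_matrix(kernel, n):
    """Matrix C such that C @ x is the 'valid' cross-correlation of a
    length-n signal x with `kernel` (stride 1, no padding)."""
    k = len(kernel)
    return [[kernel[j - i] if 0 <= j - i < k else 0.0 for j in range(n)]
            for i in range(n - k + 1)]


def matvec(M, v):
    """Plain matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


kernel = [1.0, 2.0, 3.0]          # illustrative values only
x = [1.0, 0.0, 0.0, 0.0, 0.0]     # input signal of length 5

C = conv_matrix(kernel, len(x))   # shape 3 x 5
y = matvec(C, x)                  # forward convolution, length 3

# Transposed convolution: multiply by C^T, mapping length 3 back to length 5.
x_up = matvec(transpose(C), y)

print("y    =", y)
print("x_up =", x_up)
print("same shape as x:", len(x_up) == len(x))
print("same values as x:", x_up == x)
```

In this example `x_up` has the same length as `x` but different entries, which is why calling the operation an inverse of the convolution can be misleading.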
-
Hello! It's me again!
I've been optimizing `eant2`, mostly just minor stuff like amortizing allocations and increasing cache hits, parallelism, etc.
The real meat of the work of eant2 is done i…
-
Under the `Interpreting a linear classifier` section, the `ship score` in the included graphic should be `60.75` instead of `61.95`.
Shown below are the simplification steps (it appears that the bi…
-
I am encountering a segmentation fault while running a DGEMM program with AMD BLIS. I am using the Flang compiler with AOCC and linking against the BLIS library. The program fails during the DGEMM op…
-
Following [a blog post on using Apple's BLAS library](https://blogs.mathworks.com/matlab/2023/12/13/life-in-the-fast-lane-making-matlab-even-faster-on-apple-silicon-with-apple-accelerate/) you can swi…
-
Does PyRate (and BDNN specifically) have GPU capability?
-
### Bug description
I tried to compare my matrix multiplication implementation against the equivalent Python implementation. The algorithm worked for smaller matrices, but it crashed for bigger ones…