JuliaLinearAlgebra / Octavian.jl

Multi-threaded BLAS-like library that provides pure Julia matrix multiplication
https://julialinearalgebra.github.io/Octavian.jl/stable/
Other
226 stars 18 forks source link

Analogous for `LinearAlgebra.mul!(C, A, B, α, β)` #144

Closed fipelle closed 2 years ago

fipelle commented 2 years ago

Hi there, I would love to implement Octavian.jl in a few packages of mine (public and private). However, I am making a rather large use of the 5-argument LinearAlgebra.mul!(C, A, B, α, β). I was hoping that matmul! could be enhanced to support the same type of operation. It should be an easy addition, but a quite useful one for casual users like myself.

chriselrod commented 2 years ago

Did you encounter any problems with the current implementation?

help?> Octavian.matmul!
  matmul!(C, A, B[, α, β, max_threads])

  Calculates C = α * A * B + β * C in place, overwriting the contents of C. It may use up to max_threads threads. It will not
  use threads when nested in other threaded code.

Also, the C .*= should be before the matmul!, e.g.:

function matmul!(C, A, B, α, β)
    C .*= β;
    matmul!(C, α .* A, B);
end

The current implementation of course fuses over the scaling with the matmul itself to avoid any extra passes over the arrays.

fipelle commented 2 years ago

Thanks, I was actually editing my original comment to change the order of C .*= as you did (I removed it now). Frankly, I didn't know there was one. I just looked into the documentation and the only option I could find under Multi-threaded matrix multiplication: matmul! and matmul was the one with three arguments.

EDIT

I have just noticed it is described here. My bad!

chriselrod commented 2 years ago

Thanks, I was actually editing my original comment to change the order of C .*= as you did (I removed it now). Frankly, I didn't know there was one. I just looked into the documentation and the only option I could find under Multi-threaded matrix multiplication: matmul! and matmul was the one with three arguments! My bad.

It is in the docstring, but a PR to improve the documentation would be welcome. =)

fipelle commented 2 years ago

Thanks! However, I am going to implement Octavian for MessyTimeSeries.jl and MessyTimeSeriesOptim.jl right now!

I will do a PR :)