JuliaLinearAlgebra / Octavian.jl

Multi-threaded BLAS-like library that provides pure Julia matrix multiplication
https://julialinearalgebra.github.io/Octavian.jl/stable/
Other
226 stars 18 forks source link

Compatibility with HyperDualNumbers and StrideArrays #178

Closed CheukHinHoJerry closed 1 year ago

CheukHinHoJerry commented 1 year ago

I am work on a performance-demanding code with HyperDualNumbers and PtrArray but I got an error with mul!. This is a MWE reproducing the error:

using LinearAlgebra
using HyperDualNumbers: Hyper
using StrideArrays

AA = rand(Float64, (1000, 10))
hAA = Hyper.(AA)

hAA = StrideArrays.PtrArray(hAA)
W = rand(10, 5)

AA * W
hAA * W
ERROR: TypeError: in typeassert, expected Int64, got a value of type Nothing

It looks like a bug and it seems that Octavian is analyzing the size of array elements and how it should do with the matrix multiplication. The standard Julia implementation with Arrays uses a naive fall-back instead of Octavian.

chriselrod commented 1 year ago

You could add a hyper dual extension, similar to the forward diff extension: https://github.com/JuliaLinearAlgebra/Octavian.jl/blob/master/ext/ForwardDiffExt.jl

The standard Julia implementation with Arrays uses a naive fall-back

Unfortunately, that naive fallback is extremely bad. A truly naive triple loop implementation is much better, at least for the small array sizes I've tried.

CheukHinHoJerry commented 1 year ago

Thank you. I got this fixed.

chriselrod commented 1 year ago

Octavian won't actually work without an extension. So I'm not sure what your fix is. Perhaps you're just running another fallback?

CheukHinHoJerry commented 1 year ago

I wrote a similar extension and use it to overload functions in my package. Do you think it is a proper fix?

chriselrod commented 1 year ago

Okay, yes, that sounds good. You're also welcome to contribute such an extension here, if you'd like.

CheukHinHoJerry commented 1 year ago

Sure. It is almost the same as what you already have for ForwardDiff. I will put that in a separate PR in case it needs further fixes and tests.