JuliaLinearAlgebra / libblastrampoline

Using PLT trampolines to provide a BLAS and LAPACK demuxing library.

Detect interface type for LAPACK #35

Closed ViralBShah closed 1 year ago

ViralBShah commented 3 years ago

This uses the LAPACK that is registered in Yggdrasil. It looks like we only have a BLAS test for detecting the interface type, and we should probably do the same with a LAPACK function as well.

We need this so that we can load BLAS and LAPACK separately (as will be the case if we use a BLIS-provided BLAS). So far we have only tested OpenBLAS and MKL, each of which ships both BLAS and LAPACK.

julia> using LAPACK_jll

julia> BLAS.lbt_forward(LAPACK_jll.liblapack)
Error: no BLAS/LAPACK library loaded!
Unable to autodetect interface type of "/Users/viral/.julia/artifacts/db87d23fe12e550a2852beb75a9814b73fc91424/lib/liblapack.3.9.0.dylib"
0

ViralBShah commented 2 years ago

We probably need to fix this so that we can use Accelerate on macOS, which is essential for M1.

ViralBShah commented 2 years ago

Here is what happens on macOS when forwarding LAPACK alongside the BLAS forwards for Accelerate:

julia> BLAS.lbt_forward(LAPACK_jll.liblapack, clear=false, verbose=true)
Generating forwards to /Users/viral/.julia/artifacts/54d54e7b7749e3e8c5af5ad9c346325220c9d553/lib/liblapack.3.10.0.dylib
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
Unable to autodetect complex return style of "/Users/viral/.julia/artifacts/54d54e7b7749e3e8c5af5ad9c346325220c9d553/lib/liblapack.3.10.0.dylib"
0

Will Apple's BLAS and the Yggdrasil-built LAPACK work together? cc @giordano @staticfloat

staticfloat commented 2 years ago

You need to load Apple Accelerate first:

On Apple Silicon (which has no complex return style problems):

julia> using LinearAlgebra, LAPACK_jll
       BLAS.lbt_forward("/System/Library/Frameworks/Accelerate.framework/Accelerate"; verbose=true, clear=true)
       BLAS.lbt_forward(LAPACK_jll.liblapack_path; verbose=true)
Generating forwards to /System/Library/Frameworks/Accelerate.framework/Accelerate
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
Processed 4945 symbols; forwarded 1705 symbols with 32-bit interface and mangling to a suffix of ""
Generating forwards to /Users/sabae/.julia/artifacts/65c65bc8413bbca96d1d988b65cdae3d9a64cedb/lib/liblapack.3.10.0.dylib
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
Processed 4945 symbols; forwarded 4945 symbols with 32-bit interface and mangling to a suffix of ""
4945

On x86_64 (which does have complex return style problems):

julia> using LinearAlgebra, LAPACK_jll
       BLAS.lbt_forward("/System/Library/Frameworks/Accelerate.framework/Accelerate"; verbose=true, clear=true)
       BLAS.lbt_forward(LAPACK_jll.liblapack_path; verbose=true)
Generating forwards to /System/Library/Frameworks/Accelerate.framework/Accelerate
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
 -> Autodetected f2c-style calling convention
 - [2732] f2c(cdotc_)
 - [2733] f2c(cdotu_)
 - [3856] f2c(sasum_)
 - [3872] f2c(scasum_)
 - [3873] f2c(scnrm2_)
 - [3879] f2c(sdot_)
 - [3880] f2c(sdsdot_)
 - [4026] f2c(slamc3_)
 - [4027] f2c(slamch_)
 - [4153] f2c(snrm2_)
 - [4436] f2c(zdotc_)
 - [4437] f2c(zdotu_)
Processed 4945 symbols; forwarded 1705 symbols with 32-bit interface and mangling to a suffix of ""
Generating forwards to /Users/julia/.julia/artifacts/054602666af60d1f1b093c1edcf466c7f86e6d8f/lib/liblapack.3.9.0.dylib
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
 -> Autodetected gfortran calling convention
Processed 4945 symbols; forwarded 4945 symbols with 32-bit interface and mangling to a suffix of ""
4945

ViralBShah commented 2 years ago

Is it possible that LAPACK on the same platform (ARM macOS) has different settings (complex return style and such) than what Apple ships in Accelerate?

ViralBShah commented 2 years ago

Doesn't work for me.

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.0-DEV.1094 (2022-08-07)
 _/ |\__'_|_|_|\__'_|  |  vs/sparsesysimage/70f74fe60d (fork: 1 commits, 0 days)
|__/                   |

julia> using LinearAlgebra, LAPACK_jll

julia> BLAS.lbt_forward("/System/Library/Frameworks/Accelerate.framework/Accelerate"; verbose=true, clear=true)
Generating forwards to /System/Library/Frameworks/Accelerate.framework/Accelerate
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
 -> Autodetected argument-passing complex return style
 -> Autodetected f2c-style calling convention
 - [2732] complex(cdotc_)
 - [2732] f2c(cdotc_)
 - [2733] complex(cdotu_)
 - [2733] f2c(cdotu_)
 - [3856] f2c(sasum_)
 - [3872] f2c(scasum_)
 - [3873] f2c(scnrm2_)
 - [3879] f2c(sdot_)
 - [3880] f2c(sdsdot_)
 - [4026] f2c(slamc3_)
 - [4027] f2c(slamch_)
 - [4153] f2c(snrm2_)
 - [4436] complex(zdotc_)
 - [4436] f2c(zdotc_)
 - [4437] complex(zdotu_)
 - [4437] f2c(zdotu_)
Processed 4945 symbols; forwarded 1705 symbols with 32-bit interface and mangling to a suffix of ""
1705

julia> BLAS.lbt_forward(LAPACK_jll.liblapack_path; verbose=true)
Generating forwards to /Users/viral/.julia/artifacts/54d54e7b7749e3e8c5af5ad9c346325220c9d553/lib/liblapack.3.10.0.dylib
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
Unable to autodetect complex return style of "/Users/viral/.julia/artifacts/54d54e7b7749e3e8c5af5ad9c346325220c9d553/lib/liblapack.3.10.0.dylib"
0

ViralBShah commented 2 years ago

With #82 this works. Is it a problem that the two libraries (BLAS from Accelerate and LAPACK from Ygg) have different Fortran calling conventions and different complex return styles?

julia> BLAS.lbt_forward("/System/Library/Frameworks/Accelerate.framework/Accelerate"; verbose=true, clear=true)
Generating forwards to /System/Library/Frameworks/Accelerate.framework/Accelerate
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
 -> Autodetected argument-passing complex return style
 -> Autodetected f2c-style calling convention
 - [2732] complex(cdotc_)
 - [2732] f2c(cdotc_)
 - [2733] complex(cdotu_)
 - [2733] f2c(cdotu_)

julia> BLAS.lbt_forward(LAPACK_jll.liblapack_path; verbose=true)
Generating forwards to /Users/viral/.julia/artifacts/54d54e7b7749e3e8c5af5ad9c346325220c9d553/lib/liblapack.3.10.0.dylib
 -> Autodetected symbol suffix ""
 -> Autodetected interface LP64 (32-bit)
 -> Autodetected normal complex return style
 -> Autodetected gfortran calling convention
Processed 4945 symbols; forwarded 4945 symbols with 32-bit interface and mangling to a suffix of ""

ViralBShah commented 2 years ago

It doesn't seem to work, though. Right after forwarding BLAS to Accelerate, running peakflops fails:

Processed 4945 symbols; forwarded 1705 symbols with 32-bit interface and mangling to a suffix of ""
1705

julia> peakflops()
Error: no BLAS/LAPACK library loaded!
Error: no BLAS/LAPACK library loaded!
ERROR: AssertionError: a2[1, 1] == n
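The failing assertion, `a2[1, 1] == n`, comes from `peakflops` multiplying an n×n matrix of ones by itself and checking one entry of the product. A smaller version of that check can be sketched as below; it calls `BLAS.gemm` directly so the multiplication definitely goes through the forwarded BLAS. The helper name `gemm_sanity_check` is made up for illustration and is not part of LinearAlgebra.

```julia
using LinearAlgebra

# Multiply two all-ones n×n matrices through BLAS; every entry of the
# product should equal n. `gemm_sanity_check` is a hypothetical helper name.
function gemm_sanity_check(n::Integer = 64)
    a = ones(Float64, n, n)
    a2 = BLAS.gemm('N', 'N', a, a)   # dispatches to dgemm via libblastrampoline
    return a2[1, 1] == n
end

println("gemm sanity check passed: ", gemm_sanity_check())
```

If no library provides the symbols Julia calls, the check should come back `false` instead of throwing, which makes it convenient to run right after each `BLAS.lbt_forward` call.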
staticfloat commented 2 years ago

> Is it a problem that the different libraries (BLAS From Accelerate and LAPACK from Ygg) have different fortran calling conventions and different complex return style?

I don't think it should be a problem, as Accelerate should call its own symbols internally. We expose a "gfortran" calling convention and "normal" complex return style, so LBT will automatically translate Accelerate to conform to that, allowing LAPACK to call Accelerate without problems. If we had a library that called into LBT expecting a non-gfortran calling convention, that would be a problem.
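To see what LBT detected for each loaded library without re-running `lbt_forward` in verbose mode, something like the following sketch may help. It assumes that `BLAS.get_config()` returns entries with `libname`, `interface`, `complex_retstyle` and `f2c` properties, as recent Julia versions appear to; the last two are guarded in case a given version lacks them. `prop_or_missing` and `detected_conventions` are hypothetical helper names.

```julia
using LinearAlgebra

# Read a property if this Julia version exposes it, otherwise return `missing`.
prop_or_missing(lib, name::Symbol) = hasproperty(lib, name) ? getproperty(lib, name) : missing

# `detected_conventions` is a hypothetical helper, not part of LinearAlgebra.
# The `complex_retstyle` and `f2c` property names are assumptions, hence the guard.
function detected_conventions()
    return [(library = basename(lib.libname),
             interface = lib.interface,
             complex_retstyle = prop_or_missing(lib, :complex_retstyle),
             f2c = prop_or_missing(lib, :f2c))
            for lib in BLAS.get_config().loaded_libs]
end

foreach(println, detected_conventions())
```

Each printed entry corresponds to one "Generating forwards to ..." section of the verbose output, so differing conventions between the BLAS and LAPACK libraries show up side by side.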

> Doesn't seem to work though:

Your Julia is trying to call ILP64 symbols, which you cleared out by passing clear=true. As built, Julia will never use Accelerate, since Accelerate is LP64-only. You would need to build Julia to use LP64 instead.
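A quick way to check for this mismatch is to compare the integer width Julia passes to BLAS with the interfaces of the libraries LBT has loaded. The sketch below assumes a 64-bit `BlasInt` means Julia calls ILP64 symbols; `expected_interface` and `has_usable_backend` are hypothetical helper names.

```julia
using LinearAlgebra

# Interface this Julia build expects: ILP64 if BlasInt is 64-bit, LP64 if 32-bit.
expected_interface() = sizeof(LinearAlgebra.BlasInt) == 8 ? :ilp64 : :lp64

# True if at least one library loaded into LBT offers the expected interface.
# Both function names here are hypothetical helpers for illustration.
function has_usable_backend()
    want = expected_interface()
    return any(lib -> lib.interface == want, BLAS.get_config().loaded_libs)
end

println("Julia expects ", expected_interface(),
        "; usable backend loaded: ", has_usable_backend())
```

After forwarding only an LP64 library with clear=true on an ILP64 build of Julia, `has_usable_backend()` would be expected to return `false`, matching the "no BLAS/LAPACK library loaded" errors above.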

ViralBShah commented 2 years ago

I see - with MKL, we have ILP64, and we just forward the symbols. I suppose if we had a solution to https://github.com/JuliaLang/julia/issues/43304, we could use Accelerate seamlessly. Otherwise, it is unlikely we will.

ViralBShah commented 2 years ago

@perrutquist I guess this is where SetBlasInt.jl could come in handy. In theory, we now have a full LP64 Accelerate BLAS + LAPACK that we can use through SetBlasInt on macOS. It's all a bit of a hack, but it should probably work.

ViralBShah commented 1 year ago

I think we are OK without this. The original reason was Apple's BLAS, but we have ILP64 on macOS now: https://discourse.julialang.org/t/appleaccelerate-jl-v0-4-0/99351