Closed staticfloat closed 1 year ago
This will naturally fail CI on any macOS older than 13.3.
Anecdotally, Accelerate on my M1 Pro runs the LinearAlgebra test suite pretty quickly:
Running parallel tests with:
nworkers() = 8
nthreads() = 1
Sys.CPU_THREADS = 8
Sys.total_memory() = 16.000 GiB
Sys.free_memory() = 1.479 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
LinearAlgebra/bidiag (9) | started at 2023-04-12T12:13:15.792
LinearAlgebra/diagonal (7) | started at 2023-04-12T12:13:15.839
LinearAlgebra/special (8) | started at 2023-04-12T12:13:15.883
LinearAlgebra/symmetric (6) | started at 2023-04-12T12:13:15.883
LinearAlgebra/triangular (3) | started at 2023-04-12T12:13:15.884
LinearAlgebra/addmul (2) | started at 2023-04-12T12:13:15.884
LinearAlgebra/matmul (4) | started at 2023-04-12T12:13:15.884
LinearAlgebra/dense (5) | started at 2023-04-12T12:13:15.884
LinearAlgebra/special (8) | 97.45 | 2.87 | 2.9 | 13718.53 | 875.08
LinearAlgebra/qr (8) | started at 2023-04-12T12:14:53.531
LinearAlgebra/bidiag (9) | 106.62 | 3.55 | 3.3 | 13216.62 | 1039.58
LinearAlgebra/cholesky (9) | started at 2023-04-12T12:15:02.584
LinearAlgebra/dense (5) | 142.17 | 5.91 | 4.2 | 17388.80 | 1441.88
LinearAlgebra/blas (5) | started at 2023-04-12T12:15:38.173
LinearAlgebra/diagonal (7) | 147.45 | 6.44 | 4.4 | 18475.31 | 1209.39
LinearAlgebra/lu (7) | started at 2023-04-12T12:15:43.451
LinearAlgebra/qr (8) | 54.92 | 2.56 | 4.7 | 6552.89 | 945.16
LinearAlgebra/uniformscaling (8) | started at 2023-04-12T12:15:48.463
LinearAlgebra/cholesky (9) | 54.48 | 3.47 | 6.4 | 5250.13 | 1039.58
LinearAlgebra/structuredbroadcast (9) | started at 2023-04-12T12:15:57.069
LinearAlgebra/addmul (2) | 163.83 | 4.79 | 2.9 | 15629.88 | 625.73
LinearAlgebra/hessenberg (2) | started at 2023-04-12T12:15:59.829
LinearAlgebra/symmetric (6) | 169.04 | 6.57 | 3.9 | 18915.86 | 1102.09
LinearAlgebra/svd (6) | started at 2023-04-12T12:16:05.030
LinearAlgebra/matmul (4) | 175.18 | 6.65 | 3.8 | 21358.36 | 831.70
LinearAlgebra/eigen (4) | started at 2023-04-12T12:16:11.197
LinearAlgebra/blas (5) | 33.83 | 2.18 | 6.4 | 2384.33 | 1441.88
LinearAlgebra/tridiag (5) | started at 2023-04-12T12:16:12.010
LinearAlgebra/structuredbroadcast (9) | 31.07 | 3.32 | 10.7 | 2900.56 | 1039.58
LinearAlgebra/lapack (9) | started at 2023-04-12T12:16:28.183
LinearAlgebra/uniformscaling (8) | 47.64 | 3.37 | 7.1 | 3557.69 | 1105.19
LinearAlgebra/lq (8) | started at 2023-04-12T12:16:36.137
LinearAlgebra/hessenberg (2) | 47.76 | 3.17 | 6.6 | 3818.54 | 712.42
LinearAlgebra/adjtrans (2) | started at 2023-04-12T12:16:47.611
LinearAlgebra/svd (6) | 44.37 | 5.12 | 11.5 | 3351.13 | 1102.09
LinearAlgebra/generic (6) | started at 2023-04-12T12:16:49.421
LinearAlgebra/lapack (9) | 28.29 | 2.55 | 9.0 | 1628.27 | 1039.58
LinearAlgebra/schur (9) | started at 2023-04-12T12:16:56.510
LinearAlgebra/tridiag (5) | 47.93 | 4.71 | 9.8 | 2726.56 | 1441.88
LinearAlgebra/bunchkaufman (5) | started at 2023-04-12T12:16:59.965
LinearAlgebra/lq (8) | 33.69 | 3.54 | 10.5 | 1793.97 | 1105.19
LinearAlgebra/givens (8) | started at 2023-04-12T12:17:09.844
LinearAlgebra/lu (7) | 94.47 | 10.87 | 11.5 | 5976.82 | 1209.39
LinearAlgebra/pinv (7) | started at 2023-04-12T12:17:17.950
LinearAlgebra/adjtrans (2) | 31.05 | 3.38 | 10.9 | 2257.61 | 728.20
LinearAlgebra/factorization (2) | started at 2023-04-12T12:17:18.677
LinearAlgebra/eigen (4) | 68.53 | 7.26 | 10.6 | 4228.16 | 831.70
LinearAlgebra/abstractq (4) | started at 2023-04-12T12:17:19.739
LinearAlgebra/givens (8) | 10.21 | 1.82 | 17.8 | 397.91 | 1105.19
LinearAlgebra/ldlt (8) | started at 2023-04-12T12:17:20.074
LinearAlgebra/ldlt (8) | 1.06 | 0.00 | 0.0 | 61.72 | 1105.19
LinearAlgebra/factorization (2) | 4.06 | 0.49 | 12.0 | 304.59 | 815.39
LinearAlgebra/abstractq (4) | 3.86 | 0.24 | 6.2 | 331.80 | 913.75
LinearAlgebra/bunchkaufman (5) | 23.79 | 2.78 | 11.7 | 1370.65 | 1441.88
LinearAlgebra/pinv (7) | 6.97 | 0.75 | 10.7 | 855.20 | 1428.39
LinearAlgebra/generic (6) | 38.02 | 3.72 | 9.8 | 2491.66 | 1226.55
LinearAlgebra/schur (9) | 84.99 | 1.83 | 2.2 | 1404.24 | 1039.58
LinearAlgebra/triangular (3) | 306.81 | 18.81 | 6.1 | 33163.34 | 2196.91
Test Summary: | Pass Broken Total Time
Overall | 96483 17 96500 5m08.2s
SUCCESS
Test Summary: | Time
Full LinearAlgebra test suite | None 5m12.7s
Testing AppleAccelerate tests passed
Versus OpenBLAS:
Running parallel tests with:
nworkers() = 8
nthreads() = 1
Sys.CPU_THREADS = 8
Sys.total_memory() = 16.000 GiB
Sys.free_memory() = 2.436 GiB
Test (Worker) | Time (s) | GC (s) | GC % | Alloc (MB) | RSS (MB)
LinearAlgebra/diagonal (7) | started at 2023-04-12T12:19:43.415
LinearAlgebra/special (8) | started at 2023-04-12T12:19:43.479
LinearAlgebra/matmul (4) | started at 2023-04-12T12:19:43.582
LinearAlgebra/addmul (2) | started at 2023-04-12T12:19:43.582
LinearAlgebra/triangular (3) | started at 2023-04-12T12:19:43.583
LinearAlgebra/symmetric (6) | started at 2023-04-12T12:19:43.583
LinearAlgebra/dense (5) | started at 2023-04-12T12:19:43.583
LinearAlgebra/bidiag (9) | started at 2023-04-12T12:19:43.586
LinearAlgebra/special (8) | 101.46 | 3.33 | 3.3 | 13718.61 | 917.61
LinearAlgebra/qr (8) | started at 2023-04-12T12:21:25.276
LinearAlgebra/bidiag (9) | 114.20 | 3.99 | 3.5 | 13216.68 | 956.81
LinearAlgebra/cholesky (9) | started at 2023-04-12T12:21:37.868
LinearAlgebra/dense (5) | 145.96 | 5.71 | 3.9 | 17388.89 | 1106.66
LinearAlgebra/blas (5) | started at 2023-04-12T12:22:09.665
LinearAlgebra/diagonal (7) | 158.96 | 7.93 | 5.0 | 18475.02 | 1177.16
LinearAlgebra/lu (7) | started at 2023-04-12T12:22:22.653
LinearAlgebra/qr (8) | 57.88 | 3.25 | 5.6 | 6552.97 | 990.30
LinearAlgebra/uniformscaling (8) | started at 2023-04-12T12:22:23.163
LinearAlgebra/cholesky (9) | 56.94 | 3.20 | 5.6 | 5249.92 | 979.05
LinearAlgebra/structuredbroadcast (9) | started at 2023-04-12T12:22:34.828
LinearAlgebra/symmetric (6) | 175.63 | 7.43 | 4.2 | 18916.03 | 1136.31
LinearAlgebra/hessenberg (6) | started at 2023-04-12T12:22:39.359
LinearAlgebra/blas (5) | 34.75 | 2.57 | 7.4 | 2384.31 | 1226.16
LinearAlgebra/svd (5) | started at 2023-04-12T12:22:44.459
LinearAlgebra/matmul (4) | 183.24 | 7.50 | 4.1 | 21417.15 | 749.80
LinearAlgebra/eigen (4) | started at 2023-04-12T12:22:46.928
LinearAlgebra/structuredbroadcast (9) | 33.27 | 3.61 | 10.9 | 2900.77 | 979.05
LinearAlgebra/tridiag (9) | started at 2023-04-12T12:23:08.125
LinearAlgebra/hessenberg (6) | 32.13 | 2.63 | 8.2 | 2461.15 | 1136.31
LinearAlgebra/lapack (6) | started at 2023-04-12T12:23:11.506
LinearAlgebra/uniformscaling (8) | 48.48 | 3.70 | 7.6 | 3557.68 | 1007.66
LinearAlgebra/lq (8) | started at 2023-04-12T12:23:11.649
LinearAlgebra/svd (5) | 44.31 | 3.41 | 7.7 | 2903.87 | 1226.16
LinearAlgebra/adjtrans (5) | started at 2023-04-12T12:23:28.782
LinearAlgebra/lapack (6) | 24.98 | 2.24 | 9.0 | 1414.08 | 1136.31
LinearAlgebra/generic (6) | started at 2023-04-12T12:23:36.518
LinearAlgebra/lq (8) | 30.77 | 3.02 | 9.8 | 1794.01 | 1007.66
LinearAlgebra/schur (8) | started at 2023-04-12T12:23:42.448
LinearAlgebra/tridiag (9) | 40.00 | 3.86 | 9.7 | 2215.41 | 979.05
LinearAlgebra/bunchkaufman (9) | started at 2023-04-12T12:23:48.155
LinearAlgebra/eigen (4) | 63.84 | 6.05 | 9.5 | 4228.25 | 749.80
LinearAlgebra/givens (4) | started at 2023-04-12T12:23:50.788
LinearAlgebra/lu (7) | 96.01 | 10.01 | 10.4 | 5976.80 | 1177.16
LinearAlgebra/pinv (7) | started at 2023-04-12T12:23:58.674
LinearAlgebra/givens (4) | 8.93 | 0.79 | 8.9 | 498.21 | 749.80
LinearAlgebra/factorization (4) | started at 2023-04-12T12:23:59.746
LinearAlgebra/adjtrans (5) | 32.41 | 3.09 | 9.5 | 1977.57 | 1226.16
LinearAlgebra/abstractq (5) | started at 2023-04-12T12:24:01.227
LinearAlgebra/factorization (4) | 4.25 | 0.65 | 15.3 | 223.63 | 749.80
LinearAlgebra/ldlt (4) | started at 2023-04-12T12:24:04.039
LinearAlgebra/ldlt (4) | 1.40 | 0.00 | 0.0 | 70.48 | 749.80
LinearAlgebra/abstractq (5) | 6.95 | 2.07 | 29.8 | 283.06 | 1226.16
LinearAlgebra/pinv (7) | 10.68 | 2.38 | 22.3 | 855.16 | 1411.39
LinearAlgebra/generic (6) | 37.83 | 4.00 | 10.6 | 2510.75 | 1272.36
LinearAlgebra/bunchkaufman (9) | 30.87 | 2.65 | 8.6 | 2729.90 | 1360.27
LinearAlgebra/triangular (3) | 326.29 | 21.67 | 6.6 | 33163.40 | 2458.39
LinearAlgebra/schur (8) | 88.67 | 2.56 | 2.9 | 1484.38 | 1007.66
LinearAlgebra/addmul (2) | 420.11 | 13.89 | 3.3 | 37199.14 | 1532.12
Test Summary: | Pass Broken Total Time
Overall | 106833 17 106850 7m01.8s
SUCCESS
Although I do see that we run slightly more tests on OpenBLAS; not sure why that is.
As an update, macOS v13.4 beta 3 fixes the dsptrf bug; running the LinearAlgebra test suite with only Accelerate loaded (no external LAPACK) passes!
Wow that's quick. I suppose in that case the simplest thing is to make macOS 13.4 the min version and then remove all the LAPACK overlay stuff.
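The version thresholds discussed so far can be summarized in a small sketch. The function name `lapack_strategy` and the returned symbols are hypothetical, chosen only for illustration. The 13.3 and 13.4 cutoffs come from the comments above, and this is not necessarily how the package itself performs the check.

```julia
# Hypothetical helper: map a macOS version to the setup discussed in this thread.
# The cutoffs (13.3 and 13.4) are taken from the discussion; all names are illustrative.
function lapack_strategy(macos::VersionNumber)
    if macos < v"13.3"
        return :unsupported                      # CI for this PR fails on older macOS
    elseif macos < v"13.4"
        return :accelerate_blas_external_lapack  # dsptrf bug: overlay LAPACK_jll
    else
        return :accelerate_only                  # dsptrf fixed, no external LAPACK
    end
end

println(lapack_strategy(v"13.3.1"))
```

Keeping the decision in one function would make it easy to drop the middle branch once 13.4 becomes the minimum supported version.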
I am trying to run the ILP64 Accelerate branch on macOS 13.3.1 (on an M2 chip). I get an error when LBT tries to load LAPACK from the LAPACK_jll artifact. The error I get is:
Unable to autodetect interface type of "/Users/nicholasengelking/.julia/artifacts/65c65bc8413bbca96d1d988b65cdae3d9a64cedb/lib/liblapack.3.10.0.dylib"
This seems to indicate that there was an error in the autodetect_interface function in LBT, which tries to determine whether the library uses 32-bit or 64-bit integers (LP64 vs. ILP64).
I've tried up-ing LAPACK_jll and running Pkg.instantiate(), but no joy. I assume this is some kind of upstream issue with artifacts, packages, or LBT, or maybe the build of the LAPACK lib?
Any help would be appreciated. I am not on the 13.4 beta with the fix for dsptrf, so my understanding is that I need to use this external LAPACK lib with Accelerate BLAS.
This is on the head of sf/ilp64_accelerate, commit d05a891.
@Moblin88 This works for me. I just pushed an update for LAPACK 3.11 as well, and made that the minimum. Can you try it out?
Patch coverage: 82.50% and project coverage change: +2.54 :tada:
Comparison is base (c5186a7) 80.26% compared to head (729a176) 82.81%.
:exclamation: Current head 729a176 differs from pull request most recent head e3753ce. Consider uploading reports for the commit e3753ce to get more accurate results.
@staticfloat I have reinstated the earlier capabilities in this package and would like to merge this PR, if it looks good to you. The DSP and Array functions do not bring additional package dependencies, so are perhaps ok to leave here for now.
We can refactor this into more packages later, but removing the code felt like we would forget about it. It works fine and passes tests, and hopefully will help others build further.
It's working for me now on the master branch that was just merged with LAPACK 3.11.0. It's also WAYY faster to multiply large dense matrices!
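A claim like this can be checked with a quick timing. Below is a minimal sketch that uses only the LinearAlgebra standard library. `time_matmul` is a hypothetical helper and the matrix size is arbitrary. Running it once in a fresh session with the default backend, and once after `using AppleAccelerate`, should give a rough comparison.

```julia
using LinearAlgebra

# Hypothetical helper: time one dense n-by-n matrix product.
function time_matmul(n::Integer)
    A = rand(n, n)
    B = rand(n, n)
    A * B                  # warm-up call so compilation is not included in the timing
    return @elapsed A * B  # seconds taken by a single product
end

println("2000x2000 matmul: ", time_matmul(2000), " s")
```

A single run is noisy, so repeating the measurement a few times and taking the minimum gives a more stable number.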
We will be able to remove the LAPACK dependency once macOS 13.4 is out.
Is it possible to do a new release of AppleAccelerate.jl?
My preference is to wait for macOS 13.4, remove the LAPACK dependency, and then make a release. Would you prefer sooner?
No, that's fine. I just wanted to add a comment about AppleAccelerate.jl in the documentation of JuliaHSL and explain that using AppleAccelerate loads an LP64 BLAS/LAPACK, like using MKL.
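To check which interface is actually loaded in a given session, the LBT configuration can be inspected. This is a minimal sketch that relies on `BLAS.get_config()` from the LinearAlgebra standard library (Julia 1.7 or later). `describe_blas_backends` is a hypothetical helper name, and the output depends on which backends have been loaded.

```julia
using LinearAlgebra

# Hypothetical helper: list each library LBT currently forwards to,
# together with its interface (:lp64 for 32-bit integers, :ilp64 for 64-bit).
function describe_blas_backends()
    config = BLAS.get_config()
    return [(basename(lib.libname), lib.interface) for lib in config.loaded_libs]
end

for (name, interface) in describe_blas_backends()
    println(name, " => ", interface)
end
```

Running this before and after `using AppleAccelerate` (or `using MKL`) shows how the set of forwarded libraries changes.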
This throws away most of the previous version, instead opting to re-architect this package to make use of LBT to transparently use Accelerate for BLAS and LAPACK operations. Further enhancements to re-introduce the DSP functionality can be made, potentially in a separate package if we want to keep this one lightweight, as it may end up at the bottom of many dependency trees.
This re-architecting causes Accelerate to pass the full LinearAlgebra test suite (thanks to the usage of an external LAPACK_jll to paper over bugs in dsptrf(); hopefully no longer necessary in a future macOS update).
Fixes #45