JuliaLinearAlgebra / BLIS.jl

This repo plans to provide a low-level Julia wrapper for the BLIS typed interface.
BSD 3-Clause "New" or "Revised" License

Crash on multiplying large dense matrices #12

Closed jd-foster closed 2 years ago

jd-foster commented 2 years ago

This runs in the base environment in a new Julia session:

  | | |_| | | | (_| |  |  Version 1.7.1 (2021-12-22)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> n = 1000; a = fill(1., n, n); a*a;

However:

julia> using BLIS
[ Info: blis_jll yields BLIS installation: ~/.julia/artifacts/567e1b2234ebc730bc16e33871144bf9b561f1fa/lib/libblis.4.0.0.dylib.

julia> n = 1000; a = fill(1., n, n); a*a;

signal (10): Bus error: 10
in expression starting at REPL[4]:1
jl_system_image_data at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
Allocations: 12825733 (Pool: 12821375; Big: 4358); GC: 13
[1]    4388 bus error  julia

xrq-phys commented 2 years ago

It seems to be an x86_64 Mac with the latest v0.8.1+2 binary.

I'd like to know more about the CPU you are using, since the same binary works without this problem on an M1 under Rosetta 2:

  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using BLIS
[ Info: blis_jll yields BLIS installation: /Users/rubidium/.julia-ia/artifacts/567e1b2234ebc730bc16e33871144bf9b561f1fa/lib/libblis.4.0.0.dylib.

julia> n = 1000; a = fill(1., n, n); a*a;

julia>
jd-foster commented 2 years ago

Interestingly, I can't reproduce this on the same machine under v1.6.1, only for v1.7.1.

(Also, not sure if relevant, but Sys.MACHINE in v1.6.1 gives x86_64-apple-darwin18.7.0 while Sys.MACHINE in v1.7.1 gives x86_64-apple-darwin19.5.0. I guess these are the versions that the binaries were built under. The uname of my machine is Darwin Kernel Version 19.6.0.)
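
For comparing environments like this, a minimal sketch that gathers the relevant details in one place may help. It uses only Base and the LinearAlgebra standard library; the helper name environment_report is hypothetical, and BLAS.get_config() is assumed to be available (Julia 1.7 or later), so the sketch guards that call.

```julia
using LinearAlgebra

# Hypothetical helper: gather the platform details discussed in this thread.
function environment_report()
    info = Dict{String,Any}(
        "julia_version" => VERSION,
        "machine"       => Sys.MACHINE,   # build triplet of the Julia binary
        "cpu_name"      => Sys.CPU_NAME,  # CPU model name as detected by Julia
        "kernel"        => Sys.KERNEL,
        "arch"          => Sys.ARCH,
    )
    # BLAS.get_config() only exists on Julia 1.7+ (libblastrampoline).
    if isdefined(BLAS, :get_config)
        info["blas_libs"] = [lib.libname for lib in BLAS.get_config().loaded_libs]
    end
    return info
end

for (key, value) in environment_report()
    println(rpad(key, 14), " => ", value)
end
```

Running this before and after `using BLIS` shows whether the list of loaded BLAS libraries changes, which is one way to tell which backend a plain `a * a` ends up calling.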

jd-foster commented 2 years ago

> It seems to be an x86_64 Mac with the latest v0.8.1+2 binary.

Yes, that's right.

fangzhou-xie commented 2 years ago

I am on x86_64 (Intel) macOS with Julia 1.7.2, and I get the same error:

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.2 (2022-02-06)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using BLIS
[ Info: blis_jll yields BLIS installation: /Users/xiefangzhou/.julia/artifacts/cc15acec1f5320a9559756c8186ce2df4313bbc6/lib/libblis.4.0.0.dylib.

julia> A = randn(1000, 1000);

julia> A * A;

signal (10): Bus error: 10
in expression starting at REPL[3]:1
jl_system_image_data at /Applications/Julia-1.7.app/Contents/Resources/julia/lib/julia/sys.dylib (unknown line)
Allocations: 4813021 (Pool: 4812021; Big: 1000); GC: 5
zsh: bus error  julia

xrq-phys commented 2 years ago

I finally got access to my x86 Mac in the office, and I'm now able to reproduce the error. It seems to be a problem within this package. Simply setting BLIS as the BLAS provider via lbt_forward works fine; the problem is in the extended API:

--- /Applications » BLIS_JR_NT=4 julia

  | | |_| | | | (_| |  |  Version 1.7.2 (2022-02-06)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using LinearAlgebra, BenchmarkTools, blis_jll

julia> @benchmark a * a setup=(n = 1000; a = fill(1., n, n);)
BenchmarkTools.Trial: 336 samples with 1 evaluation.
 Range (min … max):  12.276 ms … 63.016 ms  ┊ GC (min … max): 0.00% … 80.32%
 Time  (median):     13.018 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   14.019 ms ±  3.840 ms  ┊ GC (mean ± σ):  7.96% ±  9.72%
...

 Memory estimate: 7.63 MiB, allocs estimate: 2.

julia> BLAS.lbt_forward(blis, clear=true);

julia> @benchmark a * a setup=(n = 1000; a = fill(1., n, n);)
BenchmarkTools.Trial: 321 samples with 1 evaluation.
 Range (min … max):  12.948 ms … 120.029 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     13.287 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   14.662 ms ±   7.168 ms  ┊ GC (mean ± σ):  6.60% ± 8.65%
...

 Memory estimate: 7.63 MiB, allocs estimate: 2.

julia> ENV["BLIS_JR_NT"] = 1
1

julia> @benchmark a * a setup=(n = 1000; a = fill(1., n, n);) # Performance is affected by BLIS_JR_NT, indicating that BLIS is indeed being called here.
BenchmarkTools.Trial: 122 samples with 1 evaluation.
 Range (min … max):  13.242 ms … 303.030 ms  ┊ GC (min … max): 0.00% …  1.42%
 Time  (median):     22.673 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   40.469 ms ±  47.515 ms  ┊ GC (mean ± σ):  4.27% ± 10.51%
...

 Memory estimate: 7.63 MiB, allocs estimate: 2.

julia> using BLIS
[ Info: blis_jll yields BLIS installation: $HOME/.julia/artifacts/cc15acec1f5320a9559756c8186ce2df4313bbc6/lib/libblis.4.0.0.dylib.

julia> @benchmark a * a setup=(n = 1000; a = fill(1., n, n);)
[1]    5656 bus error  BLIS_JR_NT=4 julia

xrq-phys commented 2 years ago

Sorry, I made a mistake. It is indeed a libblis_jll problem, which is now fixed in v0.9.0.

I'll check & upgrade dependencies. Thanks for reporting!
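
To check whether a given environment already has the fixed binary, something like the following sketch could be used. It assumes that the v0.9.0 mentioned above refers to the blis_jll package version; the helper name installed_version is hypothetical, while Pkg.dependencies() is the standard Pkg API for listing packages in the active environment.

```julia
using Pkg

# Hypothetical helper: look up the installed version of a package by name
# in the active environment; returns `nothing` if it is not present.
function installed_version(name::AbstractString)
    for (_, pkg) in Pkg.dependencies()
        pkg.name == name && return pkg.version
    end
    return nothing
end

v = installed_version("blis_jll")
if v === nothing
    println("blis_jll is not in the active environment")
elseif v < v"0.9.0"   # assumption: the fix shipped in blis_jll v0.9.0
    println("blis_jll $v predates the fix; try Pkg.update(\"blis_jll\")")
else
    println("blis_jll $v should include the fix")
end
```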

jd-foster commented 2 years ago

Thanks for determining the source of the issue.

fangzhou-xie commented 2 years ago

Thanks for investigating this issue! I discovered later that it also occurs on my Linux (Fedora 35) machine.

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.2 (2022-02-06)
 _/ |\__'_|_|_|\__'_|  |  Fedora 35 build
|__/                   |

julia> using BLIS
[ Info: blis_jll yields BLIS installation: ~/.julia/artifacts/c678808668b42fcfc33cc9ee184e6dd4049378b0/lib/libblis.so.

julia> A = randn(1000, 1000);

julia> A * A;

signal (11): Segmentation fault
in expression starting at REPL[3]:1
jl_system_image_data at /usr/lib64/julia/sys.so (unknown line)
Allocations: 3548382 (Pool: 3547889; Big: 493); GC: 3
Segmentation fault (core dumped)

I believe this is the same issue as on macOS. Would the fix work on both macOS and Linux?