odsl-team / julia-ml-from-scratch

Machine learning from scratch in Julia

Performance on different computing platforms #2

Open · oschulz opened 1 year ago

oschulz commented 1 year ago

ANN training time (commit 61c1923), with the following learning schedule:

learn_schedule = [
    (batchsize = 1000, optimizer = GradientDecent(0.1), epochs = 1),
    (batchsize = 5000, optimizer = GradientDecent(0.05), epochs = 1),
    (batchsize = 50000, optimizer = GradientDecent(0.025), epochs = 1),
]

10,500,000 ANN input evaluations in total.
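For context, a minimal sketch of how a stage-wise schedule like the one above might be consumed; train_epoch!, model, X_train, and Y_train are hypothetical stand-ins, not the script's actual API:

# Hypothetical sketch: run each schedule stage in sequence.
# train_epoch! is a made-up helper standing in for the actual
# minibatch training loop; it would update `model` in place using
# the stage's batch size and optimizer.
for stage in learn_schedule
    for _ in 1:stage.epochs
        train_epoch!(model, X_train, Y_train;
                     batchsize = stage.batchsize,
                     optimizer = stage.optimizer)
    end
end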

christiangnrd commented 1 year ago

At some point during the computation with Metal, I had 29.45 GB of memory and 71.14 GB of swap in use; the Julia process alone was using 90 GB of memory.

There's also the error below, which I mentioned on Slack; I hacked around it by just adapting the model and x to be on the CPU.
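For reference, a hedged reconstruction of that workaround (the error itself is reproduced below). It assumes adapt(Array, ...) recurses through the model's wrapper types (ComposedFunction, Fix1, BroadcastFunction); if no Adapt rules exist for those, the individual weight and bias arrays would have to be converted by hand:

using Adapt

# Hypothetical reconstruction of the CPU workaround: move the model
# and the input data back to host Arrays, so the BLAS code path sees
# plain CPU memory it can take a pointer to.
m_cpu = adapt(Array, m_trained)   # assumes adapt recurses into the model
X_cpu = adapt(Array, X_train)
Y_train_v = Array(vec(batched_eval(m_cpu, X_cpu)))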

Error:

julia> Y_train_v = Array(vec(batched_eval(m_trained, X_train)))
ERROR: ArgumentError: cannot take the CPU address of a MtlMatrix{Float32}
Stacktrace:
  [1] unsafe_convert(#unused#::Type{Ptr{Float32}}, x::MtlMatrix{Float32})
    @ Metal ~/.julia/packages/Metal/TtPHW/src/array.jl:121
  [2] gemm!(transA::Char, transB::Char, alpha::Float32, A::MtlMatrix{Float32}, B::SubArray{Float32, 2, MtlMatrix{Float32}, Tuple{Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, true}, beta::Float32, C::MtlMatrix{Float32})
    @ LinearAlgebra.BLAS ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/blas.jl:1514
  [3] gemm_wrapper!(C::MtlMatrix{Float32}, tA::Char, tB::Char, A::MtlMatrix{Float32}, B::SubArray{Float32, 2, MtlMatrix{Float32}, Tuple{Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, true}, _add::LinearAlgebra.MulAddMul{true, true, Bool, Bool})
    @ LinearAlgebra ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:674
  [4] mul!
    @ ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:161 [inlined]
  [5] mul!
    @ ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:276 [inlined]
  [6] *
    @ ~/.julia/juliaup/julia-1.8.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.8/LinearAlgebra/src/matmul.jl:148 [inlined]
  [7] Fix1
    @ ./operators.jl:1096 [inlined]
  [8] (::ComposedFunction{ComposedFunction{ComposedFunction{ComposedFunction{ComposedFunction{BroadcastFunction{typeof(logistic)}, ComposedFunction{Fix1{BroadcastFunction{typeof(+)}, MtlVector{Float32}}, Fix1{typeof(*), MtlMatrix{Float32}}}}, BroadcastFunction{typeof(relu)}}, ComposedFunction{Fix1{BroadcastFunction{typeof(+)}, MtlVector{Float32}}, Fix1{typeof(*), MtlMatrix{Float32}}}}, BroadcastFunction{typeof(relu)}}, ComposedFunction{Fix1{BroadcastFunction{typeof(+)}, MtlVector{Float32}}, Fix1{typeof(*), MtlMatrix{Float32}}}})(x::SubArray{Float32, 2, MtlMatrix{Float32}, Tuple{Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, true}; kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base ./operators.jl:1035
  [9] ComposedFunction
    @ ./operators.jl:1033 [inlined]
 [10] batched_eval(m::ComposedFunction{ComposedFunction{ComposedFunction{ComposedFunction{ComposedFunction{BroadcastFunction{typeof(logistic)}, ComposedFunction{Fix1{BroadcastFunction{typeof(+)}, MtlVector{Float32}}, Fix1{typeof(*), MtlMatrix{Float32}}}}, BroadcastFunction{typeof(relu)}}, ComposedFunction{Fix1{BroadcastFunction{typeof(+)}, MtlVector{Float32}}, Fix1{typeof(*), MtlMatrix{Float32}}}}, BroadcastFunction{typeof(relu)}}, ComposedFunction{Fix1{BroadcastFunction{typeof(+)}, MtlVector{Float32}}, Fix1{typeof(*), MtlMatrix{Float32}}}}, X::MtlMatrix{Float32}; batchsize::Int64)
    @ Main ~/julia-ml-from-scratch/ml_from_scratch.jl:452
 [11] batched_eval(m::Function, X::MtlMatrix{Float32})
    @ Main ~/julia-ml-from-scratch/ml_from_scratch.jl:447
 [12] top-level scope
    @ REPL[14]:1
 [13] top-level scope
    @ ~/.julia/packages/Metal/TtPHW/src/initialization.jl:46
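Reading the trace: the batch handed to the layer is a SubArray view of an MtlMatrix, and no Metal-specific mul! method matches that combination, so dispatch falls through to the generic LinearAlgebra path, which calls BLAS gemm! and fails because gemm! needs a raw CPU pointer. Below is a hedged sketch of an alternative workaround that stays on the GPU, under the assumption that copying each view into a contiguous MtlMatrix makes Metal's native matmul applicable (batched_eval_copying is a made-up name, not the repo's function):

# Hypothetical sketch: materialize each batch view into a contiguous
# matrix before applying the model, so the GPU matmul method can be
# dispatched instead of the CPU BLAS fallback.
function batched_eval_copying(m, X::AbstractMatrix; batchsize::Integer = 1000)
    ranges = Iterators.partition(axes(X, 2), batchsize)
    return reduce(hcat, [m(copy(view(X, :, r))) for r in ranges])
end

Whether this beats simply evaluating on the CPU would need benchmarking; it trades the pointer error for extra device-side copies.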

ViralBShah commented 1 year ago

I assume the Metal.jl issue will be opened in the Metal.jl repo.

oschulz commented 1 year ago

@ViralBShah: "I assume the Metal.jl issue will be opened in the Metal.jl repo."

@maleadt, should I open a Metal.jl issue as well?

maleadt commented 1 year ago

As mentioned on Slack, I normally prefer more specific/concrete issues -- less actionable ones tend to get forgotten -- but we can always use a tracking issue on the Metal.jl repo, sure.

oschulz commented 1 year ago

Yep, I'm afraid I don't have more specific stuff at this point; I haven't done any in-depth profiling (and I'm so not a Metal.jl expert). Maybe @christiangnrd can open some more concrete issues -- do you plan to investigate this a bit more on Apple Silicon, @christiangnrd?

christiangnrd commented 1 year ago

I'll try to investigate a bit, but I'm very new to GPU programming, so I might only be able to help with surface-level things.

oschulz commented 1 year ago

No worries, I'm sure any kind of contribution will be very much appreciated!