slimgroup / ParametricOperators.jl

Scalable and distributed matrix-free abstractions for machine learning and scientific computing
MIT License
3 stars 4 forks source link

`A(θ)' * y` for `ParMatrix` errors #6

Open ziyiyin97 opened 11 months ago

ziyiyin97 commented 11 months ago
using ParametricOperators

A = ParMatrix(Float32, 4, 5)
θ = init(A)
x = randn(Float32, 5)
y = randn(Float32, 4)
output1 = A(θ) * x      # this works
output2 = A(θ)' * y     # this doesn't

Error log:

ERROR: LoadError: ArgumentError: invalid index: ParMatrix{Float32}(4, 5, UUID("93f79dbc-05db-4fb1-9d23-6248e5dfcf25")) of type ParMatrix{Float32}
Stacktrace:
  [1] to_index(i::ParMatrix{Float32})
    @ Base ./indices.jl:300
  [2] to_index(A::Matrix{Float32}, i::ParMatrix{Float32})
    @ Base ./indices.jl:277
  [3] to_indices
    @ ./indices.jl:333 [inlined]
  [4] to_indices
    @ ./indices.jl:325 [inlined]
  [5] getindex
    @ ./abstractarray.jl:1241 [inlined]
  [6] (::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}})(x::Vector{Float32})
    @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParMatrix.jl:31
  [7] *(A::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}}, x::Vector{Float32})
    @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParOperator.jl:221
  [8] top-level scope
    @ ~/.julia/dev/ParametricOperators/test/MFE.jl:8
  [9] include(fname::String)
    @ Base.MainInclude ./client.jl:476
 [10] top-level scope
    @ REPL[1]:1
in expression starting at /Users/ziyiyin/.julia/dev/ParametricOperators/test/MFE.jl:8
ziyiyin97 commented 11 months ago

This is a reduced version from the test script https://github.com/slimgroup/ParametricOperators.jl/blob/main/test/serial.jl

turquoisedragon2926 commented 11 months ago

Hey Francis, you can find the adjoint like so:

output2 = A'(θ) * y

ziyiyin97 commented 11 months ago

Actually

julia> Aθ = A(θ)
ParParameterized{Float32, Float32, ParametricOperators.Linear, ParMatrix{Float32}, Matrix{Float32}}(ParMatrix{Float32}(4, 5, UUID("5077dc2d-f986-4e84-b9da-c65c5a991f9a")), Float32[0.13134196 0.1615439 … 0.05156738 0.054195553; 0.103010565 0.07147584 … 0.11180936 0.041337255; 0.033035975 0.18751591 … 0.113301255 0.13448314; 0.06092713 0.13628592 … 0.13177578 0.20584196])

julia> Aθ * x
4-element Vector{Float32}:
 -0.18058027
  0.07688203
 -0.15836024
 -0.11452815

julia> Aθ' * y
ERROR: ArgumentError: invalid index: ParMatrix{Float32}(4, 5, UUID("5077dc2d-f986-4e84-b9da-c65c5a991f9a")) of type ParMatrix{Float32}
Stacktrace:
 [1] to_index(i::ParMatrix{Float32})
   @ Base ./indices.jl:300
 [2] to_index(A::Matrix{Float32}, i::ParMatrix{Float32})
   @ Base ./indices.jl:277
 [3] to_indices
   @ ./indices.jl:333 [inlined]
 [4] to_indices
   @ ./indices.jl:325 [inlined]
 [5] getindex
   @ ./abstractarray.jl:1241 [inlined]
 [6] (::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}})(x::Vector{Float32})
   @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParMatrix.jl:31
 [7] *(A::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}}, x::Vector{Float32})
   @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParOperator.jl:221
 [8] top-level scope
   @ REPL[9]:1
 [9] top-level scope
   @ ~/.julia/packages/CUDA/BbliS/src/initialization.jl:52
turquoisedragon2926 commented 11 months ago

So here, doing:

Aθ = A(θ)
Aθ' * y

is the same as doing

A(θ)' * y

instead if you do

adj = A'
adj(θ) * y

that will work

ziyiyin97 commented 11 months ago

I see. Thanks! Would be good to make sure it works both ways because right now you can't do LSQR for example

turquoisedragon2926 commented 11 months ago

can u point me to an implementation of LSQR that we would like to replicate, thank you francis

ziyiyin97 commented 11 months ago

I would first make sure the adjoint test/linearization test pass. A simple lsqr is shown below

julia> using IterativeSolvers

julia> A = randn(3,3)
3×3 Matrix{Float64}:
 -0.274139   0.525287  -1.44593
  1.62228    1.00433   -0.0654527
  0.291649  -0.928884  -1.23786

julia> b = randn(3)
3-element Vector{Float64}:
 -0.6931480787837941
  1.1481589371152996
 -0.08422685441572153

julia> result = lsqr(A, b)
3-element Vector{Float64}:
  0.7673748585872509
 -0.07636631811313134
  0.3061462307037783

It would be great if we can have a variable projection algorithm implemented in a clean way with this package. Thoughts?

turquoisedragon2926 commented 11 months ago

I agree automating tests should be done before

My thoughts were to keep ParametricOperators.jl as a raw package library that supports very fundamental operations and create a new package ParametricOperatorsML.jl that supports more higher level operations such as lsqr, batch norm, and any tools needed for machine learning that uses the base package.

I think this is good software engineering since it creates a separation of concern, what do you think francis?

mloubout commented 11 months ago

Lsqr cannot be an extra or an different package, it's probably the number one thing this package has to support as a linear algebra abtraction. The adjoint must be defined here correctly. Check JOLI.jl or LinearMap.jl for reference.