slimgroup / ParametricOperators.jl

Scalable and distributed matrix-free abstractions for machine learning and scientific computing
MIT License
4 stars 4 forks source link

`A(θ)' * y` for `ParMatrix` errors #6

Open ziyiyin97 opened 1 year ago

ziyiyin97 commented 1 year ago
using ParametricOperators

A = ParMatrix(Float32, 4, 5)
θ = init(A)
x = randn(Float32, 5)
y = randn(Float32, 4)
output1 = A(θ) * x      # this works
output2 = A(θ)' * y     # this doesn't

Error log:

ERROR: LoadError: ArgumentError: invalid index: ParMatrix{Float32}(4, 5, UUID("93f79dbc-05db-4fb1-9d23-6248e5dfcf25")) of type ParMatrix{Float32}
Stacktrace:
  [1] to_index(i::ParMatrix{Float32})
    @ Base ./indices.jl:300
  [2] to_index(A::Matrix{Float32}, i::ParMatrix{Float32})
    @ Base ./indices.jl:277
  [3] to_indices
    @ ./indices.jl:333 [inlined]
  [4] to_indices
    @ ./indices.jl:325 [inlined]
  [5] getindex
    @ ./abstractarray.jl:1241 [inlined]
  [6] (::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}})(x::Vector{Float32})
    @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParMatrix.jl:31
  [7] *(A::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}}, x::Vector{Float32})
    @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParOperator.jl:221
  [8] top-level scope
    @ ~/.julia/dev/ParametricOperators/test/MFE.jl:8
  [9] include(fname::String)
    @ Base.MainInclude ./client.jl:476
 [10] top-level scope
    @ REPL[1]:1
in expression starting at /Users/ziyiyin/.julia/dev/ParametricOperators/test/MFE.jl:8
ziyiyin97 commented 1 year ago

This is a reduced version from the test script https://github.com/slimgroup/ParametricOperators.jl/blob/main/test/serial.jl

turquoisedragon2926 commented 1 year ago

Hey Francis, you can find the adjoint like so:

output2 = A'(θ) * y

ziyiyin97 commented 1 year ago

Actually

julia> Aθ = A(θ)
ParParameterized{Float32, Float32, ParametricOperators.Linear, ParMatrix{Float32}, Matrix{Float32}}(ParMatrix{Float32}(4, 5, UUID("5077dc2d-f986-4e84-b9da-c65c5a991f9a")), Float32[0.13134196 0.1615439 … 0.05156738 0.054195553; 0.103010565 0.07147584 … 0.11180936 0.041337255; 0.033035975 0.18751591 … 0.113301255 0.13448314; 0.06092713 0.13628592 … 0.13177578 0.20584196])

julia> Aθ * x
4-element Vector{Float32}:
 -0.18058027
  0.07688203
 -0.15836024
 -0.11452815

julia> Aθ' * y
ERROR: ArgumentError: invalid index: ParMatrix{Float32}(4, 5, UUID("5077dc2d-f986-4e84-b9da-c65c5a991f9a")) of type ParMatrix{Float32}
Stacktrace:
 [1] to_index(i::ParMatrix{Float32})
   @ Base ./indices.jl:300
 [2] to_index(A::Matrix{Float32}, i::ParMatrix{Float32})
   @ Base ./indices.jl:277
 [3] to_indices
   @ ./indices.jl:333 [inlined]
 [4] to_indices
   @ ./indices.jl:325 [inlined]
 [5] getindex
   @ ./abstractarray.jl:1241 [inlined]
 [6] (::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}})(x::Vector{Float32})
   @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParMatrix.jl:31
 [7] *(A::ParParameterized{Float32, Float32, ParametricOperators.Linear, ParAdjoint{Float32, Float32, ParametricOperators.Parametric, ParMatrix{Float32}}, Matrix{Float32}}, x::Vector{Float32})
   @ ParametricOperators ~/.julia/dev/ParametricOperators/src/ParOperator.jl:221
 [8] top-level scope
   @ REPL[9]:1
 [9] top-level scope
   @ ~/.julia/packages/CUDA/BbliS/src/initialization.jl:52
turquoisedragon2926 commented 1 year ago

So here, doing:

Aθ = A(θ)
Aθ' * y

is the same as doing

A(θ)' * y

instead if you do

adj = A'
adj(θ) * y

that will work

ziyiyin97 commented 1 year ago

I see. Thanks! Would be good to make sure it works both ways because right now you can't do LSQR for example

turquoisedragon2926 commented 1 year ago

can u point me to an implementation of LSQR that we would like to replicate, thank you francis

ziyiyin97 commented 1 year ago

I would first make sure the adjoint test/linearization test pass. A simple lsqr is shown below

julia> using IterativeSolvers

julia> A = randn(3,3)
3×3 Matrix{Float64}:
 -0.274139   0.525287  -1.44593
  1.62228    1.00433   -0.0654527
  0.291649  -0.928884  -1.23786

julia> b = randn(3)
3-element Vector{Float64}:
 -0.6931480787837941
  1.1481589371152996
 -0.08422685441572153

julia> result = lsqr(A, b)
3-element Vector{Float64}:
  0.7673748585872509
 -0.07636631811313134
  0.3061462307037783

It would be great if we can have a variable projection algorithm implemented in a clean way with this package. Thoughts?

turquoisedragon2926 commented 1 year ago

I agree automating tests should be done before

My thoughts were to keep ParametricOperators.jl as a raw package library that supports very fundamental operations and create a new package ParametricOperatorsML.jl that supports more higher level operations such as lsqr, batch norm, and any tools needed for machine learning that uses the base package.

I think this is good software engineering since it creates a separation of concern, what do you think francis?

mloubout commented 1 year ago

Lsqr cannot be an extra or an different package, it's probably the number one thing this package has to support as a linear algebra abtraction. The adjoint must be defined here correctly. Check JOLI.jl or LinearMap.jl for reference.