So far RegularizedLeastSquares.jl supports two types of multi-threading, both of which are transparent to the package.
The highest level of multi-threading occurs when a user creates and runs multiple solvers in parallel (potentially with different parameters):
```julia
# Preallocate the results so each task writes to its own slot;
# push!-ing from multiple threads would be a data race. Note that
# Threads.@threads needs an indexable iteration space, so we iterate
# over indices rather than enumerate(operators).
results = Vector{Any}(undef, length(operators))
Threads.@threads for i in eachindex(operators)
    solver = createLinearSolver(CGNR, operators[i], iterations = ...)
    results[i] = solve!(solver, b[:, i])
end
```
We also support low-level multi-threading within the linear operators passed to the package. This is completely transparent to the package. MRIReco, for example, uses both of these multi-threading options (with @floop instead of the Threads version).
This PR takes advantage of the split between a solver and its state to add a middle level of multi-threading. This level applies the same solver (and its parameters) to multiple measurement vectors, i.e. to a measurement matrix B.
Solvers and their state are currently defined as follows:
```julia
mutable struct Solver{matT, ...}
    A::matT
    # Other "static" fields
    state::AbstractSolverState{<:Solver}
end

mutable struct SolverState{T, tempT} <: AbstractSolverState{Solver}
    x::tempT
    rho::T
    # ...
    iteration::Int64
end
```
While both the solver and its state are mutable structs, the solver struct is intended to be immutable during a solve! call. This allows us to simply copy the state, initialize each state with a slice of B and then perform iterations on each state separately. Since the solver has an AbstractSolverState as a field, we can exchange this for a new state holding all the usual states for each slice. While this is technically a type-instability, in the iterate method we always dispatch on the state-type too, so we are not affected by the type-instability in our hot-loops.
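To make the swap concrete, here is a small self-contained sketch with toy types (none of these names are the package's actual definitions): because the solver's state field is typed with an abstract type, an init-style function can replace the single state with a wrapper holding one state per column of B.

```julia
# Toy types, purely illustrative -- not the package's definitions.
abstract type ToyAbstractState end

mutable struct ToySingleState <: ToyAbstractState
    x::Vector{Float64}
    iteration::Int
end

mutable struct ToyMultiState <: ToyAbstractState
    states::Vector{ToySingleState}
end

mutable struct ToySolver
    state::ToyAbstractState   # abstract field, so it can be swapped at runtime
end

function init_multi!(solver::ToySolver, B::AbstractMatrix)
    # One fresh state per measurement column, sized like that slice of B.
    states = [ToySingleState(zero(B[:, j]), 0) for j in axes(B, 2)]
    # The swap: single state -> wrapper holding all per-column states.
    solver.state = ToyMultiState(states)
    return solver
end
```

The type-instability of the abstract field only matters at the point of the swap; per-column iteration can still dispatch on the concrete state type.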
To make this feature optional and hackable for users, I've introduced a new keyword scheduler to the solve! call, which is passed on to the init! call if a user provides a measurement matrix. The scheduler defaults to a simple sequential implementation without multi-threading. Out of the box we also support multi-threading via Threads.@threads with scheduler = MultiThreadingState. The scheduler is treated as a callable that gets invoked with a copy of all necessary states.
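With the out-of-the-box scheduler, a solve over a measurement matrix then looks like this (a sketch of the intended call pattern; A, B, and the parameter values are placeholders):

```julia
using RegularizedLeastSquares

solver = createLinearSolver(CGNR, A, iterations = 10)
x = solve!(solver, B)                                   # sequential default
x = solve!(solver, B; scheduler = MultiThreadingState)  # threaded over columns of B
```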
As an example, here is a sketch of how to implement a custom @floop scheduler:
```julia
# Assumes the FLoops package, which provides the @floop macro.
using FLoops

# Struct definition: we track both the states and whether each one is still
# active. Since most solvers have convergence criteria, they can finish at
# different iteration numbers.
# At the moment the generic functions expect the scheduler structs to have a
# `states` and an `active` property.
mutable struct FloopState{S, ST <: AbstractSolverState{S}} <: AbstractMatrixSolverState{S}
    states::Vector{ST}
    active::Vector{Bool}
    FloopState(states::Vector{ST}) where {S, ST <: AbstractSolverState{S}} = new{S, ST}(states, fill(true, length(states)))
end

# To hook into the existing init! code we only have to supply a method that
# gets a copyable "vector" state. This will invoke our FloopState constructor
# with copies of the given state.
prepareMultiStates(solver::AbstractLinearSolver, state::FloopState, b::AbstractMatrix) = prepareMultiStates(solver, first(state.states), b)

# This iterate method is called with the indices of the still-active states.
function iterate(solver::AbstractLinearSolver, state::FloopState, activeIdx)
    @floop for i in activeIdx
        res = iterate(solver, state.states[i])
        if isnothing(res)
            state.active[i] = false
        end
    end
    return state.active, state
end
```
To call this, we now simply have to do:

```julia
solver = createLinearSolver(CGNR, A, ...)
solve!(solver, B; scheduler = FloopState)
```
A solver is also not locked into multi-threaded execution once it has been passed a matrix; it seamlessly switches back to normal execution when given a vector.
Note that switching out the state, as is done here, also allows one to define special variants of a solver, as this PR does. Furthermore, we can now also define special algorithm variants that work directly on B rather than on slices of it. An example of this could be an implementation of Kaczmarz as described in this work.
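As a rough illustration of what such a matrix-level variant could look like, here is a self-contained sketch of a Kaczmarz sweep that projects all solution columns against the same row of A simultaneously. The function name and signature are made up for this example, and this is not the cited work's algorithm:

```julia
using LinearAlgebra

# Hypothetical sketch: one cyclic Kaczmarz solve acting on the whole
# measurement matrix B at once instead of looping over its columns as
# separate problems.
function kaczmarz_matrix(A::AbstractMatrix, B::AbstractMatrix; sweeps = 20)
    m, n = size(A)
    X = zeros(eltype(B), n, size(B, 2))  # one solution column per measurement vector
    for _ in 1:sweeps, i in 1:m
        a = view(A, i, :)                # current row of A
        # One projection step updates every column of X simultaneously:
        # the residual (B[i, :]' .- a' * X) is a 1-by-k row, and a .* row
        # broadcasts to the rank-one update for all columns at once.
        X .+= a .* ((view(B, i, :)' .- a' * X) ./ dot(a, a))
    end
    return X
end
```

For a consistent system the sweep drives every column's residual down with the same row schedule, which is exactly the kind of structure a B-level implementation can exploit.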