In-place Assembly of PSparseMatrices

JordiManyer commented 11 months ago

@amartinhuertas I have found a potential inefficiency in the in-place assembly of distributed linear systems.

Some context:

In Gridap, we declare the function Algebra.add_entry, which is used to assemble an (i,j,v) triplet into different structures (counters, COOs and matrices). As usual, there are different specializations of this function. In particular, we have different functions for AbstractMatrix and AbstractSparseMatrix:

function add_entry!(combine::Function,A::AbstractSparseMatrix,v::Number,i,j)
  k = nz_index(A,i,j)
  nz = nonzeros(A)
  Aij = nz[k]
  nz[k] = combine(v,Aij)
  A
end

@inline function add_entry!(combine::Function,A::AbstractMatrix,v,i,j)
  aij = A[i,j]
  A[i,j] = combine(aij,v)
  A
end

BUG 1: As we can see, for AbstractMatrix we call combine(aij,v) whereas for AbstractSparseMatrix we call combine(v,aij). This leads to different behaviors when combine is not commutative. For instance, (a,b) -> a would create different matrices depending on the type.

Due to the above bug, I have also uncovered something I believe is quite bad: When in-place assembling distributed matrices, we assemble the local contributions by using a local view of the global matrix. This view is given by the function local_views(A,rows,cols), which returns an object of type

  LocalView{T,N,A,B} <:AbstractArray{T,N}

This means that the local portion of a PSparseMatrix will be assembled as an AbstractMatrix, and NOT as an AbstractSparseMatrix, therefore triggering the add_entry! version for AbstractMatrix, which tries to access the matrix entry directly instead of using a sparse-specific access pattern like nz_index(A,i,j).

My concern is the following: Is this efficient?

amartinhuertas commented 11 months ago

My concern is the following: Is this efficient?

It depends on how A[i,j] is implemented for the particular type of AbstractSparseMatrix (e.g., SparseMatrixCSC)

nz_index is a function which we have defined, and that uses binary search to locate an element within a sparse column (CSC) or row (CSR) . Thus, O(log(n)) complexity. I would say this is the best you can do, and not sure whether A[i,j] does or not.

In any case, I think that if we match the local numbering of the FE space DOFs and that of the local linear system, then LocalView will no longer be needed.

amartinhuertas commented 11 months ago

btw I agree there is a bug, and should be solved in Gridap

fverdugo commented 11 months ago

Hi @JordiManyer

A[i,j] = v also uses binary search for A::SparseMatricCSC

JordiManyer commented 11 months ago

Hi @amartinhuertas , @fverdugo , I'll then go ahead and fix the bug in Gridap.

gridap / GridapDistributed.jl

In-place Assembly of PSparseMatrices #138