rusandris opened 8 months ago
Difference between the two conversion methods (from `COO` to `CSC`) for a 100×100 sparse matrix with 280 stored entries:

```julia
@time sparse(trs.Q)
11.526183 seconds (12 allocations: 16.016 KiB)

@time SparseMatrixCSC(trs.Q)
0.016892 seconds (13 allocations: 15.266 MiB)
```
Assembling in `COO` always makes fewer allocations, and it can be much faster when dealing with bigger matrices. The reason for this is stated in the documentation of SparseArrays:
> The compressed sparse column storage makes it easy and quick to access the elements in the column of a sparse matrix, whereas accessing the sparse matrix by rows is considerably slower. Operations such as insertion of previously unstored entries one at a time in the CSC structure tend to be slow. This is because all elements of the sparse matrix that are beyond the point of insertion have to be moved one place over.
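To make the quoted point concrete, here is a small sketch (not from the issue) of the `CSC` layout, using the `colptr`/`rowval`/`nzval` fields of `SparseMatrixCSC`:

```julia
using SparseArrays

# A tiny CSC matrix to make the storage layout concrete.
A = sparse([1, 3, 2], [1, 1, 3], [10.0, 20.0, 30.0], 3, 3)

A.colptr   # [1, 3, 3, 4] -- where each column's entries start in rowval/nzval
A.rowval   # [1, 3, 2]    -- row index of every stored entry, column by column
A.nzval    # [10.0, 20.0, 30.0]

# Inserting a previously unstored entry into column 2 means splicing into
# rowval/nzval and shifting every entry stored after the insertion point
# (here, column 3's entry), plus updating colptr for all later columns --
# which is why one-at-a-time insertion into CSC is slow.
A[2, 2] = 5.0
```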
Tests:
```julia
using SparseArrays, LuxurySparse

# construct sparse matrices by adding random elements at random indices
function assemble_sparse_CSC(N, row_idcs, col_idcs)
    M = spzeros(N, N)
    n = length(row_idcs)
    for e in 1:n
        i = row_idcs[e]
        j = col_idcs[e]
        M[i, j] = rand()
    end
    return M
end

function assemble_sparse_COO(N, row_idcs, col_idcs)
    M = SparseMatrixCOO{Float64, Int64}(Int64[], Int64[], Float64[], N, N)
    n = length(row_idcs)
    for e in 1:n
        i = row_idcs[e]
        j = col_idcs[e]
        M[i, j] = rand()
    end
    return M
end
```
```julia
begin
    N = 20
    n = 100000
    row_idcs = rand(1:N, n)
    col_idcs = rand(1:N, n)
    @time assemble_sparse_CSC(N, row_idcs, col_idcs)
    @time assemble_sparse_COO(N, row_idcs, col_idcs)
end
```
```
0.006734 seconds (200.01 k allocations: 6.119 MiB)   # CSC
0.002575 seconds (34 allocations: 5.483 MiB)         # COO
```
```julia
begin
    N = 1000
    n = 100000
    row_idcs = rand(1:N, n)
    col_idcs = rand(1:N, n)
    @time assemble_sparse_CSC(N, row_idcs, col_idcs)
    @time assemble_sparse_COO(N, row_idcs, col_idcs)
end
```
```
0.341956 seconds (298.00 k allocations: 11.262 MiB)            # CSC
0.006214 seconds (34 allocations: 5.483 MiB, 37.49% gc time)   # COO
```
Corrections: the following benchmarks are closer to what the code actually does when building a transition matrix, i.e. repeated transitions are accumulated rather than overwritten:
```julia
# addition instead of overwrite
function assemble_sparse_CSC(N, row_idcs, col_idcs)
    M = spzeros(N, N)
    n = length(row_idcs)
    for e in 1:n
        i = row_idcs[e]
        j = col_idcs[e]
        M[i, j] += 1.0
    end
    return M
end
```
```julia
function assemble_sparse_COO(N, row_idcs, col_idcs)
    M = SparseMatrixCOO{Float64, Int64}(Int64[], Int64[], Float64[], N, N)
    n = length(row_idcs)
    for e in 1:n
        i = row_idcs[e]
        j = col_idcs[e]
        M[i, j] = 1.0   # plain assignment: each transition is added as its own entry
    end
    return M
end
```
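For comparison, here is a hedged sketch (not part of the issue's code; the name `assemble_sparse_triplets` is made up) of a third route that needs only SparseArrays: collect the `(i, j, v)` triplets in plain vectors and build the `CSC` matrix with a single `sparse(I, J, V, m, n)` call, which sums duplicate indices by default, i.e. the same accumulation the corrected benchmarks aim for.

```julia
using SparseArrays

# Hypothetical alternative: accumulate the triplets in plain vectors and build
# the CSC matrix in one shot; sparse() combines duplicate (i, j) pairs with +.
function assemble_sparse_triplets(N, row_idcs, col_idcs)
    n = length(row_idcs)
    I = Vector{Int64}(undef, n)
    J = Vector{Int64}(undef, n)
    V = Vector{Float64}(undef, n)
    for e in 1:n
        I[e] = row_idcs[e]
        J[e] = col_idcs[e]
        V[e] = 1.0
    end
    return sparse(I, J, V, N, N)
end
```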
Interesting observation: benchmarks with non-random indices (successive states of an orbit, so the time correlation is kept) show a different picture:
```julia
using BenchmarkTools

orbit, t = trajectory(system, 10^7; Ttr = 10^3)
sts = timeseries_to_grid(orbit, 10)

begin
    N = 10
    n = length(sts)
    row_idcs = sts[1:end-1]
    col_idcs = sts[2:end]
    @btime assemble_sparse_CSC(N, row_idcs, col_idcs)
    @btime assemble_sparse_COO(N, row_idcs, col_idcs)
end
```
```
201.066 ms (7 allocations: 1.08 KiB)       # CSC
515.448 ms (55 allocations: 440.49 MiB)    # COO
```
```julia
sts = timeseries_to_grid(orbit, 1000)

begin
    N = 1000
    n = length(sts)
    row_idcs = sts[1:end-1]
    col_idcs = sts[2:end]
    @btime assemble_sparse_CSC(N, row_idcs, col_idcs)
    @btime assemble_sparse_COO(N, row_idcs, col_idcs)
end
```
```
262.285 ms (15 allocations: 115.84 KiB)    # CSC
486.418 ms (55 allocations: 440.49 MiB)    # COO
```
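A plausible explanation (my reading, not stated by the benchmarks themselves): with correlated indices the number of distinct `(i, j)` pairs is tiny compared to the number of assignments (at most `N^2`), so almost every `CSC` `setindex!` just updates an already-stored value and no shifting is needed, while the `COO` version keeps appending entries (consistent with its ~440 MiB of allocations). A quick way to check:

```julia
# Count how many distinct transitions actually occur; for N = 10 this is at
# most N^2 = 100, so the vast majority of the 10^7 CSC assignments hit an
# already-stored entry.
n_distinct = length(Set(zip(row_idcs, col_idcs)))
```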
We should use the `COO` sparse format when we construct a sparse matrix in several steps (adding one transition at a time). In the `COO` format (see `SparseMatrixCOO` from LuxurySparse.jl) this is more efficient than in the regular `CSC` format used by SparseArrays.jl. However, the `CSC` format is more efficient and more suitable for linear algebra computations, so once the matrix is constructed in `COO`, it should be converted to `CSC` (see the sketch below).

Already implemented in: `transitions.jl` (used for the addition of symbolic timeseries).
Yet to be implemented in: `transition_matrix.jl`.
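A minimal sketch of that pattern for `transition_matrix.jl` (illustrative names only, not the actual implementation), using the same assignment style as `assemble_sparse_COO` above and the `SparseMatrixCSC(...)` conversion from the timings at the top:

```julia
using SparseArrays, LuxurySparse

# Illustrative only: accumulate the transitions in a COO matrix, then convert
# once to CSC for the linear-algebra steps.
function count_transitions(states::AbstractVector{<:Integer}, N::Integer)
    Q = SparseMatrixCOO{Float64, Int64}(Int64[], Int64[], Float64[], N, N)
    for k in 1:length(states)-1
        Q[states[k], states[k+1]] = 1.0   # same assignment pattern as assemble_sparse_COO
    end
    return SparseMatrixCSC(Q)             # the faster of the two conversions timed above
end
```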