Converting sparse/dense matrix objects to COO-formatted data frames is now handled by a new, more efficient internal utility, matrix_to_coo(), which replaces the old utility, dgtmatrix_to_dataframe().
The new utility accepts any matrix-like object coercible to a TsparseMatrix and uses Matrix::mat2triplet() to perform the conversion. Matrix dimension labels are now stored as factors in the COO data frame's index columns, which avoids fully materializing the character vectors. Because the index columns already contain integers that map to the original dimension labels, we create the factor vectors manually to avoid the overhead of the factor()/as.factor() constructors.
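The approach described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the function name matrix_to_coo_sketch(), the helper codes_to_factor(), and the output column names i, j, and value are all hypothetical, and the sketch assumes a numeric matrix whose dimension labels are unique.

```r
library(Matrix)

# Hypothetical sketch; the name, helper, and column names are illustrative only.
matrix_to_coo_sketch <- function(x) {
  # Coerce to triplet (COO) form, then extract 1-based i/j indices and values.
  x <- as(x, "TsparseMatrix")
  trip <- Matrix::mat2triplet(x)

  # Build a factor directly from integer codes plus labels. This skips the
  # unique()/match() work done by factor(), but assumes the codes are valid
  # 1-based positions into `labels` and that the labels are unique.
  codes_to_factor <- function(codes, labels) {
    if (is.null(labels)) return(codes)  # no dimnames: keep plain integers
    structure(codes, levels = as.character(labels), class = "factor")
  }

  data.frame(
    i = codes_to_factor(trip$i, rownames(x)),
    j = codes_to_factor(trip$j, colnames(x)),
    value = trip$x
  )
}

# Small example matrix with string dimension labels.
m <- Matrix::sparseMatrix(
  i = c(1, 3), j = c(2, 1), x = c(5, 7),
  dims = c(3, 2),
  dimnames = list(c("a", "b", "c"), c("x", "y"))
)
coo <- matrix_to_coo_sketch(m)
```

In this sketch the character labels are stored once, as the factor levels, rather than being repeated for every non-zero entry, which is where the memory saving would come from.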
Converting a 50k × 1k sparse matrix with string dimension labels and 60% density is about 8× faster with matrix_to_coo() than with dgtmatrix_to_dataframe(), and requires about one-fifth as much memory: