Closed ukleiner closed 1 year ago
"Wrong layout" is still a "proper" matrix - use to_layout
if you want to ensure a specific layout.
The purpose of "to_proper" is to deal with types such as coo_matrix and other weird stuff that doesn't even have a major axis, that is, the near-infinite set of arbitrary matrix layouts that metacells can't deal with.
The result of to_proper_matrix is a matrix in one of the small set of "reasonable" formats that metacells can deal with (dense, or sparse in row-major or column-major order).
This doesn't absolve one from worrying about the memory layout (column vs. row major layout).
Alas, given that computer hardware works the way it works (and "physics"), working "against the grain" of the data (e.g. summing columns of a row-major matrix) is way slower, and it is in general more efficient to first relayout the data in the proper order before operating on it, so the code is intentionally fussy about that.
You can use allow_inefficient_layout if you want to disable these assertions, but this is strongly discouraged for data of a non-trivial size.
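To make the "major axis" distinction concrete, here is a minimal sketch using only numpy and scipy. The helper major_axis is hypothetical, written for this illustration, and is not a metacells function; it only shows which formats have a major axis and how a plain scipy conversion changes it.

```python
import numpy as np
from scipy import sparse


def major_axis(matrix):
    """Return "row", "column", or None if the matrix has no major axis.

    Hypothetical helper for illustration only; not part of metacells.
    """
    if sparse.issparse(matrix):
        if matrix.format == "csr":
            return "row"
        if matrix.format == "csc":
            return "column"
        return None  # e.g. coo, lil, dok: no compressed major axis
    if isinstance(matrix, np.ndarray) and matrix.ndim == 2:
        if matrix.flags.c_contiguous:
            return "row"
        if matrix.flags.f_contiguous:
            return "column"
    return None


dense = np.arange(12, dtype="float32").reshape(3, 4)  # C order: row-major
coo = sparse.coo_matrix(dense)  # coordinate format: no major axis
csr = coo.tocsr()               # compressed, row-major
csc = csr.tocsc()               # same data, column-major

for name, matrix in [("dense", dense), ("coo", coo), ("csr", csr), ("csc", csc)]:
    print(name, major_axis(matrix))
```

Summing rows works "with the grain" on csr and "against the grain" on csc; both give the same numbers, only the cost differs, which is why relayout before a heavy operation usually pays off.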
In mc.ut.to_proper_matrix, a scipy compressed matrix (csr_matrix, csc_matrix) won't change its layout even if it is in the wrong sparse layout (column-major where row-major is needed, or the reverse). This is because the compressed matrix check happens before the sparse matrix check (and transformation). The returned matrix will be the original compressed matrix, and running will stop if mc.ut.allow_inefficient_layout(False) is set.
Surfaced while running mc.pl.relate_to_lateral_genes with a CSC matrix. I believe checking for sparsity before compression will fix the issue.
My fix for now is to manually relayout the count matrix before calling mc.pl.relate_to_lateral_genes:
full.X = full.X.astype(dtype='float32').tocsr()
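For anyone hitting the same thing, the workaround can be sketched with scipy alone. The helper name ensure_row_major_csr is hypothetical (not a metacells API), and the small matrix stands in for the real count matrix.

```python
import numpy as np
from scipy import sparse


def ensure_row_major_csr(matrix, dtype="float32"):
    """Return the data as a CSR (row-major) sparse matrix of the given dtype.

    Hypothetical helper mirroring the manual workaround; not a metacells API.
    """
    if not sparse.issparse(matrix):
        # Dense input: convert to an array of the target dtype first.
        return sparse.csr_matrix(np.asarray(matrix, dtype=dtype))
    # Sparse input in any format (csc, coo, ...): cast, then relayout.
    return matrix.astype(dtype).tocsr()


# Stand-in for a count matrix that arrived in CSC (column-major) layout.
counts_csc = sparse.csc_matrix(np.array([[0, 2, 0], [1, 0, 3]]))
counts_csr = ensure_row_major_csr(counts_csc)

print(counts_csc.format, "->", counts_csr.format, counts_csr.dtype)
```

With an AnnData object named full, as above, this amounts to full.X = ensure_row_major_csr(full.X) before calling the pipeline step.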
Thanks!