I noticed that my count matrix contained random, very high values in some entries when I read cell-binned GEF files with the st.io.read_gef() function. These values changed every time I read the data, so my count matrix was different on each read. I believe the bug could be fixed by changing io.reader.py as follows:
In the read_gef() function:
line 1091:
from:
exp_matrix = csr_matrix((count, (cell_ind, gene_ind)), shape=(cell_num, gene_num), dtype=np.uint32)
to:
exp_matrix = csr_matrix((count, (cell_ind, gene_ind)), shape=(cell_num, gene_num), dtype=np.uint16)
and line 1135:
from:
exp_matrix = csr_matrix((count, indices, indptr), shape=(cell_num, gene_num), dtype=np.uint32)
to:
exp_matrix = csr_matrix((count, indices, indptr), shape=(cell_num, gene_num), dtype=np.uint16)
Thanks for your feedback. We had already noticed the error you mentioned. It is actually caused by another of our packages, gefpy: the data type returned from gefpy is incorrect. We will fix it in the next version.
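
For readers wondering how a wrong data type alone can produce very large counts, here is a minimal sketch using only NumPy. It is an illustration under an assumption, not a confirmed diagnosis of gefpy: it assumes a buffer of 16-bit counts ends up being interpreted as 32-bit values. The array names are hypothetical, and the sketch does not by itself explain why the values changed between reads.

```python
import numpy as np

# Four small counts stored as 16-bit unsigned integers (little-endian).
# These values are made up for illustration.
counts_u16 = np.array([1, 2, 3, 4], dtype="<u2")

# Reinterpreting the same 8 bytes as 32-bit integers fuses neighbouring
# values: each result is low + high * 65536.
misread_u32 = counts_u16.view("<u4")
print(misread_u32.tolist())  # [131073, 262147]

# Casting (rather than reinterpreting) to uint16 keeps only the low 16 bits,
# which is why forcing dtype=np.uint16 can hide the inflated values.
low_bits = misread_u32.astype(np.uint16)
print(low_bits.tolist())  # [1, 3]
```

Note that the cast recovers only the low half of each fused pair, so forcing uint16 would mask the symptom in this scenario rather than restore every original count.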