dpeerlab / SEACells

SEACells algorithm for Inference of transcriptional and epigenomic cellular states from single-cell genomics data
GNU General Public License v2.0

ValueError: row, column, and data array must all be the same length when running model.fit #37

Closed: tatyana-perlova closed this issue 1 year ago

tatyana-perlova commented 1 year ago

I get the following error when running model.fit unless I set model.k = len(model.archetypes):

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[38], line 1
----> 1 model.fit(min_iter=1, max_iter=1)

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/SEACells-0.3.3-py3.8.egg/SEACells/cpu.py:608, in SEACellsCPU.fit(self, max_iter, min_iter, initial_archetypes, initial_assignments)
    605 if max_iter < min_iter:
    606     raise ValueError(
    607         "The maximum number of iterations specified is lower than the minimum number of iterations specified.")
--> 608 self._fit(max_iter=max_iter, min_iter=min_iter, initial_archetypes=initial_archetypes,
    609           initial_assignments=initial_assignments)

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/SEACells-0.3.3-py3.8.egg/SEACells/cpu.py:562, in SEACellsCPU._fit(self, max_iter, min_iter, initial_archetypes, initial_assignments)
    550 def _fit(self, max_iter: int = 50, min_iter: int = 10, initial_archetypes=None, initial_assignments=None):
    551     """
    552     Internal method to compute archetypes and loadings given kernel matrix K.
    553     Iteratively updates A and B matrices until maximum number of iterations or convergence has been achieved.
   (...)
    560     :return: None
    561     """
--> 562     self.initialize(initial_archetypes=initial_archetypes, initial_assignments=initial_assignments)
    564     converged = False
    565     n_iter = 0

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/SEACells-0.3.3-py3.8.egg/SEACells/cpu.py:220, in SEACellsCPU.initialize(self, initial_archetypes, initial_assignments)
    218 rows = self.archetypes
    219 shape = (n, k)
--> 220 B0 = csr_matrix((np.ones(len(rows)), (rows, cols)), shape=shape)
    222 self.B0 = B0
    223 B = self.B0.copy()

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/scipy/sparse/_compressed.py:53, in _cs_matrix.__init__(self, arg1, shape, dtype, copy)
     49 else:
     50     if len(arg1) == 2:
     51         # (data, ij) format
     52         other = self.__class__(
---> 53             self._coo_container(arg1, shape=shape, dtype=dtype)
     54         )
     55         self._set_self(other)
     56     elif len(arg1) == 3:
     57         # (data, indices, indptr) format

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/scipy/sparse/_coo.py:196, in coo_matrix.__init__(self, arg1, shape, dtype, copy)
    193 if dtype is not None:
    194     self.data = self.data.astype(dtype, copy=False)
--> 196 self._check()

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/scipy/sparse/_coo.py:281, in coo_matrix._check(self)
    278 self.col = np.asarray(self.col, dtype=idx_dtype)
    279 self.data = to_native(self.data)
--> 281 if self.nnz > 0:
    282     if self.row.max() >= self.shape[0]:
    283         raise ValueError('row index exceeds matrix dimensions')

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/scipy/sparse/_base.py:299, in spmatrix.nnz(self)
    291 @property
    292 def nnz(self):
    293     """Number of stored values, including explicit zeros.
    294 
    295     See also
    296     --------
    297     count_nonzero : Number of non-zero entries
    298     """
--> 299     return self.getnnz()

File ~/anaconda3/envs/scanpy_env/lib/python3.8/site-packages/scipy/sparse/_coo.py:243, in coo_matrix.getnnz(self, axis)
    241 nnz = len(self.data)
    242 if nnz != len(self.row) or nnz != len(self.col):
--> 243     raise ValueError('row, column, and data array must all be the '
    244                      'same length')
    246 if self.data.ndim != 1 or self.row.ndim != 1 or \
    247         self.col.ndim != 1:
    248     raise ValueError('row, column, and data arrays must be 1-D')

ValueError: row, column, and data array must all be the same length

sitarapersad commented 1 year ago

This is strange; I can't replicate it with my data. model.k is set to the number of SEACells, which should match the length of model.archetypes. Can you print model.k before you set it to len(model.archetypes) and let me know what it is?
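
For reference, the failure can be reproduced outside SEACells with SciPy alone. The sketch below mimics the B0 construction visible in the traceback; `build_b0` is a hypothetical name, and `cols = np.arange(k)` is an assumption, since the line defining `cols` is not shown in the traceback. If that assumption holds, the constructor fails whenever k differs from the number of archetypes.

```python
import numpy as np
from scipy.sparse import csr_matrix


def build_b0(archetypes, n, k):
    """Mimic the B0 construction from the traceback (hypothetical helper)."""
    rows = np.asarray(archetypes)
    cols = np.arange(k)  # assumption: one column per SEACell
    data = np.ones(len(rows))
    # rows, cols and data must all have the same length
    return csr_matrix((data, (rows, cols)), shape=(n, k))


n = 10                  # number of cells (made-up value)
archetypes = [0, 3, 7]  # indices of the archetype cells (made-up values)

# k matches the number of archetypes: construction succeeds
B0 = build_b0(archetypes, n, k=len(archetypes))
print(B0.shape)

# k does not match: SciPy raises ValueError
error = None
try:
    build_b0(archetypes, n, k=5)
except ValueError as err:
    error = err
print(type(error).__name__, error)
```

The exact wording of the ValueError may vary between SciPy versions.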

tatyana-perlova commented 1 year ago

[image: screenshot, not preserved]

sitarapersad commented 1 year ago

Thanks! This should be fixed now :)
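
For anyone still on an older install, the workaround from the original report (setting model.k = len(model.archetypes) before calling model.fit) can be wrapped in a small check. This is only a sketch: the thread does not say what the actual fix was, `sync_k_with_archetypes` is a hypothetical helper that is not part of SEACells, and a `SimpleNamespace` stands in for the real model object.

```python
from types import SimpleNamespace


def sync_k_with_archetypes(model):
    """Hypothetical helper: make model.k agree with len(model.archetypes).

    Returns the previous value of k so that a mismatch can be reported.
    """
    old_k = model.k
    n_archetypes = len(model.archetypes)
    if old_k != n_archetypes:
        model.k = n_archetypes
    return old_k


# Stand-in for a SEACells model; only the two attributes named in the thread.
model = SimpleNamespace(k=5, archetypes=[0, 3, 7])

old_k = sync_k_with_archetypes(model)
print(f"k was {old_k}, now {model.k}")
```

With a real model, the same call would go just before model.fit(...).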