theislab / dca

Deep count autoencoder for denoising scRNA-seq data
Apache License 2.0

Unable to run dca with mode='denoise'. 'ValueError: setting an array element with a sequence.' #29

Open shuyizzz opened 5 years ago

shuyizzz commented 5 years ago

Hello,

Thank you for developing this package! I am able to run it with mode='latent', but not with mode='denoise'. I have tried installing different versions of numpy and tensorflow, but nothing seems to work. I am using the following code.

import numpy as np
import pandas as pd
import scanpy as sc
from dca.api import dca

adata = sc.read_10x_mtx(
    './data/filtered_gene_bc_matrices/hg19/',  # following the example in the scanpy tutorial
    var_names='gene_symbols',
    cache=True)

sc.pp.filter_genes(adata, min_counts=1)
dca(adata, mode='denoise', return_info=True)

The error message includes the following.

//anaconda3/envs/scenv/lib/python3.6/site-packages/dca/api.py in dca(adata, mode, ae_type, normalize_per_cell, scale, log1p, hidden_size, hidden_dropout, batchnorm, activation, init, network_kwds, epochs, reduce_lr, early_stop, batch_size, optimizer, learning_rate, random_state, threads, verbose, training_kwds, return_model, return_info, copy)
    193
    194     hist = train(adata[adata.obs.dca_split == 'train'], net, **training_kwds)
--> 195     res = net.predict(adata, mode, return_info, copy)
    196     adata = res if copy else adata
    197

//anaconda3/envs/scenv/lib/python3.6/site-packages/dca/network.py in predict(self, adata, mode, return_info, copy, colnames)
    402         name='mean')(self.decoder_output)
    403     output = ColwiseMultLayer([mean, self.sf_layer])
--> 404     output = SliceLayer(0, name='slice')([output, disp, pi])
    405
    406     zinb = ZINB(pi, theta=disp, ridge_lambda=self.ridge, debug=self.debug)

//anaconda3/envs/scenv/lib/python3.6/site-packages/dca/network.py in predict(self, adata, mode, return_info, copy)
    200     adata.uns['dca_loss'] = self.model.test_on_batch({'count': adata.X,
    201         'size_factors': adata.obs.size_factors},
--> 202         adata.raw.X)
    203
    204     if mode in ('latent', 'full'):

//anaconda3/envs/scenv/lib/python3.6/site-packages/keras/engine/training.py in test_on_batch(self, x, y, sample_weight)
   1486     ins = x + y + sample_weights
   1487     self._make_test_function()
-> 1488     outputs = self.test_function(ins)
   1489     return unpack_singleton(outputs)
   1490

//anaconda3/envs/scenv/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
   2977         return self._legacy_call(inputs)
   2978
-> 2979     return self._call(inputs)
   2980 else:
   2981     if py_any(is_tensor(x) for x in inputs):

//anaconda3/envs/scenv/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
   2915     array_vals.append(
   2916         np.asarray(value,
-> 2917             dtype=tf.as_dtype(tensor.dtype).as_numpy_dtype))
   2918 if self.feed_dict:
   2919     for key in sorted(self.feed_dict.keys()):

//anaconda3/envs/scenv/lib/python3.6/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)

ValueError: setting an array element with a sequence.

This is the problem I have with mode='denoise'. I can run dca with mode='latent', but I also need the mean output, which I could not find in the AnnData object even with return_info=True.

Thank you!

fangfang0906 commented 4 years ago

Hi, I ran into the same issue and found the problem. In our case, adata.X is a sparse matrix. It needs to be converted to a NumPy array via adata.X = adata.X.toarray() before running DCA with mode "denoise". The sparse matrix works only with mode "latent".
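A minimal sketch of that conversion, assuming adata.X is a SciPy sparse matrix (as it typically is after sc.read_10x_mtx). The helper name `densify` and the toy matrix are made up for illustration; only the `.toarray()` call comes from the workaround described above.

```python
import numpy as np
from scipy import sparse


def densify(X):
    """Return X as a dense NumPy array (hypothetical helper, not part of dca)."""
    if sparse.issparse(X):
        # SciPy sparse matrices expose .toarray() for the dense conversion
        return X.toarray()
    # Already dense (or array-like): just make sure it is an ndarray
    return np.asarray(X)


# Toy sparse count matrix standing in for adata.X
counts = sparse.csr_matrix(np.array([[0, 2, 0], [1, 0, 3]], dtype=np.float32))
dense = densify(counts)

# With a real AnnData object, the same step would go before the dca call:
#   adata.X = densify(adata.X)
#   dca(adata, mode='denoise', return_info=True)
```

Keep in mind that a dense copy of a large count matrix can use considerably more memory than the sparse original.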

afrendeiro commented 4 years ago

@fangfang0906 I experience the same: the problem seems specific to sparse input.