theislab / single-cell-tutorial

Single-cell current best practices tutorial case study for the paper: Luecken and Theis, "Current best practices in single-cell RNA-seq analysis: a tutorial"

ComBat Error #29

Closed Zifeng-L closed 4 years ago

Zifeng-L commented 4 years ago

Hi there. After working through the tutorial, I tried to analyze my own data. Everything went well until the batch correction step. I can't figure out what went wrong or how to debug it. Can anyone help? Thanks! `

# ComBat batch correction
sc.pp.combat(adata, key='sample')

Standardizing Data across genes.

AssertionError                            Traceback (most recent call last)
in ()
      1 # ComBat batch correction
----> 2 sc.pp.combat(adata, key='sample')

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/scanpy/preprocessing/_combat.py in combat(adata, key, covariates, inplace)
    210     # standardize across genes using a pooled variance estimator
    211     logg.info("Standardizing Data across genes.\n")
--> 212     s_data, design, var_pooled, stand_mean = _standardize_data(model, data, key)
    213
    214     # fitting the parameters on the standardized data

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/scanpy/preprocessing/_combat.py in _standardize_data(model, data, batch_key)
    103     n_array = float(sum(n_batches))
    104
--> 105     design = _design_matrix(model, batch_key, batch_levels)
    106
    107     # compute pooled variance estimator

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/scanpy/preprocessing/_combat.py in _design_matrix(model, batch_key, batch_levels)
     37         "~ 0 + C(Q('{}'), levels=batch_levels)".format(batch_key),
     38         model,
---> 39         return_type="dataframe",
     40     )
     41     model = model.drop([batch_key], axis=1)

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/highlevel.py in dmatrix(formula_like, data, eval_env, NA_action, return_type)
    289     eval_env = EvalEnvironment.capture(eval_env, reference=1)
    290     (lhs, rhs) = _do_highlevel_design(formula_like, data, eval_env,
--> 291                                       NA_action, return_type)
    292     if lhs.shape[1] != 0:
    293         raise PatsyError("encountered outcome variables for a model "

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/highlevel.py in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
    163         return iter([data])
    164     design_infos = _try_incr_builders(formula_like, data_iter_maker, eval_env,
--> 165                                       NA_action)
    166     if design_infos is not None:
    167         return build_design_matrices(design_infos, data,

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/highlevel.py in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
     60                              "ascii-only, or else upgrade to Python 3.")
     61     if isinstance(formula_like, str):
---> 62         formula_like = ModelDesc.from_formula(formula_like)
     63         # fallthrough
     64     if isinstance(formula_like, ModelDesc):

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/desc.py in from_formula(cls, tree_or_string)
    162             tree = tree_or_string
    163         else:
--> 164             tree = parse_formula(tree_or_string)
    165         value = Evaluator().eval(tree, require_evalexpr=False)
    166         assert isinstance(value, cls)

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/parse_formula.py in parse_formula(code, extra_operators)
    146     tree = infix_parse(_tokenize_formula(code, operator_strings),
    147                        operators,
--> 148                        _atomic_token_types)
    149     if not isinstance(tree, ParseNode) or tree.type != "~":
    150         tree = ParseNode("~", None, [tree], tree.origin)

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/infix_parser.py in infix_parse(tokens, operators, atomic_types, trace)
    208
    209     want_noun = True
--> 210     for token in token_source:
    211         if c.trace:
    212             print("Reading next token (want_noun=%r)" % (want_noun,))

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/parse_formula.py in _tokenize_formula(code, operator_strings)
     92         else:
     93             it.push_back((pytype, token_string, origin))
---> 94             yield _read_python_expr(it, end_tokens)
     95
     96 def test__tokenize_formula():

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/parse_formula.py in _read_python_expr(it, end_tokens)
     42     origins = []
     43     bracket_level = 0
---> 44     for pytype, token_string, origin in it:
     45         assert bracket_level >= 0
     46         if bracket_level == 0 and token_string in end_tokens:

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/util.py in next(self)
    319         else:
    320             # May raise StopIteration
--> 321             return six.advance_iterator(self._it)
    322     __next__ = next
    323

~/anaconda2/envs/cnmf_env/lib/python3.6/site-packages/patsy/tokens.py in python_tokenize(code)
     33                 break
     34             origin = Origin(code, start, end)
---> 35             assert pytype not in (tokenize.NL, tokenize.NEWLINE)
     36             if pytype == tokenize.ERRORTOKEN:
     37                 raise PatsyError("error tokenizing input "

AssertionError: `
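The last frame of the traceback fails on `assert pytype not in (tokenize.NL, tokenize.NEWLINE)`, which concerns how the formula string is tokenized rather than the contents of `adata`. As a rough diagnostic sketch, the formula template from the `_design_matrix` frame can be passed to Python's standard `tokenize` module to see which token types the current interpreter produces. The helper name `formula_tokens` is made up for illustration, and whether a trailing line-end token is the actual cause of this error is an assumption, not something established in this thread.

```python
import io
import tokenize


def formula_tokens(formula):
    """Return (token type name, token text) pairs for a formula string."""
    readline = io.StringIO(formula).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


# Formula template copied from the _design_matrix frame, filled with key='sample'.
formula = "~ 0 + C(Q('{}'), levels=batch_levels)".format("sample")
tokens = formula_tokens(formula)

# Token types named in the failing assertion.
line_end_tokens = [t for t in tokens if t[0] in ("NL", "NEWLINE")]

for name, text in tokens:
    print(name, repr(text))
print("NL/NEWLINE tokens found:", len(line_end_tokens))
```

If the count is nonzero on the interpreter in use, the outcome would depend on how the installed patsy version treats such tokens, so trying a different patsy version in a fresh environment may be worth a test.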
LuckyMD commented 4 years ago

Hi,

Does your AnnData object have the obs column 'sample'? Could you show the output of just typing adata in your console?

Zifeng-L commented 4 years ago

Hello

Does your AnnData object have the obs column 'sample'? Could you show the output of just typing adata in the console?

AnnData object with n_obs × n_vars = 23120 × 25905
    obs: 'sample', 'sample_id', 'n_counts', 'log_counts', 'n_genes', 'mt_frac', 'size_factors'
    var: 'gene_id', 'n_cells'
    uns: 'sample_colors', 'log1p'
    layers: 'counts'

LuckyMD commented 4 years ago

Could you try running adata.obs['sample'] = adata.obs['sample'].astype('category') and then running again?
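A minimal sketch of what that cast does, using a small made-up DataFrame in place of `adata.obs` (the labels `s1` to `s3` are hypothetical):

```python
import pandas as pd

# Hypothetical stand-in for adata.obs; the sample labels are made up.
obs = pd.DataFrame({"sample": ["s1", "s1", "s2", "s3"]})
print("dtype before cast:", obs["sample"].dtype)

# Same cast as suggested above, applied to the stand-in column.
obs["sample"] = obs["sample"].astype("category")
print("dtype after cast:", obs["sample"].dtype)

# The distinct batch labels are now stored as categories.
print("categories:", list(obs["sample"].cat.categories))
```

Checking the dtype and the category list after the cast is a quick way to confirm that the batch key looks the way one expects before calling ComBat again.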

Zifeng-L commented 4 years ago

adata.obs['sample'] = adata.obs['sample'].astype('category')

I just tried this; unfortunately, the error still occurs.

LuckyMD commented 4 years ago

This is quite strange. Do all of your samples contain multiple cells, or are there some samples without any cells remaining after filtering? It would be helpful if you had a minimal reproducible example for this. Can you reproduce this error with, for example, sc.datasets.pbmc68k_reduced()?
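One way to check for samples left without cells is to count rows per category in the batch column. The sketch below uses a small made-up DataFrame in place of `adata.obs` and simulates a filter that removes every cell of one sample; the labels are hypothetical, and whether empty batches play any role in this particular error is not established here.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs with a categorical batch column.
obs = pd.DataFrame({"sample": pd.Categorical(["s1", "s1", "s2", "s3", "s3"])})

# Simulate a filtering step that removes every cell of sample "s2".
filtered = obs[obs["sample"] != "s2"].copy()

# For a categorical column, value_counts also lists categories with zero rows.
counts = filtered["sample"].value_counts()
empty_samples = counts[counts == 0].index.tolist()
print("samples without cells:", empty_samples)

# Drop categories that no longer have any cells.
filtered["sample"] = filtered["sample"].cat.remove_unused_categories()
print("remaining categories:", list(filtered["sample"].cat.categories))
```

On a real object the same two calls would be applied to `adata.obs['sample']` after filtering.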

LuckyMD commented 4 years ago

Given no further replies, I will close this issue.