scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

sc.pp.calculate_qc_metrics Runtime Error #1147

Open lyciansarpedon opened 4 years ago

lyciansarpedon commented 4 years ago

(Python & GitHub novice here, apologies in advance.)

While running through a tutorial using the 10x Genomics 3K PBMC dataset in Jupyter Notebook on Windows 10, I hit an error at sc.pp.calculate_qc_metrics. From a quick look with my untrained eyes, this may be less a scanpy issue than an underlying data type conflict in numba and/or llvmlite?

Trimmed down code I used to reach that point (the skipped steps, in ellipses, don't seem to be necessary, but I may still have a few extras there):

import scanpy as sc
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

...

adata = sc.read_10x_mtx("/PBMC_10X/")

...

adata_10x = sc.read_10x_mtx("/PBMC_10X/")

...

sc.pp.calculate_qc_metrics(adata_10x, inplace = True)

That spat out:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~\anaconda3\lib\site-packages\numba\errors.py in new_error_context(fmt_, *args, **kwargs)
    716     try:
--> 717         yield
    718     except NumbaError as e:

~\anaconda3\lib\site-packages\numba\lowering.py in lower_block(self, block)
    287                                    loc=self.loc, errcls_=defaulterrcls):
--> 288                 self.lower_inst(inst)
    289         self.post_block(block)

~\anaconda3\lib\site-packages\numba\lowering.py in lower_inst(self, inst)
    475                 if isinstance(inst, _class):
--> 476                     func(self, inst)
    477                     return

~\anaconda3\lib\site-packages\numba\npyufunc\parfor.py in _lower_parfor_parallel(lowerer, parfor)
    240         lowerer, parfor, typemap, typingctx, targetctx, flags, {},
--> 241         bool(alias_map), index_var_typ, parfor.races)
    242     numba.parfor.sequential_parfor_lowering = False

~\anaconda3\lib\site-packages\numba\npyufunc\parfor.py in _create_gufunc_for_parfor_body(lowerer, parfor, typemap, typingctx, targetctx, flags, locals, has_aliases, index_var_typ, races)
   1168         flags,
-> 1169         locals)
   1170 

~\anaconda3\lib\site-packages\numba\compiler.py in compile_ir(typingctx, targetctx, func_ir, args, return_type, flags, locals, lifted, lifted_from, is_lifted_loop, library, pipeline_class)
    614         return pipeline.compile_ir(func_ir=func_ir, lifted=lifted,
--> 615                                    lifted_from=lifted_from)
    616 

~\anaconda3\lib\site-packages\numba\compiler.py in compile_ir(self, func_ir, lifted, lifted_from)
    340         FixupArgs().run_pass(self.state)
--> 341         return self._compile_ir()
    342 

~\anaconda3\lib\site-packages\numba\compiler.py in _compile_ir(self)
    399         assert self.state.func_ir is not None
--> 400         return self._compile_core()
    401 

~\anaconda3\lib\site-packages\numba\compiler.py in _compile_core(self)
    372                 if is_final_pipeline:
--> 373                     raise e
    374         else:

~\anaconda3\lib\site-packages\numba\compiler.py in _compile_core(self)
    363             try:
--> 364                 pm.run(self.state)
    365                 if self.state.cr is not None:

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in run(self, state)
    346                 patched_exception = self._patch_error(msg, e)
--> 347                 raise patched_exception
    348 

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in run(self, state)
    337                 if isinstance(pass_inst, CompilerPass):
--> 338                     self._runPass(idx, pass_inst, state)
    339                 else:

~\anaconda3\lib\site-packages\numba\compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     31             with self:
---> 32                 return func(*args, **kwargs)
     33         return _acquire_compile_lock

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in _runPass(self, index, pss, internal_state)
    301         with SimpleTimer() as pass_time:
--> 302             mutated |= check(pss.run_pass, internal_state)
    303         with SimpleTimer() as finalize_time:

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in check(func, compiler_state)
    274         def check(func, compiler_state):
--> 275             mangled = func(compiler_state)
    276             if mangled not in (True, False):

~\anaconda3\lib\site-packages\numba\typed_passes.py in run_pass(self, state)
    406         # TODO: Pull this out into the pipeline
--> 407         NativeLowering().run_pass(state)
    408         lowered = state['cr']

~\anaconda3\lib\site-packages\numba\typed_passes.py in run_pass(self, state)
    348                                        metadata=metadata)
--> 349                 lower.lower()
    350                 if not flags.no_cpython_wrapper:

~\anaconda3\lib\site-packages\numba\lowering.py in lower(self)
    231         # Materialize LLVM Module
--> 232         self.library.add_ir_module(self.module)
    233 

~\anaconda3\lib\site-packages\numba\targets\codegen.py in add_ir_module(self, ir_module)
    200         ir = cgutils.normalize_ir_text(str(ir_module))
--> 201         ll_module = ll.parse_assembly(ir)
    202         ll_module.name = ir_module.name

~\anaconda3\lib\site-packages\llvmlite\binding\module.py in parse_assembly(llvmir, context)
     25             mod.close()
---> 26             raise RuntimeError("LLVM IR parsing error\n{0}".format(errmsg))
     27     return mod

RuntimeError: Failed in nopython mode pipeline (step: nopython mode backend)
LLVM IR parsing error
<string>:4053:36: error: '%.2725' defined with type 'i64' but expected 'i32'
  %".2726" = icmp eq i32 %".2724", %".2725"
                                   ^

During handling of the above exception, another exception occurred:

LoweringError                             Traceback (most recent call last)
<ipython-input-21-b19e785cf655> in <module>
----> 1 sc.pp.calculate_qc_metrics(adata_10x, inplace = True)

~\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py in calculate_qc_metrics(adata, expr_type, var_type, qc_vars, percent_top, layer, use_raw, inplace, parallel)
    281         percent_top=percent_top,
    282         inplace=inplace,
--> 283         X=X,
    284     )
    285     var_metrics = describe_var(

~\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py in describe_obs(adata, expr_type, var_type, qc_vars, percent_top, layer, use_raw, inplace, X, parallel)
    107     if percent_top:
    108         percent_top = sorted(percent_top)
--> 109         proportions = top_segment_proportions(X, percent_top)
    110         for i, n in enumerate(percent_top):
    111             obs_metrics[f"pct_{expr_type}_in_top_{n}_{var_type}"] = (

~\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py in top_segment_proportions(mtx, ns)
    364             mtx = csr_matrix(mtx)
    365         return top_segment_proportions_sparse_csr(
--> 366             mtx.data, mtx.indptr, np.array(ns, dtype=np.int)
    367         )
    368     else:

~\anaconda3\lib\site-packages\numba\dispatcher.py in _compile_for_args(self, *args, **kws)
    418                     e.patch_message('\n'.join((str(e).rstrip(), help_msg)))
    419             # ignore the FULL_TRACEBACKS config, this needs reporting!
--> 420             raise e
    421 
    422     def inspect_llvm(self, signature=None):

~\anaconda3\lib\site-packages\numba\dispatcher.py in _compile_for_args(self, *args, **kws)
    351                 argtypes.append(self.typeof_pyval(a))
    352         try:
--> 353             return self.compile(tuple(argtypes))
    354         except errors.ForceLiteralArg as e:
    355             # Received request for compiler re-entry with the list of arguments

~\anaconda3\lib\site-packages\numba\compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     30         def _acquire_compile_lock(*args, **kwargs):
     31             with self:
---> 32                 return func(*args, **kwargs)
     33         return _acquire_compile_lock
     34 

~\anaconda3\lib\site-packages\numba\dispatcher.py in compile(self, sig)
    766             self._cache_misses[sig] += 1
    767             try:
--> 768                 cres = self._compiler.compile(args, return_type)
    769             except errors.ForceLiteralArg as e:
    770                 def folded(args, kws):

~\anaconda3\lib\site-packages\numba\dispatcher.py in compile(self, args, return_type)
     75 
     76     def compile(self, args, return_type):
---> 77         status, retval = self._compile_cached(args, return_type)
     78         if status:
     79             return retval

~\anaconda3\lib\site-packages\numba\dispatcher.py in _compile_cached(self, args, return_type)
     89 
     90         try:
---> 91             retval = self._compile_core(args, return_type)
     92         except errors.TypingError as e:
     93             self._failed_cache[key] = e

~\anaconda3\lib\site-packages\numba\dispatcher.py in _compile_core(self, args, return_type)
    107                                       args=args, return_type=return_type,
    108                                       flags=flags, locals=self.locals,
--> 109                                       pipeline_class=self.pipeline_class)
    110         # Check typing error if object mode is used
    111         if cres.typing_error is not None and not flags.enable_pyobject:

~\anaconda3\lib\site-packages\numba\compiler.py in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library, pipeline_class)
    549     pipeline = pipeline_class(typingctx, targetctx, library,
    550                               args, return_type, flags, locals)
--> 551     return pipeline.compile_extra(func)
    552 
    553 

~\anaconda3\lib\site-packages\numba\compiler.py in compile_extra(self, func)
    329         self.state.lifted = ()
    330         self.state.lifted_from = None
--> 331         return self._compile_bytecode()
    332 
    333     def compile_ir(self, func_ir, lifted=(), lifted_from=None):

~\anaconda3\lib\site-packages\numba\compiler.py in _compile_bytecode(self)
    391         """
    392         assert self.state.func_ir is None
--> 393         return self._compile_core()
    394 
    395     def _compile_ir(self):

~\anaconda3\lib\site-packages\numba\compiler.py in _compile_core(self)
    371                 self.state.status.fail_reason = e
    372                 if is_final_pipeline:
--> 373                     raise e
    374         else:
    375             raise CompilerError("All available pipelines exhausted")

~\anaconda3\lib\site-packages\numba\compiler.py in _compile_core(self)
    362             res = None
    363             try:
--> 364                 pm.run(self.state)
    365                 if self.state.cr is not None:
    366                     break

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in run(self, state)
    345                     (self.pipeline_name, pass_desc)
    346                 patched_exception = self._patch_error(msg, e)
--> 347                 raise patched_exception
    348 
    349     def dependency_analysis(self):

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in run(self, state)
    336                 pass_inst = _pass_registry.get(pss).pass_inst
    337                 if isinstance(pass_inst, CompilerPass):
--> 338                     self._runPass(idx, pass_inst, state)
    339                 else:
    340                     raise BaseException("Legacy pass in use")

~\anaconda3\lib\site-packages\numba\compiler_lock.py in _acquire_compile_lock(*args, **kwargs)
     30         def _acquire_compile_lock(*args, **kwargs):
     31             with self:
---> 32                 return func(*args, **kwargs)
     33         return _acquire_compile_lock
     34 

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in _runPass(self, index, pss, internal_state)
    300             mutated |= check(pss.run_initialization, internal_state)
    301         with SimpleTimer() as pass_time:
--> 302             mutated |= check(pss.run_pass, internal_state)
    303         with SimpleTimer() as finalize_time:
    304             mutated |= check(pss.run_finalizer, internal_state)

~\anaconda3\lib\site-packages\numba\compiler_machinery.py in check(func, compiler_state)
    273 
    274         def check(func, compiler_state):
--> 275             mangled = func(compiler_state)
    276             if mangled not in (True, False):
    277                 msg = ("CompilerPass implementations should return True/False. "

~\anaconda3\lib\site-packages\numba\typed_passes.py in run_pass(self, state)
    405 
    406         # TODO: Pull this out into the pipeline
--> 407         NativeLowering().run_pass(state)
    408         lowered = state['cr']
    409         signature = typing.signature(state.return_type, *state.args)

~\anaconda3\lib\site-packages\numba\typed_passes.py in run_pass(self, state)
    347                 lower = lowering.Lower(targetctx, library, fndesc, interp,
    348                                        metadata=metadata)
--> 349                 lower.lower()
    350                 if not flags.no_cpython_wrapper:
    351                     lower.create_cpython_wrapper(flags.release_gil)

~\anaconda3\lib\site-packages\numba\lowering.py in lower(self)
    193         if self.generator_info is None:
    194             self.genlower = None
--> 195             self.lower_normal_function(self.fndesc)
    196         else:
    197             self.genlower = self.GeneratorLower(self)

~\anaconda3\lib\site-packages\numba\lowering.py in lower_normal_function(self, fndesc)
    246         # Init argument values
    247         self.extract_function_arguments()
--> 248         entry_block_tail = self.lower_function_body()
    249 
    250         # Close tail of entry block

~\anaconda3\lib\site-packages\numba\lowering.py in lower_function_body(self)
    271             bb = self.blkmap[offset]
    272             self.builder.position_at_end(bb)
--> 273             self.lower_block(block)
    274 
    275         self.post_lower()

~\anaconda3\lib\site-packages\numba\lowering.py in lower_block(self, block)
    286             with new_error_context('lowering "{inst}" at {loc}', inst=inst,
    287                                    loc=self.loc, errcls_=defaulterrcls):
--> 288                 self.lower_inst(inst)
    289         self.post_block(block)
    290 

~\anaconda3\lib\contextlib.py in __exit__(self, type, value, traceback)
    128                 value = type()
    129             try:
--> 130                 self.gen.throw(type, value, traceback)
    131             except StopIteration as exc:
    132                 # Suppress StopIteration *unless* it's the same exception that

~\anaconda3\lib\site-packages\numba\errors.py in new_error_context(fmt_, *args, **kwargs)
    723         from numba import config
    724         tb = sys.exc_info()[2] if config.FULL_TRACEBACKS else None
--> 725         six.reraise(type(newerr), newerr, tb)
    726 
    727 

~\anaconda3\lib\site-packages\numba\six.py in reraise(tp, value, tb)
    667         if value.__traceback__ is not tb:
    668             raise value.with_traceback(tb)
--> 669         raise value
    670 
    671 else:

LoweringError: Failed in nopython mode pipeline (step: nopython mode backend)
Failed in nopython mode pipeline (step: nopython mode backend)
LLVM IR parsing error
<string>:4053:36: error: '%.2725' defined with type 'i64' but expected 'i32'
  %".2726" = icmp eq i32 %".2724", %".2725"
                                   ^

File "..\..\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py", line 399:
def top_segment_proportions_sparse_csr(data, indptr, ns):
    <source elided>
    partitioned = np.zeros((indptr.size - 1, maxidx), dtype=data.dtype)
    for i in numba.prange(indptr.size - 1):
    ^

[1] During: lowering "id=13[LoopNest(index_variable = parfor_index.260, range = (0, $122binary_subtract.5, 1))]{130: <ir.Block at C:\Users\lyciansarpedon\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py (399)>, 400: <ir.Block at C:\Users\lyciansarpedon\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py (405)>, 402: <ir.Block at C:\Users\lyciansarpedon\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py (406)>, 276: <ir.Block at C:\Users\lyciansarpedon\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py (403)>, 318: <ir.Block at C:\Users\lyciansarpedon\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py (404)>}Var(parfor_index.260, _qc.py:399)" at C:\Users\lyciansarpedon\anaconda3\lib\site-packages\scanpy\preprocessing\_qc.py (399)

Unlikely to be related, but this was after I had issues installing scanpy from conda (as in #1142), which I got around by installing through pip.

Versions:

scanpy==1.4.6 anndata==0.7.1 umap==0.3.10 numpy==1.18.1 scipy==1.4.1 pandas==1.0.1 scikit-learn==0.22.1 statsmodels==0.11.0

giovp commented 4 years ago

I've seen exactly this recently on a Windows laptop. Not sure, but it sounds like something is messed up with the environment. Are you working in the base env? Try creating a fresh conda environment and installing scanpy there.

lyciansarpedon commented 4 years ago

> I've seen exactly this recently on a Windows laptop. Not sure, but it sounds like something is messed up with the environment. Are you working in the base env? Try creating a fresh conda environment and installing scanpy there.

Thanks for the suggestion, @giovp. Was doing it in the base env earlier. Made a new env and tried it again, but ran into the same exact error.

ivirshup commented 4 years ago

What's your version of numba in this environment?

lyciansarpedon commented 4 years ago

> What's your version of numba in this environment?

It's version 0.48.0, @ivirshup. For what it's worth, the build is py38he350917_0 and the source is conda-forge. (In the base environment where I was getting the same error, the version is the same but the build is py37h47e9c7a_0.)
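For anyone else hitting this, a minimal sketch of how the relevant versions could be collected in one go. It only uses the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_versions` and the package list are my own choices for illustration, not anything scanpy provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in the environment this interpreter runs in.
            found[name] = None
    return found


# Hypothetical list: the packages that show up in the traceback plus their basics.
report = installed_versions(
    ["scanpy", "anndata", "numba", "llvmlite", "numpy", "scipy"]
)
for name, ver in report.items():
    print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Running it inside the notebook kernel (rather than a separate terminal) reports the environment the failing code actually uses.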

ivirshup commented 4 years ago

Could you show the result of print(adata_10x.X)?


I'm not sure how much I can help with this, since I don't have a Windows machine to try this on. I think this might be best raised as an issue over on the numba repository. @giovp, maybe you could look more into how you came across this?

lyciansarpedon commented 4 years ago

Here's what I got, @ivirshup:

print(adata_10x.X)

  (0, 70)  1.0
  (0, 166)  1.0
  (0, 178)  2.0
  (0, 326)  1.0
  (0, 363)  1.0
  (0, 410)  1.0
  (0, 412)  1.0
  (0, 492)  41.0
  (0, 494)  1.0
  (0, 495)  1.0
  (0, 496)  1.0
  (0, 525)  1.0
  (0, 556)  2.0
  (0, 558)  6.0
  (0, 671)  1.0
  (0, 684)  1.0
  (0, 735)  1.0
  (0, 770)  1.0
  (0, 793)  1.0
  (0, 820)  1.0
  (0, 859)  2.0
  (0, 871)  1.0
  (0, 908)  15.0
  (0, 926)  1.0
  (0, 941)  1.0
  : :
  (2699, 31849) 1.0
  (2699, 31855) 1.0
  (2699, 31887) 1.0
  (2699, 31949) 2.0
  (2699, 31970) 2.0
  (2699, 32022) 17.0
  (2699, 32044) 1.0
  (2699, 32047) 2.0
  (2699, 32059) 1.0
  (2699, 32065) 1.0
  (2699, 32066) 1.0
  (2699, 32082) 1.0
  (2699, 32186) 1.0
  (2699, 32193) 1.0
  (2699, 32322) 1.0
  (2699, 32442) 1.0
  (2699, 32543) 1.0
  (2699, 32581) 1.0
  (2699, 32641) 1.0
  (2699, 32696) 3.0
  (2699, 32697) 1.0
  (2699, 32698) 7.0
  (2699, 32702) 1.0
  (2699, 32705) 1.0
  (2699, 32708) 3.0

I took a look over at the numba repository. Any chance this is related to https://github.com/numba/numba/issues/4529 (i.e. #843, which I somehow missed when I was looking through issues in this repository before I posted)?

ivirshup commented 4 years ago

Ah, sorry, looks like I meant to say print(repr(adata.X)).
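To illustrate the difference on a toy matrix (a sketch only; the small array below is made up and stands in for adata.X): print() on a SciPy sparse matrix lists the stored coordinates and values, while repr() gives a one-line summary that includes the dtype and storage format. The exact wording of that summary varies between SciPy versions.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for adata.X: 2 cells x 3 genes, stored as float32 CSR.
dense = np.array([[0.0, 1.0, 0.0], [2.0, 0.0, 41.0]], dtype=np.float32)
X = csr_matrix(dense)

# Coordinate listing of every stored value (long for real data).
print(X)

# One-line summary: shape, dtype, number of stored elements, format.
print(repr(X))
```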

I think the specific bug in that previous issue was fixed, but this seems kind of related in that it's Windows and 32-bit numbers.
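A rough sketch for checking which integer widths are involved, assuming the mismatch comes from the arrays handed to the compiled function (mtx.data, mtx.indptr and the `ns` array in the traceback). SciPy usually stores CSR index arrays as int32 when they fit, and with the NumPy version listed above, `dtype=int` maps to a 32-bit integer on Windows. The toy matrix and the `ns` values are made up for illustration, and whether widening to int64 avoids the LLVM error here is an untested assumption.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for adata.X.
X = csr_matrix(np.array([[0, 1, 0], [2, 0, 41]], dtype=np.float32))

# Illustrative values only; built the same way as in the traceback (dtype=int).
ns = np.array([50, 100, 200, 500], dtype=int)

# Report the dtype of every array that would reach the compiled function.
for name, arr in [("data", X.data), ("indices", X.indices),
                  ("indptr", X.indptr), ("ns", ns)]:
    print(f"{name:8s} {arr.dtype}")

# Pointer-sized integer on this platform, for comparison.
print("intp    ", np.dtype(np.intp))

# Widening to 64 bits keeps the values and only changes the storage width.
indptr64 = X.indptr.astype(np.int64)
ns64 = ns.astype(np.int64)
```

If the first report shows int32 for indptr or ns while intp is int64, that would at least be consistent with the i64 vs i32 complaint in the error message.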

ivirshup commented 4 years ago

Overall, I think you should open an issue on numba for this. Please tag me if you do!