epi2me-labs / wf-single-cell

NUMBA RuntimeError related to cachedir #124

Closed CloXD closed 4 months ago

CloXD commented 4 months ago

Operating System

Linux

Other Linux

RedHat

Workflow Version

latest

Workflow Execution

Command line (Cluster)

Other workflow execution

lsf with singularity profile

EPI2ME Version

No response

CLI command run

No response

Workflow Execution - CLI Execution Profile

None

What happened?

Runtime error:

RuntimeError: cannot cache function 'rdist': no locator available for file '/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py'

I'm not sure why, but apparently the Numba cache directory is either not set or not writable (tested on LSF using Singularity). I worked around it by editing the process_matrix process (subworkflows/process_bams.nf):

process process_matrix {
    label "singlecell"
    cpus  1
    memory "16 GB"
    publishDir "${params.out_dir}/${meta.alias}", mode: 'copy', pattern: "*{mito,umap,raw,processed}*"
    input:
        tuple val(meta), val(feature), path('inputs/matrix*.hdf')
    output:
        tuple val(meta), val(feature), path("${feature}_raw_feature_bc_matrix"), emit: raw
        tuple val(meta), val(feature), path("${feature}_processed_feature_bc_matrix"), emit: processed
        tuple val(meta), val(feature), path("${feature}.expression.mean-per-cell.tsv"), emit: meancell
        // mito per cell makes sense only for feature=gene for now.
        tuple val(meta), val(feature), path("gene.expression.mito-per-cell.tsv"), emit: mitocell, optional: true
        tuple val(meta), val(feature), path("${feature}.expression.umap*.tsv"), emit: umap
    script:
    def mito_prefixes = params.mito_prefix.replaceAll(',', ' ')
    """
    export NUMBA_NUM_THREADS=${task.cpus}
    mkdir -p ./tmp  ### Added: create a local temporary directory
    export NUMBA_CACHE_DIR=./tmp  ### Added: point the Numba cache at it
    workflow-glue process_matrix \
        inputs/matrix*.hdf \
        --feature ${feature} \
        --raw ${feature}_raw_feature_bc_matrix \
        --processed ${feature}_processed_feature_bc_matrix \
        --per_cell_mito ${feature}.expression.mito-per-cell.tsv \
        --per_cell_expr ${feature}.expression.mean-per-cell.tsv \
        --umap_tsv ${feature}.expression.umap.tsv \
        --enable_filtering \
        --min_features $params.matrix_min_genes \
        --min_cells $params.matrix_min_cells \
        --max_mito $params.matrix_max_mito \
        --mito_prefixes $mito_prefixes \
        --norm_count $params.matrix_norm_count \
        --enable_umap \
        --replicates 3
    rm -fr ./tmp ### Removed temporary directory
    """
}
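The added lines in the script block above can also be sketched as standalone shell. This is a minimal sketch, not the workflow's own code: it assumes `mktemp` is available in the container, the directory name `numba_cache.XXXXXX` is made up for illustration, and the final `echo` stands in for the real `workflow-glue process_matrix` call. The `trap` removes the directory even if the command fails, which a trailing `rm -fr` does not.

```shell
#!/bin/sh
set -eu

# Create a private, writable cache directory inside the current (task) directory.
# The template name "numba_cache.XXXXXX" is an illustrative assumption.
NUMBA_CACHE_DIR="$(mktemp -d "$PWD/numba_cache.XXXXXX")"
export NUMBA_CACHE_DIR

# Remove the cache directory when the script exits, whether it succeeded or not.
cleanup() { rm -rf "$NUMBA_CACHE_DIR"; }
trap cleanup EXIT

# Placeholder for the real command (workflow-glue process_matrix ... in the post above).
echo "Numba cache directory: $NUMBA_CACHE_DIR"
```

Inside a Nextflow `script:` string, shell variables such as `$PWD` and `$NUMBA_CACHE_DIR` would need to be escaped (`\$PWD`) so that Nextflow does not try to interpolate them.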

Relevant log output

Command error:
  [08:00:13 - workflow_glue] Bootstrapping CLI.
  /home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1063: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
    @numba.jit()
  /home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1071: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
    @numba.jit()
  /home/epi2melabs/conda/lib/python3.8/site-packages/umap/distances.py:1086: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
    @numba.jit()
  Traceback (most recent call last):
    File "/home/lorenzic/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow-glue", line 7, in <module>
      cli()
    File "/home/lorenzic/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/__init__.py", line 66, in cli
      components = get_components(allowed_components=[sys.argv[1]])
    File "/home/lorenzic/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/__init__.py", line 29, in get_components
      mod = importlib.import_module(f"{_package_name}.{name}")
    File "/home/epi2melabs/conda/lib/python3.8/importlib/__init__.py", line 127, in import_module
      return _bootstrap._gcd_import(name[level:], package, level)
    File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
    File "<frozen importlib._bootstrap>", line 991, in _find_and_load
    File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
    File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
    File "<frozen importlib._bootstrap_external>", line 843, in exec_module
    File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
    File "/home/lorenzic/.nextflow/assets/epi2me-labs/wf-single-cell/bin/workflow_glue/process_matrix.py", line 8, in <module>
      import umap
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/__init__.py", line 2, in <module>
      from .umap_ import UMAP
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/umap_.py", line 41, in <module>
      from umap.layouts import (
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py", line 40, in <module>
      def rdist(x, y):
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/decorators.py", line 234, in wrapper
      disp.enable_caching()
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/dispatcher.py", line 863, in enable_caching
      self._cache = FunctionCache(self.py_func)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 601, in __init__
      self._impl = self._impl_class(py_func)
    File "/home/epi2melabs/conda/lib/python3.8/site-packages/numba/core/caching.py", line 337, in __init__
      raise RuntimeError("cannot cache function %r: no locator available "
  RuntimeError: cannot cache function 'rdist': no locator available for file '/home/epi2melabs/conda/lib/python3.8/site-packages/umap/layouts.py'

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

no

Other demo data information

No response

cjw85 commented 4 months ago

Have you tried the instructions in the Troubleshooting section of the README?

CloXD commented 4 months ago

I hadn't seen it, sorry! However, do you think it's a good idea to have --writable-tmpfs enabled? In that case, Numba's default cache directory is inside the library's source directory (if I'm not misinterpreting), and in a containerized environment it might be cleaner to set NUMBA_CACHE_DIR and other cache/temporary environment variables to /tmp. Sorry again for the bother, and thanks for the really nice pipeline.
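One way to express the suggestion above is to pick the first writable location at run time and point `NUMBA_CACHE_DIR` at it. The sketch below is only an illustration: the helper name `first_writable_dir`, the `numba_cache_$$` directory names, and the candidate order (`TMPDIR`, then `/tmp`, then the working directory) are all assumptions, and whether `/tmp` is writable inside the container depends on how Singularity is configured on the cluster.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the first candidate directory that can be
# created (or already exists) and is writable; fail if none qualifies.
first_writable_dir() {
    for dir in "$@"; do
        if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Candidate order is an assumption: scheduler-provided TMPDIR first,
# then /tmp, then the current working directory.
NUMBA_CACHE_DIR="$(first_writable_dir \
    "${TMPDIR:-/tmp}/numba_cache_$$" \
    "/tmp/numba_cache_$$" \
    "$PWD/numba_cache_$$")"
export NUMBA_CACHE_DIR

echo "Using NUMBA_CACHE_DIR=$NUMBA_CACHE_DIR"
```

Because the variable is exported before Python starts, the cache is redirected without making the container image itself writable.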