nf-core / scdownstream

A single cell transcriptomics pipeline for QC, integration and making the data presentable
https://nf-co.re/scdownstream
MIT License

scanpy GPU containers fail #105

Open KallyopeComp opened 1 day ago

KallyopeComp commented 1 day ago

Description of the bug

Hello,

This project is a very exciting development for reproducible and standardized single cell analysis. However, I'm having some difficulty getting GPU acceleration to work for some of the tasks.

When running the pipeline with the GPU profile on AWS Batch, all of the scanpy GPU processes (SCANPY_SCRUBLET, SCANPY_HVGS, SCANPY_HARMONY, SCANPY_LEIDEN, SCANPY_NEIGHBORS) fail with the error: ImportError: /opt/conda/lib/python3.11/lib-dynload/_sqlite3.cpython-311-x86_64-linux-gnu.so: undefined symbol: sqlite3_deserialize

This looks like a problem with software missing from the Docker container rather than with the executor, but I don't have on-prem resources to test running locally (and if it were the container, then presumably others would have the same issue).
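One way to narrow this down would be to run a small check inside the GPU container. The sketch below is only a diagnostic idea, not something from the pipeline: the helper names `check_sqlite3_import` and `library_has_symbol` are made up for illustration. It assumes a Linux container and that `ctypes.util.find_library` can locate the SQLite library, which may not hold for a conda-installed libsqlite3 that is outside the system loader path.

```python
from __future__ import annotations

import ctypes
import ctypes.util
import importlib


def check_sqlite3_import() -> tuple[bool, str]:
    """Try to import the stdlib sqlite3 module and report what happened."""
    try:
        sqlite3 = importlib.import_module("sqlite3")
    except ImportError as exc:
        # An "undefined symbol" failure in _sqlite3 surfaces here.
        return False, f"ImportError: {exc}"
    return True, f"SQLite library version {sqlite3.sqlite_version}"


def library_has_symbol(library: str, symbol: str) -> bool:
    """Return True if the shared library loads and exports the symbol."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # Library not found or not loadable.
        return False
    # ctypes raises AttributeError for missing symbols, so hasattr works.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    ok, detail = check_sqlite3_import()
    print("sqlite3 import ok:", ok, "|", detail)

    # May be None if the loader cannot find libsqlite3 by name.
    path = ctypes.util.find_library("sqlite3")
    print("libsqlite3 found by loader:", path)
    if path:
        print(
            "exports sqlite3_deserialize:",
            library_has_symbol(path, "sqlite3_deserialize"),
        )
```

If the import fails and the library the loader finds does not export `sqlite3_deserialize`, that would be consistent with the Python build in the image expecting a newer SQLite than the one it ends up loading, though this check alone does not prove it.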

Other GPU tasks complete without a problem, after I modified my nextflow.config file to use a dedicated GPU compute environment (see attached).

Deleting the 'process_GPU' label from each of these processes allows the pipeline to run to completion (though without GPU acceleration for these processes).
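The same effect could in principle be reached inside the Python templates with an import-guarded fallback instead of editing labels. This is a hypothetical sketch, not how the pipeline's templates work: `select_backend` is an invented helper, and it only helps if the failing import is the GPU-only one (`rapids_singlecell`), which the log in this report does not confirm. If `import scanpy` itself fails in the GPU container, a guard like this would not be reached.

```python
import importlib


def select_backend(use_gpu: bool, gpu_module: str = "rapids_singlecell") -> str:
    """Return "gpu" if requested and the GPU module imports cleanly, else "cpu"."""
    if not use_gpu:
        return "cpu"
    try:
        importlib.import_module(gpu_module)
    except ImportError as exc:
        # Only ImportError is handled; other GPU start-up errors still propagate.
        print(f"GPU backend unavailable, falling back to CPU: {exc}")
        return "cpu"
    return "gpu"


if __name__ == "__main__":
    backend = select_backend(use_gpu=True)
    print("selected backend:", backend)
```

A template could then branch on the returned string where it currently branches on `use_gpu`. Silently falling back hides the container problem, so printing the original error, as above, seems important.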

Command used and terminal output

Command (run from parent directory of local clone): nextflow run scdownstream/ -profile docker,gpu -config /tmp/nextflow.config --input scdownstream/assets/samplesheet.csv --outdir {private s3 bucket}

Terminal Output:

 N E X T F L O W   ~  version 24.04.4

Launching `scdownstream/main.nf` [friendly_shirley] DSL2 - revision: 456139e594

------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/scdownstream 0.0.1dev
------------------------------------------------------
Input/output options
  input                     : scdownstream/assets/samplesheet.csv
  outdir                    : s3://katlas/nf_scdownstream_outs/test_run

Institutional config options
  config_profile_description: AWSBATCH Cloud Profile
  config_profile_contact    : Alexander Peltzer (@apeltzer)
  config_profile_url        : https://aws.amazon.com/batch/

Core Nextflow options
  runName                   : friendly_shirley
  containerEngine           : docker
  launchDir                 : /home/robert/repos
  workDir                   : /ktmp/nextflow-work/scrnaseq_work
  projectDir                : /home/robert/repos/scdownstream
  userName                  : robert
  profile                   : docker,gpu
  configFiles               : 

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
* The nf-core framework
    https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
    https://github.com/nf-core/scdownstream/blob/master/CITATIONS.md

executor >  awsbatch (18)
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_READH5                                     -
[b1/44af75] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READRDS (SAMN14430801)                      [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READCSV                                     -
[12/b6da9a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_UNIFY (SAMN14430801)                        [100%] 4 of 4 ✔
[81/633761] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_UNFILTERED_SIZE (SAMN14430801)                [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:CELLBENDER_REMOVEBACKGROUND -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:ADATA_BARCODES              -
[3a/e9f636] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_FILTERED_SIZE (SAMN14430801)                  [100%] 2 of 2 ✔
[a8/06eb8a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_RAW (SAMN14430801)                             [100%] 2 of 2 ✔
[e2/1b9140] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:AMBIENT_RNA_REMOVAL:CELDA_DECONTX (SAMN14430801)  [100%] 2 of 2 ✔
[3c/f9170e] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_FILTER (SAMN14430801)                      [ 50%] 1 of 2
[00/31bc89] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_THRESHOLDED_SIZE (SAMN14430799)               [100%] 1 of 1
[29/823820] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)  [  0%] 0 of 1
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:DOUBLET_REMOVAL                 -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_DEDOUBLETED_SIZE                              -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_FILTERED                                       -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:COLLECT_SIZES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_MERGE                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_UPSETGENES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCANPY_HVGS                                -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCVITOOLS_SCVI                             -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_NEIGHBORS                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_UMAP                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_LEIDEN                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_PAGA                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_RANKGENESGROUPS                               -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_EXTEND                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_TORDS                                         -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:MULTIQC                                                      -
ERROR ~ Error executing process > 'NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)'

Caused by:
  Essential container in task exited

Command executed [/home/robert/repos/scdownstream/./workflows/../subworkflows/local/./../../modules/local/scanpy/scrublet/templates/scrublet.py]:

  #!/usr/bin/env python3

  import scanpy as sc
  import platform
  from threadpoolctl import threadpool_limits
  threadpool_limits(int("6"))
  sc.settings.n_jobs = int("6")

  def format_yaml_like(data: dict, indent: int = 0) -> str:
      """Formats a dictionary to a YAML-like string.

      Args:
          data (dict): The dictionary to format.
          indent (int): The current indentation level.

      Returns:
          str: A string formatted as YAML.
      """
      yaml_str = ""
      for key, value in data.items():
          spaces = "  " * indent
          if isinstance(value, dict):
              yaml_str += f"{spaces}{key}:\n{format_yaml_like(value, indent + 1)}"
          else:
              yaml_str += f"{spaces}{key}: {value}\n"
      return yaml_str

  adata = sc.read_h5ad("SAMN14430799_filtered.h5ad")
  prefix = "SAMN14430799_scrublet"
  use_gpu = "true" == "true"

  if use_gpu:
      import rapids_singlecell as rsc
      import rmm
      from rmm.allocators.cupy import rmm_cupy_allocator
      import cupy as cp

      rmm.reinitialize(
          managed_memory=True,
          pool_allocator=False,
      )
      cp.cuda.set_allocator(rmm_cupy_allocator)

      rsc.get.anndata_to_GPU(adata)
executor >  awsbatch (18)
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_READH5                                     -
[b1/44af75] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READRDS (SAMN14430801)                      [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READCSV                                     -
[12/b6da9a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_UNIFY (SAMN14430801)                        [100%] 4 of 4 ✔
[81/633761] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_UNFILTERED_SIZE (SAMN14430801)                [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:CELLBENDER_REMOVEBACKGROUND -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:ADATA_BARCODES              -
[3a/e9f636] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_FILTERED_SIZE (SAMN14430801)                  [100%] 2 of 2 ✔
[a8/06eb8a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_RAW (SAMN14430801)                             [100%] 2 of 2 ✔
[e2/1b9140] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:AMBIENT_RNA_REMOVAL:CELDA_DECONTX (SAMN14430801)  [100%] 2 of 2 ✔
[3c/f9170e] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_FILTER (SAMN14430801)                      [ 50%] 1 of 2
[00/31bc89] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_THRESHOLDED_SIZE (SAMN14430799)               [100%] 1 of 1
[29/823820] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)  [100%] 1 of 1, failed: 1
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:DOUBLET_REMOVAL                 -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_DEDOUBLETED_SIZE                              -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_FILTERED                                       -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:COLLECT_SIZES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_MERGE                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_UPSETGENES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCANPY_HVGS                                -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCVITOOLS_SCVI                             -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_NEIGHBORS                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_UMAP                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_LEIDEN                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_PAGA                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_RANKGENESGROUPS                               -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_EXTEND                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_TORDS                                         -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:MULTIQC                                                      -
Execution cancelled -- Finishing pending tasks before exit
ERROR ~ Error executing process > 'NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)'

Caused by:
  Essential container in task exited

Command executed [/home/robert/repos/scdownstream/./workflows/../subworkflows/local/./../../modules/local/scanpy/scrublet/templates/scrublet.py]:

  #!/usr/bin/env python3

  import scanpy as sc
  import platform
  from threadpoolctl import threadpool_limits
  threadpool_limits(int("6"))
  sc.settings.n_jobs = int("6")

  def format_yaml_like(data: dict, indent: int = 0) -> str:
      """Formats a dictionary to a YAML-like string.

      Args:
          data (dict): The dictionary to format.
          indent (int): The current indentation level.

      Returns:
          str: A string formatted as YAML.
      """
      yaml_str = ""
      for key, value in data.items():
          spaces = "  " * indent
          if isinstance(value, dict):
              yaml_str += f"{spaces}{key}:\n{format_yaml_like(value, indent + 1)}"
          else:
              yaml_str += f"{spaces}{key}: {value}\n"
      return yaml_str

  adata = sc.read_h5ad("SAMN14430799_filtered.h5ad")
  prefix = "SAMN14430799_scrublet"
  use_gpu = "true" == "true"

  if use_gpu:
      import rapids_singlecell as rsc
      import rmm
      from rmm.allocators.cupy import rmm_cupy_allocator
      import cupy as cp

      rmm.reinitialize(
          managed_memory=True,
          pool_allocator=False,
      )
      cp.cuda.set_allocator(rmm_cupy_allocator)

      rsc.get.anndata_to_GPU(adata)
      rsc.pp.scrublet(adata, batch_key="batch")
      rsc.get.anndata_to_CPU(adata)
  else:
executor >  awsbatch (18)
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_READH5                                     -
[b1/44af75] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READRDS (SAMN14430801)                      [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READCSV                                     -
[12/b6da9a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_UNIFY (SAMN14430801)                        [100%] 4 of 4 ✔
[81/633761] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_UNFILTERED_SIZE (SAMN14430801)                [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:CELLBENDER_REMOVEBACKGROUND -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:ADATA_BARCODES              -
[3a/e9f636] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_FILTERED_SIZE (SAMN14430801)                  [100%] 2 of 2 ✔
[a8/06eb8a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_RAW (SAMN14430801)                             [100%] 2 of 2 ✔
[e2/1b9140] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:AMBIENT_RNA_REMOVAL:CELDA_DECONTX (SAMN14430801)  [100%] 2 of 2 ✔
[3c/f9170e] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_FILTER (SAMN14430801)                      [ 50%] 1 of 2 ✔
[00/31bc89] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_THRESHOLDED_SIZE (SAMN14430799)               [100%] 1 of 1
[29/823820] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)  [100%] 1 of 1, failed: 1
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:DOUBLET_REMOVAL                 -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_DEDOUBLETED_SIZE                              -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_FILTERED                                       -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:COLLECT_SIZES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_MERGE                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_UPSETGENES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCANPY_HVGS                                -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCVITOOLS_SCVI                             -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_NEIGHBORS                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_UMAP                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_LEIDEN                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_PAGA                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_RANKGENESGROUPS                               -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_EXTEND                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_TORDS                                         -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:MULTIQC                                                      -
ERROR ~ Error executing process > 'NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)'

Caused by:
  Essential container in task exited

Command executed [/home/robert/repos/scdownstream/./workflows/../subworkflows/local/./../../modules/local/scanpy/scrublet/templates/scrublet.py]:

  #!/usr/bin/env python3

  import scanpy as sc
  import platform
  from threadpoolctl import threadpool_limits
  threadpool_limits(int("6"))
  sc.settings.n_jobs = int("6")

  def format_yaml_like(data: dict, indent: int = 0) -> str:
      """Formats a dictionary to a YAML-like string.

      Args:
          data (dict): The dictionary to format.
          indent (int): The current indentation level.

      Returns:
          str: A string formatted as YAML.
      """
      yaml_str = ""
      for key, value in data.items():
          spaces = "  " * indent
          if isinstance(value, dict):
              yaml_str += f"{spaces}{key}:\n{format_yaml_like(value, indent + 1)}"
          else:
              yaml_str += f"{spaces}{key}: {value}\n"
      return yaml_str

  adata = sc.read_h5ad("SAMN14430799_filtered.h5ad")
  prefix = "SAMN14430799_scrublet"
  use_gpu = "true" == "true"

  if use_gpu:
      import rapids_singlecell as rsc
      import rmm
      from rmm.allocators.cupy import rmm_cupy_allocator
      import cupy as cp

      rmm.reinitialize(
          managed_memory=True,
          pool_allocator=False,
      )
      cp.cuda.set_allocator(rmm_cupy_allocator)

      rsc.get.anndata_to_GPU(adata)
      rsc.pp.scrublet(adata, batch_key="batch")
      rsc.get.anndata_to_CPU(adata)
  else:
      sc.pp.scrublet(adata, batch_key="batch")
executor >  awsbatch (18)
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_READH5                                     -
[b1/44af75] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READRDS (SAMN14430801)                      [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READCSV                                     -
[12/b6da9a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_UNIFY (SAMN14430801)                        [100%] 4 of 4 ✔
[81/633761] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_UNFILTERED_SIZE (SAMN14430801)                [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:CELLBENDER_REMOVEBACKGROUND -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:ADATA_BARCODES              -
[3a/e9f636] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_FILTERED_SIZE (SAMN14430801)                  [100%] 2 of 2 ✔
[a8/06eb8a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_RAW (SAMN14430801)                             [100%] 2 of 2 ✔
[e2/1b9140] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:AMBIENT_RNA_REMOVAL:CELDA_DECONTX (SAMN14430801)  [100%] 2 of 2 ✔
[3c/f9170e] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_FILTER (SAMN14430801)                      [100%] 2 of 2 ✔
[00/31bc89] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_THRESHOLDED_SIZE (SAMN14430799)               [100%] 1 of 1
[29/823820] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)  [100%] 1 of 1, failed: 1
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:DOUBLET_REMOVAL                 -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_DEDOUBLETED_SIZE                              -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_FILTERED                                       -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:COLLECT_SIZES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_MERGE                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_UPSETGENES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCANPY_HVGS                                -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCVITOOLS_SCVI                             -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_NEIGHBORS                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_UMAP                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_LEIDEN                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_PAGA                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_RANKGENESGROUPS                               -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_EXTEND                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_TORDS                                         -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:MULTIQC                                                      -
executor >  awsbatch (18)
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_READH5                                     -
[b1/44af75] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READRDS (SAMN14430801)                      [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_READCSV                                     -
[12/b6da9a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:ADATA_UNIFY (SAMN14430801)                        [100%] 4 of 4 ✔
[81/633761] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_UNFILTERED_SIZE (SAMN14430801)                [100%] 2 of 2 ✔
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:CELLBENDER_REMOVEBACKGROUND -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:EMPTY_DROPLET_REMOVAL:ADATA_BARCODES              -
[3a/e9f636] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_FILTERED_SIZE (SAMN14430801)                  [100%] 2 of 2 ✔
[a8/06eb8a] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_RAW (SAMN14430801)                             [100%] 2 of 2 ✔
[e2/1b9140] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:AMBIENT_RNA_REMOVAL:CELDA_DECONTX (SAMN14430801)  [100%] 2 of 2 ✔
[3c/f9170e] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:SCANPY_FILTER (SAMN14430801)                      [100%] 2 of 2 ✔
[00/31bc89] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_THRESHOLDED_SIZE (SAMN14430799)               [100%] 1 of 1
[29/823820] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)  [100%] 1 of 1, failed: 1
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:DOUBLET_REMOVAL                 -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:GET_DEDOUBLETED_SIZE                              -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:QC_FILTERED                                       -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:COLLECT_SIZES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_MERGE                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:ADATA_UPSETGENES                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCANPY_HVGS                                -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:COMBINE:INTEGRATE:SCVITOOLS_SCVI                             -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_NEIGHBORS                                     -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_UMAP                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_LEIDEN                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_PAGA                                          -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:CLUSTER:SCANPY_RANKGENESGROUPS                               -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_EXTEND                                        -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:FINALIZE:ADATA_TORDS                                         -
[-        ] process > NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:MULTIQC                                                      -
-[nf-core/scdownstream] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET (SAMN14430799)'

Caused by:
  Essential container in task exited

Command executed [/home/robert/repos/scdownstream/./workflows/../subworkflows/local/./../../modules/local/scanpy/scrublet/templates/scrublet.py]:

  #!/usr/bin/env python3

  import scanpy as sc
  import platform
  from threadpoolctl import threadpool_limits
  threadpool_limits(int("6"))
  sc.settings.n_jobs = int("6")

  def format_yaml_like(data: dict, indent: int = 0) -> str:
      """Formats a dictionary to a YAML-like string.

      Args:
          data (dict): The dictionary to format.
          indent (int): The current indentation level.

      Returns:
          str: A string formatted as YAML.
      """
      yaml_str = ""
      for key, value in data.items():
          spaces = "  " * indent
          if isinstance(value, dict):
              yaml_str += f"{spaces}{key}:\n{format_yaml_like(value, indent + 1)}"
          else:
              yaml_str += f"{spaces}{key}: {value}\n"
      return yaml_str

  adata = sc.read_h5ad("SAMN14430799_filtered.h5ad")
  prefix = "SAMN14430799_scrublet"
  use_gpu = "true" == "true"

  if use_gpu:
      import rapids_singlecell as rsc
      import rmm
      from rmm.allocators.cupy import rmm_cupy_allocator
      import cupy as cp

      rmm.reinitialize(
          managed_memory=True,
          pool_allocator=False,
      )
      cp.cuda.set_allocator(rmm_cupy_allocator)

      rsc.get.anndata_to_GPU(adata)
      rsc.pp.scrublet(adata, batch_key="batch")
      rsc.get.anndata_to_CPU(adata)
  else:
      sc.pp.scrublet(adata, batch_key="batch")

  df = adata.obs[["predicted_doublet"]]
  df.columns = ["SAMN14430799_scrublet"]
  df.to_pickle("SAMN14430799_scrublet.pkl")

  adata = adata[~adata.obs["predicted_doublet"]].copy()

  adata.write_h5ad(f"{prefix}.h5ad")

  # Versions

  versions = {
      "NFCORE_SCDOWNSTREAM:SCDOWNSTREAM:PREPROCESS:DOUBLET_DETECTION:SCANPY_SCRUBLET": {
          "python": platform.python_version(),
          "scanpy": sc.__version__
      }
  }

  with open("versions.yml", "w") as f:
      f.write(format_yaml_like(versions))

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/opt/conda/bin/ipython", line 6, in <module>
      from IPython import start_ipython
    File "/opt/conda/lib/python3.11/site-packages/IPython/__init__.py", line 55, in <module>
      from .terminal.embed import embed
    File "/opt/conda/lib/python3.11/site-packages/IPython/terminal/embed.py", line 15, in <module>
      from IPython.core.interactiveshell import DummyMod, InteractiveShell
    File "/opt/conda/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 110, in <module>
      from IPython.core.history import HistoryManager
    File "/opt/conda/lib/python3.11/site-packages/IPython/core/history.py", line 10, in <module>
      import sqlite3
    File "/opt/conda/lib/python3.11/sqlite3/__init__.py", line 57, in <module>
      from sqlite3.dbapi2 import *
    File "/opt/conda/lib/python3.11/sqlite3/dbapi2.py", line 27, in <module>
      from _sqlite3 import *
  ImportError: /opt/conda/lib/python3.11/lib-dynload/_sqlite3.cpython-311-x86_64-linux-gnu.so: undefined symbol: sqlite3_deserialize

Work dir:
  s3://ktmp/nextflow-work/scrnaseq_work/29/82382049c57c69b77be4ff4f368fd6

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details
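
The traceback above ends at `from _sqlite3 import *`, so the failure can likely be reproduced without running the pipeline. A minimal sketch of such a check, assuming a Python shell is available inside the same GPU container image; the helper name `try_import` is hypothetical and not part of the pipeline:

```python
import importlib


def try_import(module_name: str) -> tuple[bool, str]:
    """Import a module and return (True, "ok") or (False, <ImportError message>)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # An "undefined symbol" failure in a C extension surfaces here as ImportError.
        return False, str(exc)
    return True, "ok"


if __name__ == "__main__":
    # _sqlite3 is the C extension named in the traceback; sqlite3 is its wrapper.
    for name in ("_sqlite3", "sqlite3"):
        ok, detail = try_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED'} ({detail})")

    ok, _ = try_import("sqlite3")
    if ok:
        import sqlite3

        # Version of the SQLite shared library that was actually loaded at runtime.
        print("linked SQLite library version:", sqlite3.sqlite_version)
```

If `_sqlite3` fails here with the same `undefined symbol: sqlite3_deserialize` message, that would point at the container image rather than the AWS Batch executor. One possible reading, not confirmed in this thread, is that the extension is being resolved against an SQLite shared library that does not export that symbol.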

Relevant files

nextflow_log_and_config.zip

System information

nictru commented 1 day ago

Thanks for the issue, I will look into it!