vib-singlecell-nf / vsn-pipelines

A repository of pipelines for single-cell data in Nextflow DSL2
GNU General Public License v3.0

SCENIC:MULTI_RUNS_TO_LOOM__TRACK:AUCELL crashes when no signatures passed the filter #68

Closed: ghuls closed this issue 4 years ago

ghuls commented 4 years ago

SCENIC:MULTI_RUNS_TO_LOOM__TRACK:AUCELL crashes when no signatures passed the filter.

Error executing process > 'single_sample_scenic:SCENIC_append:SCENIC:MULTI_RUNS_TO_LOOM__TRACK:AUCELL (1)'

Caused by:
  Process `single_sample_scenic:SCENIC_append:SCENIC:MULTI_RUNS_TO_LOOM__TRACK:AUCELL (1)` terminated with an error exit status (1)

Command executed:

  /software/SingleCellTxBenchmark/src/scenic/bin/aucell_from_folder.py         SCC__43bc79__Abdel_S4.filtered.loom         multi_runs_regulons_trk         -o "multi_runs_regulons_auc_trk.tsv"         --min-genes 5         --auc-threshold 0.05                  --min-regulon-gene-occurrence 5         --num-workers 16         --cell-id-attribute CellID         --gene-attribute Gene

Command exit status:
  1

Command output:
  Signatures passed filtering 0 out of 36

Command error:
  /opt/venv/lib/python3.6/site-packages/dask/config.py:161: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
    data = yaml.load(f.read()) or {}
  Traceback (most recent call last):
    File "/staging/leuven/stg_00002/lcb/ghuls/software/SingleCellTxBenchmark/src/scenic/bin/aucell_from_folder.py", line 102, in <module>
      aucs_mtx = aucell(ex_matrix_df, signatures, auc_threshold=auc_threshold, num_workers=args.num_workers)
    File "/opt/venv/lib/python3.6/site-packages/pyscenic/aucell.py", line 156, in aucell
      return aucell4r(create_rankings(exp_mtx), signatures, auc_threshold, noweights, normalize, num_workers)
    File "/opt/venv/lib/python3.6/site-packages/pyscenic/aucell.py", line 128, in aucell4r
      for idx, chunk in enumerate(chunked(signatures, chunk_size))]
    File "/opt/venv/lib/python3.6/site-packages/boltons/iterutils.py", line 197, in chunked
      return list(chunk_iter)
    File "/opt/venv/lib/python3.6/site-packages/boltons/iterutils.py", line 220, in chunked_iter
      raise ValueError('expected a positive integer chunk size')
  ValueError: expected a positive integer chunk size
dweemx commented 4 years ago

@ghuls, how did you envision solving this? I guess a more descriptive error message would be enough, no?

ghuls commented 4 years ago

@dweemx I think so. chunked gets 0 as its chunk size argument, which results in ValueError: expected a positive integer chunk size.

In [1]: from boltons.iterutils import chunked

In [2]: chunked([1,2,3,4], 0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-b6d898d3aeb5> in <module>()
----> 1 chunked([1,2,3,4], 0)

~/anaconda3/lib/python3.6/site-packages/boltons-18.0.1-py3.6.egg/boltons/iterutils.py in chunked(src, size, count, **kw)
    191     chunk_iter = chunked_iter(src, size, **kw)
    192     if count is None:
--> 193         return list(chunk_iter)
    194     else:
    195         return list(itertools.islice(chunk_iter, count))

~/anaconda3/lib/python3.6/site-packages/boltons-18.0.1-py3.6.egg/boltons/iterutils.py in chunked_iter(src, size, **kw)
    214     size = int(size)
    215     if size <= 0:
--> 216         raise ValueError('expected a positive integer chunk size')
    217     do_fill = True
    218     try:

ValueError: expected a positive integer chunk size

The relevant code in pySCENIC: https://github.com/aertslab/pySCENIC/blob/ef162504c021485892be857d6de28eb4f4cb0115/src/pyscenic/aucell.py#L124-L128
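
A minimal sketch of how the chunk size can end up as 0 when no signatures pass the filter. It assumes the chunk size is derived as ceil(number of signatures / number of workers), which is my reading of the linked lines and not verified here. chunk_size_for and the local chunked are hypothetical stand-ins for illustration, not the pySCENIC or boltons functions themselves.

```python
from math import ceil


def chunk_size_for(n_signatures, num_workers):
    # Assumption: per-worker chunk size is the ceiling of signatures / workers.
    return ceil(float(n_signatures) / num_workers)


def chunked(src, size):
    # Simplified stand-in for boltons.iterutils.chunked with the same size check.
    size = int(size)
    if size <= 0:
        raise ValueError('expected a positive integer chunk size')
    src = list(src)
    return [src[i:i + size] for i in range(0, len(src), size)]


# 36 signatures over 16 workers gives a usable chunk size.
print(chunk_size_for(36, 16))

# 0 signatures over 16 workers gives 0, which the size check rejects.
try:
    chunked([], chunk_size_for(0, 16))
except ValueError as err:
    print(err)
```

Under that assumption, any run where filtering leaves zero signatures hits the ValueError regardless of --num-workers, so the check has to happen before aucell is called.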

ghuls commented 4 years ago

I guess this would fix it:

$ git diff src/scenic/bin/aucell_from_folder.py
diff --git a/src/scenic/bin/aucell_from_folder.py b/src/scenic/bin/aucell_from_folder.py
index fb629a0..c5b7feb 100755
--- a/src/scenic/bin/aucell_from_folder.py
+++ b/src/scenic/bin/aucell_from_folder.py
@@ -55,7 +55,7 @@ parser_grn.add_argument(
     type=int,
     default=5,
     dest="min_regulon_gene_occurrence",
-    help='The threshold used for filtering the genes bases on their occurrence (default: {}).'.format(5)
+    help='The threshold used for filtering the genes based on their occurrence (default: {}).'.format(5)
 )
 parser_grn.add_argument(
     '--num-workers',
@@ -93,6 +93,11 @@ signatures = utils.read_signatures_from_tsv_dir(
     weight_threshold=args.min_regulon_gene_occurrence,
     min_genes=args.min_genes
 )
+
+if len(signatures) == 0:
+    print('Error: No signatures remain after filtering (--min-regulon-gene-occurrence={0:d}).'.format(args.min_regulon_gene_occurrence))
+    sys.exit(1)
+
 auc_threshold = args.auc_threshold

 if args.percentile_threshold is not None:

Not sure whether it is OK to write to stdout or whether we should use stderr (I prefer the latter).

dweemx commented 4 years ago

I was planning to add this as a fix:

if len(signatures) == 0:
    raise Exception("No signature passing filtering. Please consider to adapt min_genes_regulon and min_regulon_gene_occurrence (see params.sc.scenic.aucell). Make sure these settings are smaller than numRuns (params.sc.scenic).")

I don't know which is best: sys.exit or raising an Exception?
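
For comparison, a small sketch of the sys.exit variant that writes to stderr instead of stdout. Passing a string to sys.exit prints it to stderr and exits with status 1. The helper name require_signatures is hypothetical and only wraps the check for illustration; it is not part of aucell_from_folder.py.

```python
import sys


def require_signatures(signatures, min_regulon_gene_occurrence):
    """Return signatures unchanged, or exit with status 1 if none remain."""
    if len(signatures) == 0:
        # A string argument to sys.exit goes to stderr; the exit status is 1.
        sys.exit(
            'Error: No signatures remain after filtering '
            '(--min-regulon-gene-occurrence={0:d}).'.format(min_regulon_gene_occurrence)
        )
    return signatures


# Demonstration only: catch SystemExit so the message can be inspected
# without terminating the interpreter.
try:
    require_signatures([], 5)
except SystemExit as exc:
    print('would exit with:', exc.code)
```

An uncaught Exception also goes to stderr and yields a non-zero exit status, so Nextflow sees a failed process either way. The practical difference is that raising prints a full traceback, while sys.exit prints only the message.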

dweemx commented 4 years ago

Changes pushed in version v0.6.0.