------------------------------------------------------------------
Aggregating multiple .h5ad files to Reyfman_healthy-donor.SC__SCANPY__DIM_REDUCTION_UMAP.louvain_0.4.loom
(w/ additional compression)...
------------------------------------------------------------------
WARN: Killing pending tasks (1)
WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.
Error executing process > 'bbknn_scenic:BBKNN:BEC_BBKNN:CLUSTER_IDENTIFICATION:SC__SCANPY__PARAM_EXPLORE_MARKER_GENES (8)'
Caused by:
Process `bbknn_scenic:BBKNN:BEC_BBKNN:CLUSTER_IDENTIFICATION:SC__SCANPY__PARAM_EXPLORE_MARKER_GENES (8)` terminated with an error exit status (1)
Command executed:
/user/leuven/325/vsc32528/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/scanpy/bin/cluster/sc_marker_genes.py --method wilcoxon --groupby louvain --ngenes 0 Reyfman_healthy-donor.SC__SCANPY__DATA_TRANSFORMATION.h5ad Reyfman_healthy-donor.SC__SCANPY__PARAM_EXPLORE_CLUSTERING.h5ad "Reyfman_healthy-donor.SC__SCANPY__PARAM_EXPLORE_MARKER_GENES.louvain_8.0.h5ad"
Command exit status:
1
Command output:
The sc.tl.rank_genes_groups function expects log normalized data. Updating .raw slot by the .X slot from the AnnData generated by the normalize_transform step...
Ranking genes for characterizing groups (aka sc.tl.rank_genes_groups) using wilcoxon method...
Command error:
/opt/venv/lib/python3.7/site-packages/louvain/Optimiser.py:349: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert(issubclass(partition_type, LinearResolutionParameterVertexPartition),
Traceback (most recent call last):
File "/user/leuven/325/vsc32528/.nextflow/assets/vib-singlecell-nf/vsn-pipelines/src/scanpy/bin/cluster/sc_marker_genes.py", line 94, in <module>
n_genes=ngenes
File "/opt/venv/lib/python3.7/site-packages/scanpy/tools/_rank_genes_groups.py", line 366, in rank_genes_groups
means[imask], vars[imask] = _get_mean_var(X[mask]) #for fold-change
File "/opt/venv/lib/python3.7/site-packages/scanpy/preprocessing/_utils.py", line 18, in _get_mean_var
var = (mean_sq - mean**2) * (X.shape[0]/(X.shape[0]-1))
ZeroDivisionError: division by zero
Work dir:
/ddn1/vol1/staging/leuven/stg_00002/lcb/cflerin/analysis/GSE122960_Reyfman/donor/work/4d/0276b058064fe89ece7e61a3937e3b
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
This seems to be a problem internal to the Scanpy function, but it must be related to the input data somehow. The input expression matrix is 41994 cells and 1612 genes.
When I remove the 8.0 resolution from the parameters:
Another solution would be to do the DE with rank_gene_groups for each cluster oneself where you'd then have control and properly handle the case for singlet clusters. For now 415e70f is reasonable.
In the pipeline
v0.16.0
, using the following config:and the
bbknn
entry point, the error is:This seems to be a problem internal to the Scanpy function, but it must be related to the input data somehow. The input expression matrix is 41994 cells and 1612 genes.
When I remove the
8.0
resolution from the parameters:the pipeline completes without error.