ebi-gene-expression-group / scanpy-scripts

Scripts for using scanpy
Apache License 2.0

Using a single group in find marker genes fails #123

Open pcm32 opened 1 year ago

pcm32 commented 1 year ago

Calling:

scanpy-find-markers --save diffexp.tsv --n-genes '100' --groupby 'Original_sample_ID' --key-added 'markers_Original_sample_ID' --method 'wilcoxon' --use-raw  --groups 'MyGroupA' --reference 'MyRef' --filter-params 'min_in_group_fraction:0.0,max_out_group_fraction:1.0,min_fold_change:1.0'   --input-format 'anndata' proj.h5ad   --show-obj stdout --output-format anndata output.h5

fails with:

Traceback (most recent call last):
  File "/usr/local/bin/scanpy-find-markers", line 10, in <module>
    sys.exit(DIFFEXP_CMD())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/scanpy_scripts/cmd_utils.py", line 48, in cmd
    func(adata, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/scanpy_scripts/lib/_diffexp.py", line 54, in diffexp
    sc.tl.rank_genes_groups(
  File "/usr/local/lib/python3.9/site-packages/scanpy/tools/_rank_genes_groups.py", line 565, in rank_genes_groups
    raise ValueError('Specify a sequence of groups')
ValueError: Specify a sequence of groups

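This is probably an issue in the intermediate layer that tries to deal with both collections and single elements: scanpy raises this error when `groups` arrives as a plain string or integer rather than as a sequence, so a single group name is likely being passed through unwrapped. Calling it with more than one group works: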

scanpy-find-markers --save diffexp.tsv --n-genes '100' --groupby 'Original_sample_ID' --key-added 'markers_Original_sample_ID' --method 'wilcoxon' --use-raw  --groups 'MyGroupA' --reference 'MyRef' --filter-params 'min_in_group_fraction:0.0,max_out_group_fraction:1.0,min_fold_change:1.0'   --input-format 'anndata' proj.h5ad   --show-obj stdout --output-format anndata output.h5

but this means a longer compute time.
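
A possible fix, sketched under assumptions: wrap a lone group name in a list before it reaches `sc.tl.rank_genes_groups`. The helper name `normalize_groups` is hypothetical and is not part of scanpy-scripts; where exactly the CLI layer would call it is also an assumption, since I have not traced how `--groups` is parsed.

```python
def normalize_groups(groups):
    """Hypothetical helper: make `groups` safe to pass to rank_genes_groups.

    scanpy accepts the literal string 'all' or a sequence of group names;
    any other bare string or integer triggers
    ValueError('Specify a sequence of groups').
    """
    if groups is None:
        # Assumption: treat "no groups given" as scanpy's default, 'all'.
        return "all"
    if isinstance(groups, str):
        # Keep the special value 'all'; wrap any other single name in a list.
        return "all" if groups == "all" else [groups]
    if isinstance(groups, int):
        return [groups]
    # Tuples, lists and other iterables become a plain list.
    return list(groups)


if __name__ == "__main__":
    print(normalize_groups("MyGroupA"))
    print(normalize_groups(("MyGroupA", "MyGroupB")))
```

With this in place, `--groups 'MyGroupA'` would reach scanpy as `groups=['MyGroupA']`, which should get past the check at `_rank_genes_groups.py` line 565 without having to add extra groups.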