scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

min_in_group_fraction in filter_rank_genes_groups doesn't work as expected #1495

Open nhyda opened 3 years ago

nhyda commented 3 years ago

The default value of min_in_group_fraction is 0.25, which I understand to mean that genes expressed in fewer than 25 percent of the cells in the group are filtered out.

I set min_in_group_fraction=0 and max_out_group_fraction=1.01 to effectively disable the fraction filters, so that genes are filtered only by fold change and adjusted p-value. However, this doesn't return any additional genes compared to min_in_group_fraction=0.25.

In my test, if I filter the rank_genes_groups results by fold change and adjusted p-value, I get 87 genes back. But if I use min_in_group_fraction=0 in filter_rank_genes_groups, I only get 25 back.

I noticed issue https://github.com/theislab/scanpy/issues/863, which mentions that rank_genes_groups and filter_rank_genes_groups calculate fold change differently. I was wondering:

  1. Why is that the case?
  2. That difference alone doesn't seem to explain the large gap between the numbers of genes returned by the two filtering methods.

rpeys commented 3 years ago

It would be helpful to see the lines of code you ran in each of the two cases you describe. For example, when you say "if I use min_in_group_fraction=0 in filter_rank_genes_groups, I only get 25 back": in this case, did you use the default fold change argument?
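
To make the question about the fold change argument concrete, here is a minimal pure-Python sketch of how a combined filter can behave. It is not scanpy's implementation: the `GeneStats` record, both helper functions, the thresholds, and the toy values are all hypothetical and only illustrate the logic. The assumption is that a gene must pass the in-group fraction, out-group fraction, and fold change thresholds together, so relaxing only the two fraction thresholds leaves the fold change threshold in force.

```python
from dataclasses import dataclass


@dataclass
class GeneStats:
    """Hypothetical per-gene summary for one group (not a scanpy type)."""
    name: str
    log2_fold_change: float
    pval_adj: float
    frac_in_group: float   # fraction of cells in the group expressing the gene
    frac_out_group: float  # fraction of cells outside the group expressing it


def manual_filter(genes, min_log2_fc, max_pval_adj):
    """Keep genes by fold change and adjusted p-value only."""
    return [
        g.name
        for g in genes
        if g.log2_fold_change >= min_log2_fc and g.pval_adj <= max_pval_adj
    ]


def fraction_and_fold_change_filter(
    genes, min_in_group_fraction, max_out_group_fraction, min_log2_fc
):
    """Keep genes that pass all three thresholds at once (assumed logic)."""
    return [
        g.name
        for g in genes
        if g.frac_in_group >= min_in_group_fraction
        and g.frac_out_group <= max_out_group_fraction
        and g.log2_fold_change >= min_log2_fc
    ]


# Toy values chosen for illustration only.
toy_genes = [
    GeneStats("geneA", 3.0, 0.001, 0.80, 0.10),
    GeneStats("geneB", 0.8, 0.010, 0.10, 0.05),
    GeneStats("geneC", 0.6, 0.020, 0.60, 0.70),
    GeneStats("geneD", 2.5, 0.200, 0.90, 0.20),
]

# Filtering by hand with a low fold change cutoff and an adjusted p-value cutoff.
manual = manual_filter(toy_genes, min_log2_fc=0.5, max_pval_adj=0.05)

# Combined filter with strict fraction thresholds.
strict = fraction_and_fold_change_filter(toy_genes, 0.25, 0.5, min_log2_fc=1.0)

# Combined filter with the fraction thresholds relaxed; fold change cutoff unchanged.
relaxed = fraction_and_fold_change_filter(toy_genes, 0.0, 1.01, min_log2_fc=1.0)

print(manual)   # ['geneA', 'geneB', 'geneC']
print(strict)   # ['geneA', 'geneD']
print(relaxed)  # ['geneA', 'geneD']
```

In this toy setup, relaxing the fraction thresholds adds no genes because geneB and geneC are still excluded by the unchanged fold change cutoff, and the two approaches disagree on geneD because only the manual filter looks at the adjusted p-value. Whether the same mechanism explains the 87 versus 25 genes reported above depends on the exact calls and arguments used.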