Irrationone / cellassign

Automated, probabilistic assignment of cell types in scRNA-seq data

All cells from simulated matrix are assigned to one group with probability 1 #96

Open agalvezm opened 1 year ago

agalvezm commented 1 year ago

Hello,

Thanks so much for the very useful tool! I am having some issues using the scvi implementation of CellAssign and I wanted to share my results with you.

I am trying to reproduce the benchmarking of CellAssign using data simulated with the adapted version of Splatter that you mention in your paper. I used the following parameters for the simulation:

de.facLoc = 0.1
de.facScale = 0.1
ct_prob = even across all groups (1 / number of groups)
de_prob = 0.1
Number of genes: 10,000

I performed simulations for all possible combinations of the following number of cells and groups:

Number of cells: 1000, 2000, 4000, 8000 and 10,000
Number of groups: 2, 4, 6, 8
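
For reference, the parameter grid described above can be enumerated as follows. This is only a sketch in plain Python: the dictionary keys (`n_cells`, `de_fac_loc`, and so on) are hypothetical names chosen for illustration and are not the argument names of Splatter or of the adapted simulator.

```python
from itertools import product

# Values quoted in the post above.
N_CELLS = [1000, 2000, 4000, 8000, 10_000]
N_GROUPS = [2, 4, 6, 8]


def simulation_grid(n_cells=N_CELLS, n_groups=N_GROUPS):
    """Return one settings dict per (cells, groups) combination."""
    grid = []
    for cells, groups in product(n_cells, n_groups):
        grid.append({
            "n_cells": cells,
            "n_groups": groups,
            "n_genes": 10_000,
            "de_prob": 0.1,
            "de_fac_loc": 0.1,
            "de_fac_scale": 0.1,
            # Even group probabilities: 1 / number of groups.
            "ct_prob": [1.0 / groups] * groups,
        })
    return grid


grid = simulation_grid()
print(len(grid))  # 5 cell counts x 4 group counts
```

The grid has one entry per simulated matrix, which matches the 20 matrices mentioned later in the post.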

I followed one of the methods you describe in your paper to select marker genes, namely: markers for CellAssign were selected from genes in the top 20th percentile in terms of log fold change among differentially upregulated genes and the top 10th percentile in terms of expression.
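
A minimal sketch of that selection rule for a single group, in plain Python. Several details are assumptions made for illustration and are not confirmed by the paper or this post: "upregulated" is taken to mean a positive log fold change, the expression percentile is computed over all genes, and percentiles use the nearest-rank method. The function and argument names are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def select_markers(lfc, mean_expr, lfc_pct=80, expr_pct=90):
    """Pick marker genes for one group.

    lfc:       gene -> log fold change of this group vs. the rest
    mean_expr: gene -> mean expression of the gene
    """
    # Assumption: "upregulated" means a positive log fold change.
    up = {gene: value for gene, value in lfc.items() if value > 0}
    if not up:
        return []
    # Top 20th percentile of log fold change among upregulated genes.
    lfc_cut = percentile(list(up.values()), lfc_pct)
    # Assumption: the expression percentile is taken over all genes.
    expr_cut = percentile(list(mean_expr.values()), expr_pct)
    return sorted(
        gene for gene, value in up.items()
        if value >= lfc_cut and mean_expr[gene] >= expr_cut
    )


# Toy data: 10 genes, g5..g9 upregulated, g9 weakly expressed.
lfc = {f"g{i}": 0.1 * (i - 4) for i in range(10)}
mean_expr = {f"g{i}": float(i + 1) for i in range(10)}
mean_expr["g9"] = 0.5
print(select_markers(lfc, mean_expr))
```

In the toy data, g9 has the largest fold change but is dropped by the expression filter, which shows that the two filters are applied jointly.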

The gene marker matrix therefore contains ~20 marker genes per group.

When I run CellAssign on any of the 20 simulated matrices, I always get the same result: all cells are assigned to one of the groups with a probability of 1. I am linking a Google Colab notebook that:

1) Downloads the matrix with 1000 cells and 2 groups, and the gene marker matrix
2) Runs CellAssign using the same commands that you show in your tutorial
3) Inspects the simulated matrix to confirm that its format and content are normal

The inspections in step 3 include:

We have tried a number of things to fix the problem. These include:

Nothing seems to modify the behaviour.
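
The symptom described above (every cell in one group, each with probability 1) can be quantified with a small check like the one below. It is a sketch only: it assumes the predicted probabilities have already been exported as a plain list of rows (one row per cell, one column per group), and `summarize_assignments` is a hypothetical helper, not part of CellAssign or scvi-tools.

```python
from collections import Counter


def summarize_assignments(probs, groups, saturated=0.999):
    """Summarize hard assignments from a cells x groups probability table."""
    labels = []
    for row in probs:
        # Index of the most probable group for this cell.
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(groups[best])
    counts = Counter(labels)
    # Cells whose top probability is essentially 1.
    n_saturated = sum(1 for row in probs if max(row) >= saturated)
    return {
        "counts": dict(counts),
        "collapsed": len(counts) == 1,
        "frac_saturated": n_saturated / len(probs),
    }


# Toy example reproducing the reported behaviour for 2 groups.
probs = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
print(summarize_assignments(probs, ["Group1", "Group2"]))
```

Comparing `counts` against the simulated ground-truth group sizes makes it clear whether the model collapsed onto one group or merely mislabelled some cells.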

Any help on this would be greatly appreciated. Thanks so much!!