MarioniLab / miloR

R package implementation of Milo for testing for differential abundance in KNN graphs
https://bioconductor.org/packages/release/bioc/html/miloR.html
GNU General Public License v3.0

makeNhoods missing cell ids #312

Closed: flde closed 3 months ago

flde commented 3 months ago

Dear all,

many thanks for developing and maintaining miloR!

When running makeNhoods, the cell id x hood id matrix has fewer rows than the colData matrix. In other words, some cell ids are missing. I did a grid search over k, and for lower k I lost more cell ids. Is that a consequence of cells being singlets in the neighborhood graph?

In some cases that behavior leads to samples with few cells having zero counts in colSums(nhoodCounts(x)). Hence, those samples cannot be normalized with edgeR and need to be filtered out. Usually that is no problem, but I am working with time course data of transplant patients after chemotherapy and want to keep samples with very low cell counts, because that is a biologically meaningful signature and losing those samples due to zero counts reduces the power of the analysis.

I found a workaround, but it would be great to understand why some cell ids cannot be allocated to neighborhoods. I only expected that a cell id could be counted in multiple neighborhoods, not that it could be absent from all of them.
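A minimal sketch of how such missing cells could be identified, in base R on toy data. The helper names `find_unassigned_cells` and `unassigned_per_sample` are hypothetical and not part of miloR. With a real Milo object `x`, the inputs would presumably be `nhoods(x)` for the membership matrix, `colnames(x)` for the full set of cell ids, and a sample column taken from `colData(x)`; that mapping is an assumption and has not been checked against the data in this issue.

```r
# Hypothetical helper (not part of miloR): given a cell-by-neighbourhood
# membership matrix, return the ids of cells that belong to no neighbourhood.
# A cell counts as unassigned if its row is absent or contains only zeros.
find_unassigned_cells <- function(membership, all_cells) {
  absent <- setdiff(all_cells, rownames(membership))
  empty <- rownames(membership)[rowSums(membership != 0) == 0]
  union(absent, empty)
}

# Hypothetical helper: number of unassigned cells per sample.
# cell_sample is a named character vector mapping cell id -> sample id.
unassigned_per_sample <- function(unassigned, cell_sample) {
  table(factor(cell_sample[unassigned], levels = unique(cell_sample)))
}

# Toy data: 5 cells, 2 neighbourhoods. Cell c4 has an all-zero row and
# cell c5 has no row at all.
membership <- matrix(
  c(1, 0,
    1, 1,
    0, 1,
    0, 0),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("c1", "c2", "c3", "c4"), c("n1", "n2"))
)
all_cells <- c("c1", "c2", "c3", "c4", "c5")
cell_sample <- c(c1 = "A", c2 = "A", c3 = "B", c4 = "C", c5 = "C")

unassigned <- find_unassigned_cells(membership, all_cells)
per_sample <- unassigned_per_sample(unassigned, cell_sample)
print(unassigned)
print(per_sample)
```

A sample whose cells all end up in the unassigned set (sample C in the toy data) is the kind of sample that would show a zero column sum in the neighbourhood count matrix.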

Many thanks and kind regards,

Florian

MikeDMorgan commented 3 months ago

Hi @flde, could you provide some example code so we can get an idea of what parameter values you are providing to Milo? For instance, what k and p are you using?

flde commented 3 months ago

Hello @MikeDMorgan, many thanks for your help. I used k between 10 and 100 in 10 intervals and p=0.1. I ran the analysis with the mouse gastrulation data and could not reproduce the error. I then went back to my original data and found a bug. As is so often the case, the problem sits in front of the computer.