MarioniLab / miloR

R package implementation of Milo for testing for differential abundance in KNN graphs
https://bioconductor.org/packages/release/bioc/html/miloR.html
GNU General Public License v3.0

findNhoodGroupMarkers: overlapping nhoods in fake.meta #296

Closed by gianfilippo 7 months ago

gianfilippo commented 7 months ago

Hi,

I was looking at your function, findNhoodGroupMarkers, and noticed the following loop where you assemble the fake.meta data frame:

    for (i in seq_along(nhood.gr)) {
        nhood.x <- which(nhs.da.gr == nhood.gr[i])
        nhs <- nhs[rowSums(nhs) > 0, ]
        nhood.gr.cells <- rowSums(nhs[, nhood.x, drop = FALSE]) > 0
        fake.meta[nhood.gr.cells, "Nhood.Group"] <- ifelse(is.na(fake.meta[nhood.gr.cells, "Nhood.Group"]), nhood.gr[i], NA)
    }

Here, on every iteration, the "Nhood.Group" value for cells that belong to overlapping nhoods gets overwritten, so the final "Nhood.Group" depends on the ordering of nhood.gr, everything else being fixed.
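To make the order dependence concrete, here is a minimal sketch on a toy example. The membership matrix, the group labels, and the helper name `assign_groups` are all made up for illustration; the helper only mirrors the assignment logic of the loop quoted above on a plain vector and is not the miloR implementation itself.

```r
# Toy membership matrix: rows are cells, columns are nhoods (hypothetical data).
nhs <- matrix(c(1, 0, 0,
                1, 1, 0,
                1, 1, 1,
                0, 0, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("c", 1:4), paste0("n", 1:3)))
nhs.da.gr <- c(1L, 2L, 3L)  # group label of each nhood (hypothetical)

# Hypothetical helper mirroring the quoted loop, using a named vector
# in place of the fake.meta data frame.
assign_groups <- function(nhs, nhs.da.gr, nhood.gr) {
  nhs <- nhs[rowSums(nhs) > 0, , drop = FALSE]
  grp <- setNames(rep(NA_integer_, nrow(nhs)), rownames(nhs))
  for (i in seq_along(nhood.gr)) {
    nhood.x <- which(nhs.da.gr == nhood.gr[i])
    nhood.gr.cells <- rowSums(nhs[, nhood.x, drop = FALSE]) > 0
    # Unassigned cells take the current group; already-assigned cells are reset to NA.
    grp[nhood.gr.cells] <- ifelse(is.na(grp[nhood.gr.cells]), nhood.gr[i], NA)
  }
  grp
}

fwd <- assign_groups(nhs, nhs.da.gr, c(1L, 2L, 3L))
bwd <- assign_groups(nhs, nhs.da.gr, c(3L, 2L, 1L))
print(rbind(forward = fwd, reverse = bwd))
```

In this toy case, cell c3 (a member of all three groups) ends up in group 3 with the forward ordering and in group 1 with the reverse ordering, while c2 (a member of two groups) is NA either way and c1 and c4 are unaffected.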

Is this not going to affect the results?

Thanks

MikeDMorgan commented 7 months ago

Hi @gianfilippo, this is a necessary evil, as cells can be members of multiple groups and nhoods. The GLM used for DGE cannot include the same cell multiple times, as that would violate the independence assumption. Even with multiple cells contributing to the mean expression of several nhoods, this would artificially lower the variance of the mean. In practice, overwriting the nhood group assignment for cells likely has little impact, though we have not tested this rigorously.
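As a rough illustration of the variance point, the sketch below compares the standard error of a mean computed from a set of values with the one computed after every value is counted twice. The expression values and the helper `se` are made up for illustration, and this is a plain standard-error calculation, not the GLM that miloR fits.

```r
# Made-up expression values, one per cell (hypothetical data).
x <- c(2.1, 3.4, 1.8, 4.0, 2.9)

# Standard error of the mean, assuming independent observations.
se <- function(v) sd(v) / sqrt(length(v))

se_once  <- se(x)        # each cell counted once
se_twice <- se(c(x, x))  # each cell counted twice, as if it sat in two groups
print(c(once = se_once, twice = se_twice))
```

Duplicating the cells adds no new information, yet the reported standard error shrinks, which is the kind of artificially lowered variance that counting a cell in several groups would produce.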

gianfilippo commented 7 months ago

Thanks!