AnjaliC4 closed 2 years ago
Huh, I totally get why this is confusing / seems arbitrary. There is really no "magic" here; rather, this is a hacky workaround for an issue I encountered during development:
Aggregation might use different summary statistics (say, sum, mean, or median) and different assay data (say, counts or expression-like values). Meanwhile, edgeR's filterByExpr() is designed for count-like data. So the > 100 check is meant to answer "Do these look like counts?" (well, sums of single-cell counts, really). Before this was in place, filterByExpr() would remove everything when aggregateData() had been called with, for example, the mean of logcounts. Hope that makes sense!
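To make the heuristic concrete, here is a minimal base-R sketch on simulated data. It is only an illustration and not the package's actual code: the toy matrices, the helper name looks_like_counts(), and its threshold argument are made up for this example. Only the max(...) > 100 comparison comes from the check discussed here.

```r
# Toy data (an assumption for illustration): 5 genes x 200 cells of simulated counts
set.seed(1)
counts <- matrix(rpois(5 * 200, lambda = 2), nrow = 5)
logcounts <- log2(counts + 1)

# Two ways a pseudobulk column might have been aggregated
pb_sum  <- rowSums(counts)      # sum of counts: count-like values
pb_mean <- rowMeans(logcounts)  # mean of logcounts: expression-like values

# Hypothetical helper mirroring the `max(assay(y, k)) > 100` comparison
looks_like_counts <- function(x, threshold = 100) max(x) > threshold

looks_like_counts(pb_sum)   # TRUE: summed counts easily exceed 100
looks_like_counts(pb_mean)  # FALSE: mean logcounts stay far below 100
```

Note that the same comparison also returns FALSE for genuinely summed counts whose maximum happens to be 100 or less, which is the small-cluster case raised in the question.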
Hi, I have a simple question. In the pbDS function, line 156: if (filter %in% c("genes", "both") & max(assay(y, k)) > 100). Can you please explain the purpose of checking that the max count is > 100 before filtering genes with filterByExpr? I'm curious because I didn't find this criterion in the edgeR/limma manual. I am sure you have set this for a good reason; I would just like to know your reasoning, because for clusters where the counts are all less than 100, filterByExpr is never applied.
Thanks.