MarioniLab / MammaryGland

Calculating quality control metrics #8

Open LTLA opened 7 years ago

LTLA commented 7 years ago

In QCAnalysis.R, just use the calculateQCMetrics function in scater.

LTLA commented 7 years ago

You should use the plotHighestExprs function in scater to plot the highest-expressing genes.

LTLA commented 7 years ago

The concept of a left MAD is unusual. In messy cases with lots of low-quality cells, the left tail will be heavier after log-transformation, so the left MAD will be larger than the standard MAD, which makes it less stringent at removing left-side outliers. You're also missing the rescaling constant (1.4826) that makes the MAD a consistent estimator of the standard deviation for normally distributed data. Without it, the "4-MAD" threshold is not comparable to a threshold of 4 standard deviations, which changes how its stringency should be interpreted; see the documentation for mad in R for details.
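
To make the two points above concrete, here is a rough sketch in plain Python (not the code in QCAnalysis.R). The `left_mad` definition is my assumption of what the script does, i.e. the median deviation of only those values at or below the median. The toy log-library sizes and the helper names are made up for illustration.

```python
from statistics import median

# Same default scaling constant as R's mad(x, constant = 1.4826).
MAD_CONSTANT = 1.4826


def mad(values, constant=MAD_CONSTANT):
    """Standard MAD: scaled median of absolute deviations from the median."""
    centre = median(values)
    return constant * median([abs(v - centre) for v in values])


def left_mad(values, constant=MAD_CONSTANT):
    """Assumed 'left MAD': uses only deviations of values at or below the median."""
    centre = median(values)
    return constant * median([centre - v for v in values if v <= centre])


def low_outliers(values, spread, nmads=4):
    """Values more than nmads * spread below the median."""
    cutoff = median(values) - nmads * spread
    return [v for v in values if v < cutoff]


# Hypothetical log-library sizes with a heavy left tail of low-quality cells.
log_sizes = [2.0, 3.0, 4.5, 5.8, 5.9, 6.0, 6.1, 6.2, 6.3]

standard = mad(log_sizes)               # scaled, as R's mad() would report
unscaled = mad(log_sizes, constant=1)   # what you get without 1.4826
left = left_mad(log_sizes)

flagged_standard = low_outliers(log_sizes, standard)
flagged_unscaled = low_outliers(log_sizes, unscaled)
flagged_left = low_outliers(log_sizes, left)
```

On this toy data the left MAD is several times larger than the standard MAD, so the left-MAD cutoff falls below every cell and nothing is removed, whereas the standard MAD flags the two worst cells. Dropping the constant moves the cutoff up and flags a third cell, so "4 MADs" means something different with and without the rescaling.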

LTLA commented 7 years ago

Filtering on gene abundances has no context. A threshold of at least 10 cells could be very relaxed or very stringent, depending on the total number of cells. The same applies to 50*isexpThreshold, the reasoning for which is not intuitive to me. Better to use thresholds based on the proportion of cells expressing each gene and on row means.
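
As a rough sketch of what I mean, in plain Python rather than the R in QCAnalysis.R: the function name, the toy counts and the cutoffs (`min_prop`, `min_mean`) are placeholders I made up, and requiring both conditions at once is just one possible choice.

```python
def keep_genes(counts, min_prop=0.2, min_mean=0.1):
    """Return genes passing both a proportion-expressing and a mean-count filter.

    counts maps a gene name to its list of counts, one entry per cell.
    Both cutoffs are illustrative placeholders, not recommended values.
    """
    kept = []
    for gene, row in counts.items():
        n_cells = len(row)
        prop_expressing = sum(c > 0 for c in row) / n_cells  # fraction of cells with a nonzero count
        mean_count = sum(row) / n_cells                      # row mean across all cells
        if prop_expressing >= min_prop and mean_count >= min_mean:
            kept.append(gene)
    return kept


# Hypothetical counts for three genes across ten cells.
toy_counts = {
    "geneA": [0, 0, 0, 0, 0, 0, 0, 0, 0, 5],   # expressed in 10% of cells
    "geneB": [3, 0, 2, 4, 0, 1, 6, 2, 0, 3],   # expressed in 70% of cells
    "geneC": [0] * 10,                          # never expressed
}

strict = keep_genes(toy_counts, min_prop=0.2, min_mean=0.1)
relaxed = keep_genes(toy_counts, min_prop=0.05, min_mean=0.1)

# An absolute "at least 10 cells" rule means very different things at different scales.
prop_small = 10 / 20       # half of a 20-cell data set
prop_large = 10 / 10000    # 0.1% of a 10,000-cell data set
```

Because the cutoffs are expressed relative to the number of cells, the same values carry the same meaning whether the data set has 20 cells or 10,000, which is not true of a fixed count of 10 cells.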