HelenaLC / CATALYST

Cytometry dATa anALYsis Tools

Normalize based on the 99.99th percentile #330

Closed cr2106 closed 1 year ago

cr2106 commented 1 year ago

Hi!

After arcsinh normalization of flow data using prepData, is it possible to normalize based on the 99.99th percentile to have everything between 0 and 1?

Thank you very much for your work!

HelenaLC commented 1 year ago

Jup, should be easy using base R commands... Luckily, there's a little helper in CATALYST that is used for some heatmap visualizations, which you could use via CATALYST:::.scale_exprs (here, x is a matrix of expression values, margin specifies whether to scale rows (1) or columns (2), and q defines the quantiles used as lower and upper bounds, i.e., q and 1 - q). You'd then assign the output of this to a new assay, e.g., assay(sce, "scaled") <- CATALYST:::.scale_exprs(assay(sce, "exprs"), ...).

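# CATALYST:::.scale_exprs
# (rowQuantiles() and colQuantiles() come from the matrixStats package)
function (x, margin = 1, q = 0.01) {
    if (!is(x, "matrix")) 
        x <- as.matrix(x)
    # pick the quantile function for rows (margin = 1) or columns (margin = 2)
    qs <- c(rowQuantiles, colQuantiles)[[margin]]
    # lower (q) and upper (1 - q) quantile for each row or column
    qs <- qs(x, probs = c(q, 1 - q))
    qs <- matrix(qs, ncol = 2)
    # rescale so the lower quantile maps to 0 and the upper quantile to 1
    x <- switch(margin, `1` = (x - qs[, 1])/(qs[, 2] - qs[, 1]), 
        `2` = t((t(x) - qs[, 1])/(qs[, 2] - qs[, 1])))
    # clamp values outside [0, 1]; NA/NaN (e.g., 0/0 when both quantiles coincide) become 0
    x[x < 0 | is.na(x)] <- 0
    x[x > 1] <- 1
    return(x)
}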
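
For the 99.99th percentile asked about above, the helper would presumably be called with q = 0.0001. Since it uses q and 1 - q symmetrically, the 0.01th percentile then becomes the lower bound. The following is a minimal base R sketch of the same idea with separate lower and upper bounds, so it runs without CATALYST or matrixStats. The function name scale_by_quantile, its lower/upper arguments, and the toy marker matrix are hypothetical and only for illustration; they are not part of CATALYST.

```r
# Hypothetical helper (not part of CATALYST): scale each row (margin = 1) or
# column (margin = 2) to [0, 1] between a lower and an upper quantile.
scale_by_quantile <- function(x, margin = 1, lower = 0, upper = 0.9999) {
    x <- as.matrix(x)
    # 2 x n matrix: first row holds lower, second row upper quantiles
    qs <- apply(x, margin, quantile, probs = c(lower, upper), na.rm = TRUE)
    lo <- qs[1, ]
    hi <- qs[2, ]
    # subtract the lower bound and divide by the range along the chosen margin
    y <- sweep(x, margin, lo, "-")
    y <- sweep(y, margin, hi - lo, "/")
    # same clamping as in .scale_exprs
    y[y < 0 | is.na(y)] <- 0
    y[y > 1] <- 1
    y
}

set.seed(42)
# toy data (assumption): 3 markers (rows) x 10,000 cells (columns)
exprs <- matrix(abs(rnorm(3 * 1e4, mean = 2)), nrow = 3,
    dimnames = list(c("CD3", "CD4", "CD8"), NULL))

scaled <- scale_by_quantile(exprs, margin = 1, lower = 0, upper = 0.9999)
range(scaled)
```

With a SingleCellExperiment, the result could be stored as suggested above, e.g., assay(sce, "scaled") <- scale_by_quantile(assay(sce, "exprs")). Rows are scaled here (margin = 1) on the assumption that markers are rows and cells are columns.
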
Ivy-ops commented 9 months ago

Hi @HelenaLC, thanks for the nice package, which is pretty helpful to the community. I searched the issues related to .scale_exprs() and found this post, so I am asking here. Is there any specific reason for the last two lines of code, which convert values greater than 1 into 1, and values that are less than 0 or NA into 0?

x[x < 0 | is.na(x)] <- 0
x[x > 1] <- 1

Thank you!

HelenaLC commented 8 months ago

I'd say that

SamGG commented 8 months ago

I fully agree. Said differently, we want the color range to reflect the intensity range of each marker, and accept ignoring the lowest and highest percentiles (q being 1% by default). Values outside the [0, 1] range can be ignored by thresholding the data, as done in .scale_exprs(). Alternatively, pheatmap allows assigning limits to the color scale (ComplexHeatmap may also provide this), which results in the same display. Best.
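
The equivalence between thresholding the data and limiting the color scale can be illustrated with a small base R sketch. It does not call pheatmap or ComplexHeatmap, and how those packages treat values outside their color limits should be checked in their documentation. The values, palette and number of colors below are made up for illustration.

```r
# scaled values, some of which fall outside [0, 1]
x <- c(-0.3, 0, 0.25, 0.5, 0.99, 1, 1.7)

n_col <- 10
pal <- colorRampPalette(c("blue", "white", "red"))(n_col)
# color limits fixed at 0 and 1, one more break than colors
breaks <- seq(0, 1, length.out = n_col + 1)

# option 1: threshold the data first, then map values to colors
x_clamped <- pmin(pmax(x, 0), 1)
idx_clamped <- findInterval(x_clamped, breaks, all.inside = TRUE)

# option 2: keep the data as is and let the color scale saturate at its limits
# (all.inside = TRUE assigns out-of-range values to the first/last bin)
idx_limited <- findInterval(x, breaks, all.inside = TRUE)

# both options select the same color for every value
identical(pal[idx_clamped], pal[idx_limited])
```
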