HenrikBengtsson / matrixStats

R package: Methods that Apply to Rows and Columns of Matrices (and to Vectors)
https://cran.r-project.org/package=matrixStats

PERFORMANCE: Use sort.int() instead of generic sort() #155

Closed HenrikBengtsson closed 5 years ago

HenrikBengtsson commented 5 years ago

There's probably room for improvement by replacing the generic sort() with sort.int() in a few places:

$ grep -F "sort(" R/*.R
R/binMeans.R:#' \code{rev(binMeans(-x, bx = sort(-bx), right = FALSE))}, but is faster.
R/rowQuantiles.R:      partial <- sort(unique(c(idxs_lo, idxs_hi)))
R/rowQuantiles.R:      partial <- sort(unique(c(idxs_lo, idxs_hi)))
R/rowTabulates.R:      values <- sort(values)
R/rowTabulates.R:      values <- sort(values, na.last = TRUE)
R/rowTabulates.R:      values <- sort(values)
R/rowTabulates.R:      values <- sort(values, na.last = TRUE)

$ grep -E "=[ ]*sort" R/*.R
R/binMeans.R:#' \code{rev(binMeans(-x, bx = sort(-bx), right = FALSE))}, but is faster.
R/rowQuantiles.R:      xp <- apply(x, MARGIN = 1L, FUN = sort, partial = partial)
R/rowQuantiles.R:      xp <- apply(x, MARGIN = 2L, FUN = sort, partial = partial)

$ grep -F "sort.int(" R/*.R
R/binCounts.R:  x <- sort.int(x, method = "quick")
R/binMeans.R:  x <- sort.int(x, method = "quick", index.return = TRUE)
R/varDiff.R:    x <- sort.int(x, partial = partial)
R/varDiff.R:    x <- sort.int(x, partial = partial)
R/varDiff.R:    x <- sort.int(x, partial = partial)
R/varDiff.R:    x <- sort.int(x, partial = partial)
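
For context, here is a minimal sketch of why the swap might help. It assumes a plain, unclassed numeric vector with no names, in which case the generic sort() ends up doing the same work as sort.int() after S3 dispatch. The vector size and repetition count are arbitrary illustration values, not taken from the package, and the measured difference will depend on the machine and R version.

```r
# Sketch: generic sort() vs. sort.int() on a plain numeric vector.
# Assumption: x has no class attribute, so both calls give the same result.
set.seed(1)
x <- rnorm(20)

res_generic <- sort(x)      # goes through S3 dispatch first
res_int     <- sort.int(x)  # called directly, no dispatch
same_result <- identical(res_generic, res_int)

# Rough timing over many small sorts, which is where per-call
# overhead is most likely to be noticeable (n_reps is arbitrary).
n_reps <- 10000L
t_generic <- system.time(for (i in seq_len(n_reps)) sort(x))[["elapsed"]]
t_int     <- system.time(for (i in seq_len(n_reps)) sort.int(x))[["elapsed"]]
cat("same result:", same_result, "\n")
cat("sort():    ", t_generic, "s\n")
cat("sort.int():", t_int, "s\n")
```

Any saving is per call, so it should matter most where sorting happens once per row or column, as in the apply() calls above.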

Related

This could explain Issue #153.

HenrikBengtsson commented 5 years ago

UPDATE: col/rowQuantiles() now use sort.int(), which was made possible by the restriction in Issue #156. This leaves:

$ grep -F "sort(" R/*.R
R/binMeans.R:#' \code{rev(binMeans(-x, bx = sort(-bx), right = FALSE))}, but is faster.
R/rowTabulates.R:      values <- sort(values)
R/rowTabulates.R:      values <- sort(values, na.last = TRUE)
R/rowTabulates.R:      values <- sort(values)
R/rowTabulates.R:      values <- sort(values, na.last = TRUE)

$ grep -E "=[ ]*sort" R/*.R
R/binMeans.R:#' \code{rev(binMeans(-x, bx = sort(-bx), right = FALSE))}, but is faster.
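
As an illustration of the row-wise partial sort that previously went through the generic, the following sketch passes sort.int() to apply() directly. The matrix and the `partial` indices are made-up example values, not the ones computed in R/rowQuantiles.R, and the input is assumed to contain no missing values.

```r
# Sketch: row-wise partial sorting with sort.int() instead of sort().
# Assumption: x is a plain numeric matrix without NAs.
set.seed(1)
x <- matrix(rnorm(5 * 8), nrow = 5, ncol = 8)

# Hypothetical positions that must end up in their sorted place.
partial <- c(2L, 7L)

# apply() returns one column per row of x, so xp is 8 x 5.
xp <- apply(x, MARGIN = 1L, FUN = sort.int, partial = partial)

# Reference: fully sorted rows, used only to check the partial result.
full <- apply(x, MARGIN = 1L, FUN = sort.int)

# Only the requested positions are guaranteed to match a full sort.
partial_ok <- all(xp[partial, ] == full[partial, ])
print(partial_ok)
```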

HenrikBengtsson commented 5 years ago

Now using sort.int() instead of generic sort() everywhere.
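
For the rowTabulates.R calls listed earlier, sort.int() accepts the same na.last argument as sort(), so the replacement can be a drop-in one. A small sketch with made-up values (not taken from the package):

```r
# Sketch: na.last behaves the same in sort.int() as in sort().
values <- c(3L, NA, 1L, 2L, NA)

dropped <- sort.int(values)                  # default na.last = NA removes NAs
kept    <- sort.int(values, na.last = TRUE)  # NAs are placed at the end

# Cross-check against the generic for this plain integer vector.
same_as_generic <- identical(kept, sort(values, na.last = TRUE))
print(dropped)
print(kept)
```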