:exclamation: This is a read-only mirror of the CRAN R package repository. bestNormalize — Normalizing Transformation Functions. Homepage: https://petersonr.github.io/bestNormalize/, https://github.com/petersonR/bestNormalize
Replicate pearson.test results returned by bestNormalize #2
I find it difficult to understand how the function bestNormalize applies pearson.test to the transformations and chooses the best normalized vector.
Please consider the following code:
` # create data
set.seed(123)
library(bestNormalize)
library(nortest)
x <- rgamma(100, 1, 1)
# apply bestNormalize
res <- bestNormalize(x)
# chosen transformation is box cox
res$chosen_transform
#> Standardized Box Cox Transformation with 100 nonmissing obs.:
#> Estimated statistics:
#> sd (before standardization) = 1.00397
res$norm_stats["boxcox"]
#>   boxcox
#> 1.002667
# fail to replicate pearson.test results
p <- pearson.test(res$x.t)
p$statistic/p$df
#>    P
#> 0.66
# create own norm_stat_fn (equal to that used by bestNormalize)
my_norm_stat_fn <- function(x) {
  p <- nortest::pearson.test(x)
  unname(p$stat/p$df)
}
# apply bestNormalize
res <- bestNormalize(x, norm_stat_fn = my_norm_stat_fn)
# chosen transformation is Yeo-Johnson
res$chosen_transform
#> Standardized Yeo-Johnson Transformation with 100 nonmissing obs.:
#> Estimated statistics:
#> sd (before standardization) = 0.2235039
res$norm_stats["yeojohnson"]
#> yeojohnson
#>  0.9466667
# fail to replicate my_norm_stat_fn results
p <- my_norm_stat_fn(res$x.t); p
#> [1] 1.024
# create own norm_stat_fn (different from that used by bestNormalize)
my_norm_stat_fn <- function(x) {
  p <- nortest::pearson.test(x)
  unname(p$stat)
}
# apply bestNormalize
res <- bestNormalize(x, norm_stat_fn = my_norm_stat_fn)
# chosen transformation is sqrt_x
res$chosen_transform
#> Standardized sqrt(x + a) Transformation with 100 nonmissing obs.:
#> Relevant statistics:
#> sd (before standardization) = 0.4479005
res$norm_stats["sqrt_x"]
#> sqrt_x
#>  3.344
# fail to replicate my_norm_stat_fn results
p <- my_norm_stat_fn(res$x.t); p
#> [1] 9.46
# create own norm_stat_fn: assign same value to all transformations
my_norm_stat_fn <- function(x) {
  p <- 2
  return(p)
}
# apply bestNormalize
res <- bestNormalize(x, norm_stat_fn = my_norm_stat_fn)
# chosen transformation is asinh
res$chosen_transform
#> Standardized asinh(x) Transformation with 100 nonmissing obs.:
#> Relevant statistics:
#> sd (before standardization) = 0.5410448
# replicate my_norm_stat_fn results
p <- my_norm_stat_fn(res$x.t); p
#> [1] 2
# observe that custom norm_stat_fn works only if the name of its argument is "x"
my_norm_stat_fn <- function(m) {
  p <- nortest::pearson.test(m)
  unname(p$stat)
}
# apply bestNormalize
res <- bestNormalize(x, norm_stat_fn = my_norm_stat_fn)
#> Error in (function (m) : unused argument (x = c(-0.742466592984817, -0.563796587066655, -1.31311589796953, -0.602562318313787, 3.15457570509193, -0.419595958433174, -0.153341073734975, -0.680628813365879, -0.465967563440651, 1.69922890785271)) `
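
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the package documentation describes `bestNormalize()` as estimating the normality statistic out-of-sample by default (`out_of_sample = TRUE`, repeated cross-validation with `k = 10` folds and `r = 5` repeats). If so, the reported `norm_stats` would be averages over held-out folds rather than the statistic computed on the full transformed vector. The error message above, in which the function receives only 10 values through a named `x` argument, is consistent with that reading. The base R sketch below illustrates the idea for the sqrt transformation only. `pearson_ratio()` and `oos_ratio()` are hypothetical helpers written for this illustration (the first is assumed to mirror the defaults of `nortest::pearson.test`); neither is a function from either package.

```r
# Pearson chi-square statistic divided by its degrees of freedom.
# Assumption: this mirrors the defaults of nortest::pearson.test().
pearson_ratio <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  n_classes <- ceiling(2 * n^(2 / 5))            # class count grows with n
  bin <- floor(1 + n_classes * pnorm(x, mean(x), sd(x)))
  observed <- tabulate(bin, n_classes)
  expected <- n / n_classes                      # equiprobable classes
  P <- sum((observed - expected)^2 / expected)
  P / (n_classes - 3)                            # df = n_classes - 1 - 2
}

# Hypothetical helper: repeated k-fold estimate for a standardized sqrt
# transformation. Each fold is transformed with the training mean and sd,
# and the statistic is computed on the held-out values only.
oos_ratio <- function(x, k = 10, r = 5, stat_fn = pearson_ratio) {
  reps <- replicate(r, {
    fold <- sample(rep_len(seq_len(k), length(x)))  # random fold labels
    mean(vapply(seq_len(k), function(i) {
      train <- sqrt(x[fold != i])
      test <- (sqrt(x[fold == i]) - mean(train)) / sd(train)
      stat_fn(x = test)                             # named argument `x`
    }, numeric(1)))
  })
  mean(reps)
}

set.seed(123)
x <- rgamma(100, 1, 1)

# In-sample: statistic on all 100 transformed values.
in_sample <- pearson_ratio(as.numeric(scale(sqrt(x))))
# Out-of-sample: average of statistics computed on folds of 10 values.
out_of_sample <- oos_ratio(x)
print(c(in_sample = in_sample, out_of_sample = out_of_sample))

# A statistic function whose argument is not named `x` fails when it is
# called as stat_fn(x = ...), as in the error reported above.
bad <- tryCatch(oos_ratio(x, stat_fn = function(m) 1), error = function(e) e)
print(conditionMessage(bad))
```

If this is indeed the cause, calling `bestNormalize(x, out_of_sample = FALSE)` should report in-sample statistics that can be compared directly with `pearson.test()` on the transformed vector. Naming the argument of a custom function `x`, or wrapping it as `function(x) my_fn(x)`, avoids the argument-name error.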