Closed RubD closed 4 years ago
Dear RubD, this function is meant to do something. You call this function and ask it not to do what it meant to do and then you say it does not work.
Ok, I see your point, but I should have explained the problem better. What if this function is part of a larger function? Where the user perhaps does not wish to scale and center their matrix, but wants to execute the rest of the analysis?
Right now, this function gives unusual errors and can even have dangerous behaviour as I will explain below:
problem 1:
x = matrix(rnorm(100), 10, 10)
test1 = standardise(x, F, F)
This will result in an error that most people will not expect:
Error in standardise(x, F, F) : object 'y' not found
problem 2:
y = matrix(rnorm(100, mean = 10), 10, 10)
# unrelated matrix
test2 = standardise(x, F, F)
identical(y, test2)
This is potentially a bigger problem, because now you don't get an error but your function has silently replaced the outcome with a previously declared y object.
So by just making a small modification (at not cost) there wouldn't be any problems to begin with, however my intention was just to help and I'm not going to argue any further, so you can just close this again.
Thanks for your time, Ruben
In that case, you do not want to standardise the data. So, if the argeuments are "F" and "F" just put an "if" condition right before calling our function. What we offer is bare bone functions. We cannot add inside every case or every request each person has. Do not take this offensively, and if you think I am offensive I apologize. Truth be told though, that we focus on speed, hence we expect the user to have some knowledge of what he wants. We squeeze the orange a give its orange juice, we do not peel off the orange. Again, no offence.
Michail
Michael's standardise does exactly what R's scale. You can check it on your own even for the case you suggest us.
I just want to quickly comment, because this statement is not correct or at least has the caveats that I mentioned before. You can put an if statement before standardise like statlink suggested, but you can not simply replace base::scale with Rfast::standardise, for exactly the problems I have demonstrated above and below (R-3.6.3 and Rfast 1.9.9):
library(Rfast)
x = matrix(rnorm(100), 10, 10)
test1 = standardise(x, F, F) # object 'y' not found error
identical(x, test1) # does not work, because test1 is not found
test1sc = scale(x, F, F)
identical(x, test1sc) # TRUE, scale returns original matrix x as expected
y = matrix(rnorm(100, mean = 10), 10, 10) # unrelated matrix
test2 = standardise(x, F, F)
identical(y, test2) # TRUE, and this is potentially dangerous if people are not aware of this issue
test2sc = scale(x, F, F)
identical(y, test2sc) # FALSE
identical(x, test2sc) # TRUE
Bottomline, this is some quirky behavior and this function can not simply replace base::scale without providing additional if-statements. In this case a simple additional else statement at the end of your code would fix these issues without any added cost in terms of time or efficiency. However, as I said before, this was just a mere suggestion and if you don't agree then I'm also fine with that.
Keep up the great work, Ruben
Ruben we cannot check for all the possibilities. For example I never check if an argument is NULL. If we remove these checks, even these small checks, the functions can be faster because we are in an interpreted environment.
fixed the situation where both scale and center are FALSE