bc2zb closed 8 years ago
The scaleColumns argument rescales the distribution of each listed parameter to have a mean of zero and a standard deviation of one. This scaling happens after any transformation that you may have specified (e.g. the arcsinh transformation).
This feature was added to deal with circumstances where certain parameters/channels have substantially different dynamic ranges, and those with larger absolute values tend to overly influence clustering. By scaling all parameters/channels to have the same mean and standard deviation, you reduce the chance that clustering is dominated by one channel. However, I've not done any systematic evaluation of the general effect of scaling channels, and there could be some unanticipated or undesirable effects associated with it.
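To make the order of operations concrete, here is a rough sketch in Python with NumPy of what "transform, then scale the chosen columns" means. It is not the package's actual code: the function names arcsinh_transform and scale_columns, the cofactor of 5, the simulated data, and the use of the sample standard deviation (ddof=1) are all assumptions made for illustration.

```python
import numpy as np


def arcsinh_transform(x, cofactor=5.0):
    # Hypothetical helper: arcsinh transform with an assumed cofactor of 5.
    return np.arcsinh(x / cofactor)


def scale_columns(data, columns):
    # Hypothetical helper: z-score only the listed column indices,
    # leaving every other column untouched.
    scaled = np.asarray(data, dtype=float).copy()
    for j in columns:
        col = scaled[:, j]
        sd = col.std(ddof=1)  # sample standard deviation (an assumption)
        if sd > 0:  # skip constant columns to avoid dividing by zero
            scaled[:, j] = (col - col.mean()) / sd
    return scaled


# Two simulated channels with very different dynamic ranges.
rng = np.random.default_rng(0)
raw = np.column_stack([
    rng.gamma(2.0, 5.0, size=1000),    # low-intensity channel
    rng.gamma(2.0, 500.0, size=1000),  # high-intensity channel
])

# Transformation first, scaling second.
transformed = arcsinh_transform(raw)
scaled = scale_columns(transformed, columns=[0, 1])

print("column means after scaling:", scaled.mean(axis=0))
print("column SDs after scaling:  ", scaled.std(axis=0, ddof=1))
```

After scaling, both channels contribute on the same footing to any distance-based clustering, which is why results with and without scaled columns can differ.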
I tried looking at the code and cannot seem to find what the scaleColumns argument is doing. I ask because I ran my experiment with and without a list of columns to scale with and the results change.