If a data set is very large (say, more than about 10,000 points), the program
should not go through every point to find the mean and, especially, the
standard deviation. Instead, it should sample at most 10,000 random points from
the data set and compute these statistics from the sample.
The mean could still be computed over all points, since it can be accumulated in
the same pass as the min and max, but the standard deviation (used for the
default variance normalization) greatly slows down the start of an SOM
calculation on large data sets.
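
A rough sketch of the idea, in Python for illustration only. The function name `estimate_std`, the `MAX_SAMPLE` constant, and the `seed` parameter are hypothetical and do not come from the program's code; the 10,000 cap is just the figure suggested above.

```python
import random
import statistics

# Hypothetical cap on the number of points used for the estimate.
MAX_SAMPLE = 10_000


def estimate_std(values, max_sample=MAX_SAMPLE, seed=None):
    """Estimate the standard deviation of `values` (a list of numbers).

    Small data sets are processed in full; larger ones are reduced to
    `max_sample` points drawn at random without replacement.
    """
    if len(values) <= max_sample:
        # Small enough: use every point (population standard deviation).
        return statistics.pstdev(values)

    rng = random.Random(seed)
    sample = rng.sample(values, max_sample)
    # Sample standard deviation (n - 1 denominator) as the estimate.
    return statistics.stdev(sample)
```

With 10,000 sampled points the estimate should normally land within a percent or two of the full-data value for well-behaved data, which is probably close enough for variance normalization. Heavy-tailed data may need a larger sample.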
Original issue reported on code.google.com by kyle.tha...@gmail.com on 3 Jun 2011 at 3:19