After running segtools-signal-distribution (v1.1.2) on a Segway segmentation
on Linux, I noticed that in the file signal_distribution.stats.tab the
calculated standard deviation depended very strongly on the mean through a
non-linear function.
Using the file segtools_distribution.tab, I found that the calculated mean was
correct, but the standard deviations were incorrect.
I tracked this down to line 352 in signal_distribution.py, where I believe that
edges[:1] is used (which always evaluates to 0) instead of edges[:-1] (as in
the mean calculation).
Also, this formula (when corrected) appears to calculate the sample variance
instead of the sample standard deviation (which requires the square root).
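To illustrate why the slice matters, here is a minimal sketch, not the actual code from signal_distribution.py. The function name, the toy histogram, and the use of left bin edges as bin values are assumptions made for illustration only. With `edges[:1]` the single first edge (0 here) is broadcast across every bin, so the "variance" collapses to roughly the squared mean, which would explain a non-linear dependence on the mean.

```python
import numpy as np

def histogram_mean_sd(counts, edges):
    """Mean and sample standard deviation from histogram counts.

    Hypothetical helper: each bin is represented by its left edge,
    mirroring the edges[:-1] usage described for the mean calculation.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)
    lefts = edges[:-1]              # one value per bin
    n = counts.sum()
    mean = (counts * lefts).sum() / n
    variance = (counts * (lefts - mean) ** 2).sum() / (n - 1)
    return mean, np.sqrt(variance)  # square root gives the standard deviation

# Toy data (made up): three bins with edges 0, 1, 2, 3
edges = np.array([0.0, 1.0, 2.0, 3.0])
counts = np.array([1, 2, 1])
mean, sd = histogram_mean_sd(counts, edges)

# Suspected bug: edges[:1] is a one-element array holding only the first
# edge, so it broadcasts against counts and every bin is treated as 0.
n = counts.sum()
buggy_variance = (counts * (edges[:1] - mean) ** 2).sum() / (n - 1)
# buggy_variance equals n * mean**2 / (n - 1): a function of the mean alone
```

In this toy case the corrected variance is 2/3, while the `edges[:1]` version gives 4/3, which is exactly `n * mean**2 / (n - 1)`.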
As an additional comment, I have previously found it useful to transform the signal data using a logarithmic or asinh function, so that the distribution is more normal. Could this be included as an option, as it might alter the clustering of the states?
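As a rough sketch of the kind of transform I mean (the array values are made up, and this is not an existing segtools option), asinh behaves like a logarithm for large values but is defined at zero and for negative signal:

```python
import numpy as np

# Hypothetical raw signal values spanning several orders of magnitude
signal = np.array([0.0, 1.0, 10.0, 100.0, 1000.0])

# asinh(x) = log(x + sqrt(x**2 + 1)); approximately log(2x) for large x
transformed = np.arcsinh(signal)

# The transform is invertible, so the original scale can be recovered
recovered = np.sinh(transformed)
```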
Original issue reported on code.google.com by stevenpwilder@gmail.com on 12 Jan 2011 at 4:19