Closed peterTorrione closed 11 years ago
This is bad. We need a solution.
On Oct 27, 2012, at 2:47 PM, Peter Torrione notifications@github.com wrote:
I have two data sets of equal size. I created one by applying 3 pre-processing stages to an original data set, and the other directly with prtDataSetClass.
I run:
tic; rt(prtPreProcZmuv + prtPreProcEnergyNormalizeRows,dsSynthetic); toc;
On the data set I created directly, I get:
Elapsed time is 0.544173 seconds.
When I run the same code on the data I made with pre-processing, I get:
Elapsed time is 2.422874 seconds.
I'm doing this in a loop, and this time is killing me.
All of the time is being eaten up in
function self = modifyNonDataAttributesFrom(self, action)
in prtDataSetStandard, specifically line 509:
self.featureNameModificationFunction = @(nameIn, index)modFun(self.featureNameModificationFunction(nameIn, index),index);
My hypothesis is that this will keep getting slower as you chain more and more blocks together.
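To make the hypothesis concrete, here is a minimal standalone sketch of the wrapping pattern from line 509. It is not PRT code: nStages, prevFun, and the 'stage%d' modifier are made up for illustration. Each pass through the loop wraps the previous handle in a new anonymous function, so resolving one name has to call through every earlier layer.

```matlab
% Standalone sketch (not PRT code): every stage wraps the previous handle,
% mirroring the pattern on line 509 of prtDataSetStandard.
nameFun = @(nameIn, index) nameIn;   % identity: no stages applied yet
nStages = 6;                         % hypothetical number of chained blocks

for iStage = 1:nStages
    prevFun = nameFun;               % captured by the new handle below
    % Hypothetical modifier; real PRT actions supply their own modFun.
    modFun = @(nameIn, index) sprintf('stage%d(%s)', iStage, nameIn);
    % Same shape as line 509: the new handle calls the old one first.
    nameFun = @(nameIn, index) modFun(prevFun(nameIn, index), index);
end

name = nameFun('x', 1);              % walks through all six wrappers
disp(name)
```

In the real expression the anonymous function refers to self, so as far as I understand MATLAB's capture rules it holds a copy of the whole data set object (including the previous handle) rather than just the previous handle. That would make each added block more expensive than this sketch suggests.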
Here's an example, it's not as pronounced as above, but I think it illustrates the point:
ds = prtDataGenUnimodal;
algoPre = prtPreProcZmuv + prtPreProcHistEq + prtPreProcPca + prtPreProcLda + prtPreProcZmuv + prtPreProcHistEq;
dsPre = rt(algoPre,ds);
classAlgo = prtPreProcZmuv + prtClassLibSvm;
% Cross-validate on the data set that went through six pre-processing stages
tic; yOut = kfolds(classAlgo,dsPre,10); toc;
% Cross-validate on the original data set for comparison
tic; yOut = kfolds(classAlgo,ds,10); toc;
I pushed a fix this morning. It seems to help. For the above example, execution time for the upper tic/toc is cut in half using the new method. Everything should be backwards compatible, so there is no need to change the dataset version number.
Kenny
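The thread does not say what the pushed fix changed. Purely as an illustration of one way to avoid nested handles, and not a description of that fix, the modifiers could be kept in a flat list and applied in order. The names modFuns, nStages, and the 'stage%d' modifier are hypothetical.

```matlab
% Hypothetical alternative (not the fix that was pushed, and not PRT code):
% keep the name modifiers in a flat cell array instead of nesting handles.
modFuns = {};                        % hypothetical list of name modifiers
nStages = 6;                         % hypothetical number of chained blocks

for iStage = 1:nStages
    % Appending a stage stores one small handle; nothing earlier is wrapped.
    modFuns{end+1} = @(nameIn, index) sprintf('stage%d(%s)', iStage, nameIn);
end

% Resolve a feature name by applying the modifiers in the order they were added.
name = 'x';
index = 1;
for iFun = 1:numel(modFuns)
    name = modFuns{iFun}(name, index);
end
disp(name)
```

With this layout no handle captures an earlier handle or the data set itself, so adding a stage does not depend on how many stages came before.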
This seems fixed, but see the new issue.