covartech / PRT

Pattern Recognition Toolbox for MATLAB
http://covartech.github.io/
MIT License

Slowdown in new feature name calculations. #11

Closed peterTorrione closed 11 years ago

peterTorrione commented 11 years ago

I have two data sets of equal size. I made one by applying three pre-processing stages to an original data set, and the other directly with prtDataSetClass.

I run:

tic; rt(prtPreProcZmuv + prtPreProcEnergyNormalizeRows,dsSynthetic); toc;

on the data set I just made directly, and I get:

Elapsed time is 0.544173 seconds.

When I run the same code on the data I made with pre-processing, I get:

Elapsed time is 2.422874 seconds.

I'm doing this in a loop, and this time is killing me.

All of the time is being eaten up in

function self = modifyNonDataAttributesFrom(self, action)

in prtDataSetStandard, specifically line 509:

self.featureNameModificationFunction = @(nameIn, index)modFun(self.featureNameModificationFunction(nameIn, index),index);

My hypothesis is that this keeps getting slower as you chain more and more blocks together, since each new function handle wraps the previous one.
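A minimal sketch of that hypothesis, assuming nothing about PRT beyond the line quoted above: each new anonymous function captures the previous handle by value, so a single call has to walk the whole chain. The names `f`, `prev`, and the `'_m'` suffix are hypothetical stand-ins, not PRT code. Note that the real line also captures `self`, which this sketch does not model.

```matlab
% Hypothetical stand-in for a chain of feature-name modification functions.
depths = [1 10 100];                 % number of wrapped handles to test
elapsed = zeros(size(depths));
for d = 1:numel(depths)
    f = @(nameIn, index) nameIn;     % start from an identity function
    for k = 1:depths(d)
        prev = f;                    % captured by value in the new handle
        f = @(nameIn, index) [prev(nameIn, index) '_m'];
    end
    tic;
    for rep = 1:1000
        out = f('x', 1);             % one call traverses the whole chain
    end
    elapsed(d) = toc;
end
disp([depths(:) elapsed(:)]);        % depth in column 1, seconds in column 2
```

If the hypothesis holds, the second column should grow with the first; the exact numbers depend on the machine and MATLAB version.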

newfolder commented 11 years ago

This is bad. We need a solution.

peterTorrione commented 11 years ago

Here's an example. The slowdown is not as pronounced as above, but I think it illustrates the point:

ds = prtDataGenUnimodal;

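% Build a data set that has passed through six pre-processing stages.
algoPre = prtPreProcZmuv + prtPreProcHistEq + prtPreProcPca + prtPreProcLda + prtPreProcZmuv + prtPreProcHistEq;
dsPre = rt(algoPre,ds);

classAlgo = prtPreProcZmuv + prtClassLibSvm;

% Upper timing: cross-validate on the pre-processed data set.
tic; yOut = kfolds(classAlgo,dsPre,10); toc;

% Lower timing: cross-validate on the original data set.
tic; yOut = kfolds(classAlgo,ds,10); toc;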

kennethmorton commented 11 years ago

I pushed a fix this morning, and it seems to help. For the above example, execution time for the upper tic/toc is cut in half using the new method. Everything should be backwards compatible, so there is no need to change the dataset version number.
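The thread does not describe how the fix works. As an illustration only, one common way to avoid ever-deeper nested handles is to keep the modifiers in a flat cell array and apply them in order. Everything below (`modFuns`, the `'_m'` suffix) is a hypothetical sketch and not PRT's actual implementation.

```matlab
% Hypothetical sketch; these names are illustrative and are not PRT's API.
modFuns = {};                                 % ordered list of name modifiers
for k = 1:100
    % Appending a handle does not wrap or capture the earlier ones.
    modFuns{end+1} = @(nameIn, index) [nameIn '_m'];
end

% Apply the modifiers in order with a plain loop instead of nested calls.
name = 'x';
for k = 1:numel(modFuns)
    name = modFuns{k}(name, 1);
end
disp(numel(name));                            % length of the final name
```

With this layout, adding a stage costs one cell append, and the call depth stays constant no matter how many stages are chained.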

Kenny

peterTorrione commented 11 years ago

This seems fixed, but see the new issue.