Original author: dany.Geo...@gmail.com (August 10, 2011 09:24:54)
Using the YourKit profiler, I found that a noticeable amount of computation time is spent when offering samples to the statistical processor (I was iterating over millions of samples; see the attached screenshot).
Looking into the calls, part of that time is spent invoking the generic NumberOperations.calculate method when checking range containment (since it needs to instantiate a new number, get a higher class, get the class type from the map, ...).
Since most of the computations on ranges deal with doubles (the ranges provided to abstractProcessor are explicitly Range<Double>), could we define some "specific" handling that deals with doubles directly, avoiding any conversion, cast, class check, new instance creation, ...?
I implemented a workaround on our 1.2-classified branch to deal with doubles.
Although it isn't the best approach/solution, it shows a big performance improvement, which may justify some investigation on the topic.
On some sample data I was processing, the stats computations finished after 17-18 seconds with the old code. With the new code, the same tests finish in 10 seconds.
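To illustrate the kind of double-specific handling meant here, a minimal sketch follows. It is not the code from the 1.2-classified branch and not the jaitools Range API: the class name DoubleRange and its constructor arguments are hypothetical, chosen only to show a containment check that works on primitive doubles without boxing, class lookups, or new instances.

```java
/**
 * Hypothetical double-only range (illustration only, not the jaitools Range<T> class).
 * Bounds are stored as primitives so that containment checks allocate nothing.
 */
final class DoubleRange {
    private final double min;
    private final double max;
    private final boolean minIncluded;
    private final boolean maxIncluded;

    DoubleRange(double min, boolean minIncluded, double max, boolean maxIncluded) {
        if (Double.isNaN(min) || Double.isNaN(max) || min > max) {
            throw new IllegalArgumentException("invalid bounds: " + min + ", " + max);
        }
        this.min = min;
        this.minIncluded = minIncluded;
        this.max = max;
        this.maxIncluded = maxIncluded;
    }

    /** Returns true if value lies within the range; NaN is never contained. */
    boolean contains(double value) {
        if (Double.isNaN(value)) {
            return false;
        }
        // Plain primitive comparisons: no Number conversion, cast, or class lookup.
        boolean aboveMin = minIncluded ? value >= min : value > min;
        boolean belowMax = maxIncluded ? value <= max : value < max;
        return aboveMin && belowMax;
    }
}
```

With something like this, a call such as new DoubleRange(0.0, true, 10.0, false).contains(sample) could be evaluated once per sample in the processing loop without going through the generic number machinery. How NaN and infinite bounds should behave in the real Range class is an assumption here.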
From michael.bedward@gmail.com on September 20, 2012 02:25:21
I think r1870 has included your optimizations, Daniele. Please re-open this issue if there is more to do.
Original issue: http://code.google.com/p/jaitools/issues/detail?id=197