Closed SIDSSIDS closed 2 years ago
Thank you, I will have a look. 4k does not appear large, but it seems I underestimate how nasty these datasets can be. Using a stack seems fine to me. For efficiency, I'd avoid the nested data structure, but use an IntegerArray, and push/poll 4 values at a time maybe. This should reduce memory allocations substantially and improve performance.
Do you have an interesting test case for benchmarking?
I'd avoid the nested data structure, but use an IntegerArray
By nested data structure you mean constructions like new int[]{i, tri.ca, tri.c, tri.a}
?
I'm not sure, that this kind of array initialization is slower than the following:
int[] a = new int[4];
a[0] = i;
a[1] = tri.ca;
a[2] = tri.c;
a[3] = tri.a;
If so, can you give me a hint why?
Do you have an interesting test case for benchmarking?
I have only the test case, which has led to the problem. alpha_points.csv You can reproduce the error with the default value of alpha
java -Xss512k -jar elki-bundle-0.7.6-SNAPSHOT.jar KDDCLIApplication -dbc.in alpha_points.csv -algorithm algorithm.NullAlgorithm -resulthandler visualizer
and pick DoubleVector
-> Scatterplot
-> Alpha Shape
Using a single IntegerArray
array as stack (a wrapped int[]
), and the four values stored at offsets (i<<2) + j
.
See 6465675 for the resulting code.
This is based on your contribution, although I used a different data structure that uses less memory. Its not pretty code, because the four array adds are unrolled. I could have used an "add4" method, which likely would be inlined and hence not make much of a difference.
Thank you for the test case for reproducing the error (as this allowed me to check it is resolved).
Replaced usage of recursion with a stack in the alpha shape computation. Recursion led to StackOverflow exception on big datasets (4646 points in my case)