imglib / imglib2

A generic next-generation Java library for image processing
http://imglib2.net/
Other
302 stars 93 forks source link

Building a kdTree is slow for repeated values #326

Closed tischi closed 6 months ago

tischi commented 1 year ago

To reproduce:

        Random r = new Random();
        List< RealPoint > points = new ArrayList<>();
        for ( int i = 0; i < 100000; i++ )
            points.add( new RealPoint( r.nextDouble(), r.nextDouble(), r.nextDouble() ) );

        long start = System.currentTimeMillis();
        new KDTree<>( points, points );
        System.out.println( "Built tree in " + ( System.currentTimeMillis() - start ) );

        // repeated values along 3rd dimension
        points = new ArrayList<>();
        for ( int i = 0; i < 100000; i++ )
            points.add( new RealPoint( r.nextDouble(), r.nextDouble(), 1.0 ) );

        start = System.currentTimeMillis();
        new KDTree<>( points, points );
        System.out.println( "Built tree with repeated values in " + ( System.currentTimeMillis() - start ) );

yields

Build tree in 337
Built tree with repeated values in  10545

Note that this can be fixed in a hacky way by adding some noise to the dimension with the repeated values. But this is not ideal, because the acceptable noise level may be very use-case specific.

tpietzsch commented 5 months ago

This is fixed in the revised KDTree (#333, released with imglib2-6.4.0).