imglib / imglib2

A generic next-generation Java library for image processing
http://imglib2.net/
Other
294 stars 93 forks source link

CellRandomAccess unexpectedly loads neighboring cells #252

Open igorpisarev opened 5 years ago

igorpisarev commented 5 years ago

Let's say a CellImg is defined with cell size [2,2,2], and we want to iterate from min=[4,4,4] to max=[5,5,5] in this image. This interval covers exactly one cell at grid position [2,2,2]. However, looping over this interval using

Views.iterable(Views.interval(cellImg, cell)).cursor()

unexpectedly leads to loading 3 extra cells:

This behavior becomes problematic in a workflow when the cells are read and written concurrently as it subtly introduces race conditions.

I also created an example demonstrating this behavior using N5 as a cell storage backend: https://gist.github.com/igorpisarev/591212a26228e4ac7cb482b0134217a1

hanslovsky commented 5 years ago

Thanks for raising this issue @igorpisarev. I talked to @axtimwalde earlier and he said that he was having a look at the CellRandomAccess. Essentially, a cell should only be loaded as necessary on CellRandomAccess.get() and not on any CellRandomAccess.setPosition() or related methods.

tischi commented 3 years ago

@axtimwalde @tpietzsch

I think I found another place where this hits one quite badly. It is in the StackView class:

        public DefaultRA( final RandomAccessibleInterval< T >[] slices, final Interval interval )
        {
            n = slices[ 0 ].numDimensions() + 1;
            sd = n - 1;
            slice = 0;
            tmpLong = new long[ sd ];
            tmpInt = new int[ sd ];
            sliceAccesses = new RandomAccess[ slices.length ];
            if ( interval == null )
            {
                for ( int i = 0; i < slices.length; ++i )
                    sliceAccesses[ i ] = slices[ i ].randomAccess();
            }
            else
            {
                final long[] smin = new long[ sd ];
                final long[] smax = new long[ sd ];
                for ( int d = 0; d < sd; ++d )
                {
                    smin[ d ] = interval.min( d );
                    smax[ d ] = interval.max( d );
                }
                final Interval sliceInterval = new FinalInterval( smin, smax );
                for ( int i = 0; i < slices.length; ++i )
                    sliceAccesses[ i ] = slices[ i ].randomAccess( sliceInterval );
            }
            sliceAccess = sliceAccesses[ slice ];
        }

Just for getting one random access to one slice of the data, below loop "touches" all slices:

for ( int i = 0; i < slices.length; ++i )
    sliceAccesses[ i ] = slices[ i ].randomAccess();

The issue is that with the current implementation of randomAccess() this triggers the cellLoader to load some data such that it loads data from all slices, which can be a real problem if data access is expensive.