Open igorpisarev opened 5 years ago
Thanks for raising this issue @igorpisarev. I talked to @axtimwalde earlier and he said that he was having a look at the CellRandomAccess
. Essentially, a cell should only be loaded as necessary on CellRandomAccess.get()
and not on any CellRandomAccess.setPosition()
or related methods.
@axtimwalde @tpietzsch
I think I found another place where this hits one quite badly. It is in the StackView
class:
public DefaultRA( final RandomAccessibleInterval< T >[] slices, final Interval interval )
{
n = slices[ 0 ].numDimensions() + 1;
sd = n - 1;
slice = 0;
tmpLong = new long[ sd ];
tmpInt = new int[ sd ];
sliceAccesses = new RandomAccess[ slices.length ];
if ( interval == null )
{
for ( int i = 0; i < slices.length; ++i )
sliceAccesses[ i ] = slices[ i ].randomAccess();
}
else
{
final long[] smin = new long[ sd ];
final long[] smax = new long[ sd ];
for ( int d = 0; d < sd; ++d )
{
smin[ d ] = interval.min( d );
smax[ d ] = interval.max( d );
}
final Interval sliceInterval = new FinalInterval( smin, smax );
for ( int i = 0; i < slices.length; ++i )
sliceAccesses[ i ] = slices[ i ].randomAccess( sliceInterval );
}
sliceAccess = sliceAccesses[ slice ];
}
Just for getting one random access to one slice of the data, below loop "touches" all slices:
for ( int i = 0; i < slices.length; ++i )
sliceAccesses[ i ] = slices[ i ].randomAccess();
The issue is that with the current implementation of randomAccess()
this triggers the cellLoader to load some data such that it loads data from all slices, which can be a real problem if data access is expensive.
Let's say a
CellImg
is defined with cell size[2,2,2]
, and we want to iterate frommin=[4,4,4]
tomax=[5,5,5]
in this image. This interval covers exactly one cell at grid position[2,2,2]
. However, looping over this interval usingunexpectedly leads to loading 3 extra cells:
This behavior becomes problematic in a workflow when the cells are read and written concurrently as it subtly introduces race conditions.
I also created an example demonstrating this behavior using N5 as a cell storage backend: https://gist.github.com/igorpisarev/591212a26228e4ac7cb482b0134217a1