Closed tpietzsch closed 8 months ago
@tpietzsch This is awesome! Too bad about the performance, but still good to have. :grinning:
About the ImgLibStream: I think this idea is almost necessary to do, because otherwise people will definitely bump into proxy-type-object-reuse-related bugs. I'm less convinced that you need a public class wrapper, though; it could instead be only an internal Stream subclass that overrides methods as appropriate while adding no new API. If we take care to override most/all of the potential pain points, the need for a method like materialize() becomes less. Are there other new API methods that occurred to you besides those you mentioned above?
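To make the proxy-reuse pitfall concrete, here is a minimal JDK-only sketch. MutableDouble and proxyStream are hypothetical stand-ins (not ImgLib2 classes) for a reused proxy type such as DoubleType and for a sequential stream over pixel values; the sketch only illustrates why retaining stream elements without a copy goes wrong.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical stand-in for a reused proxy type such as DoubleType (not an ImgLib2 class).
final class MutableDouble {
    private double value;
    MutableDouble(double value) { this.value = value; }
    double get() { return value; }
    void set(double value) { this.value = value; }
    MutableDouble copy() { return new MutableDouble(value); }
}

final class ProxyReuseDemo {
    // Hypothetical helper: a sequential stream that hands out the SAME proxy
    // object for every element, re-pointing it at the current value.
    static Stream<MutableDouble> proxyStream(double[] data) {
        final MutableDouble proxy = new MutableDouble(0);
        return IntStream.range(0, data.length).mapToObj(i -> {
            proxy.set(data[i]);
            return proxy;
        });
    }

    // Retains the proxy itself, so every list entry is the same object.
    static List<MutableDouble> collectWithoutCopy(double[] data) {
        return proxyStream(data)
                .filter(t -> t.get() >= 0 && t.get() <= 1)
                .collect(Collectors.toList());
    }

    // Copies as late as possible, right before the elements are retained.
    static List<MutableDouble> collectWithCopy(double[] data) {
        return proxyStream(data)
                .filter(t -> t.get() >= 0 && t.get() <= 1)
                .map(MutableDouble::copy)
                .collect(Collectors.toList());
    }
}
```

With the input {0.25, 5.0, 0.75, 7.0}, collectWithoutCopy returns a list that holds the same object twice, and that object ends up carrying the last value the stream visited (7.0), which is not even in range. collectWithCopy returns two independent objects with the values 0.25 and 0.75.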
For the localizable stream elements: I like this idea. The method could just be .localizingStream() for symmetry with localizingCursor(), eh? Although I guess we probably also want .localizingParallelStream() :roll_eyes: ... But then as you say, the generics get tough. Maybe instead of baking it into the IterableRealInterval interface, some static utility methods would be easier? Like:
public static < T > Stream< RealCursor< T >> localizingStream( IterableRealInterval< T > iri ) { ... }
public static < T > Stream< Cursor< T >> localizingStream( IterableInterval< T > ii ) { ... }
This avoids the hairiness of an incompatible return type for an overridden method in IterableInterval due to non-covariance.
And the code could read almost as nicely:
Img< DoubleType > myImg = ...;
List< Double > valuesPast123 = Streams.localizing( myImg )
.filter( c -> c.getDoublePosition( 0 ) > 123.0 )
.map( c -> c.get().getRealDouble() )
.collect( Collectors.toList() );
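The non-covariance point can be illustrated with a small JDK-only sketch. All types below (RealCursorLike, CursorLike, RealIterable, IntIterable, LocalizingStreams) are hypothetical stand-ins, not the ImgLib2 interfaces. Because Stream<CursorLike<T>> is not a subtype of Stream<RealCursorLike<T>>, a sub-interface cannot narrow the return type of an inherited localizingStream() method, whereas overloaded static methods can simply return the more specific stream type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical stand-ins for RealCursor / Cursor (not the ImgLib2 interfaces).
interface RealCursorLike<T> {
    T get();
    double getDoublePosition(int d);
}

interface CursorLike<T> extends RealCursorLike<T> {
    long getLongPosition(int d);
}

// Hypothetical stand-ins for IterableRealInterval / IterableInterval.
interface RealIterable<T> {
    List<T> values();
    // Declaring `Stream<RealCursorLike<T>> localizingStream()` here and overriding it
    // in IntIterable with return type `Stream<CursorLike<T>>` would not compile:
    // generic types are invariant, so the return types are incompatible.
}

interface IntIterable<T> extends RealIterable<T> {}

final class LocalizingStreams {
    // One immutable cursor per element keeps this sketch free of proxy-reuse issues.
    private static <T> CursorLike<T> cursorAt(List<T> values, int i) {
        return new CursorLike<T>() {
            @Override public T get() { return values.get(i); }
            @Override public double getDoublePosition(int d) { return i; }
            @Override public long getLongPosition(int d) { return i; }
        };
    }

    // Overload for the general case: elements only expose real coordinates.
    static <T> Stream<RealCursorLike<T>> localizing(RealIterable<T> iri) {
        List<T> values = iri.values();
        return IntStream.range(0, values.size()).mapToObj(i -> cursorAt(values, i));
    }

    // Overload for the integer case: the more specific element type is returned.
    static <T> Stream<CursorLike<T>> localizing(IntIterable<T> ii) {
        List<T> values = ii.values();
        return IntStream.range(0, values.size()).mapToObj(i -> cursorAt(values, i));
    }

    public static void main(String[] args) {
        IntIterable<Double> img = () -> Arrays.asList(10.0, 20.0, 30.0);
        List<Double> valuesPastFirst = localizing(img)
                .filter(c -> c.getLongPosition(0) > 0)
                .map(c -> c.get())
                .collect(Collectors.toList());
        System.out.println(valuesPastFirst);
    }
}
```

Overload resolution picks the most specific parameter type at compile time, so passing an IntIterable yields a Stream<CursorLike<T>> without any override being involved.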
@ctrueden I made separate issues for the wrapper classes https://github.com/imglib/imglib2/issues/339, and the localizing streams https://github.com/imglib/imglib2/issues/338, and replied there
This PR adds IterableRealInterval.stream() and .parallelStream() default methods to access the pixel values in an image as a Stream<T>.
The stream methods rely on a default implementation of IterableRealInterval.spliterator() backed by RealCursor.
Encounter order of the streams matches that of cursors, i.e. Views.flatIterable(img).stream() yields elements in flat iteration order.
Usage examples:
true pixels in a binary image
Pitfalls
Note that the T elements of the stream are proxies and reused (as usual). The RealCursorSpliterator implementation takes care that a new proxy is used for each split-off prefix, so parallelStream() works as expected. However, explicit copying operations must be added if stream elements are supposed to be retained (by stateful intermediate or terminal operations).
For example, to collect all DoubleType values between 0 and 1 into a list:
The .map(DoubleType::copy) operation is necessary, otherwise the values list will contain many duplicates of the same DoubleType object (which may not even have a value between 0 and 1). The copy could also be done before the .filter(...) operation, but it's better to do it as late as possible to avoid unnecessary creation of objects.
Performance
Initial benchmarks show that using streams (even without copying) is a lot slower than explicit for loops, for example.
Running the following benchmark
results in
Not ideal... It may be possible to improve performance, but so far I didn't find anything that works.
However, I think this is anyway more a quality-of-life feature. (Like the RandomAccessible.getAt(...) convenience methods (https://github.com/imglib/imglib2/pull/246) which I find myself using more often than I expected, despite the performance overhead.)
Ideas
There is more to explore in this direction.
Stream<LocalizableSampler<T>> instead of Stream<T>. The implementation would be simple, basically just use Cursor<T> instead of Cursor<T>.get() for the stream elements. However, the generics are a bit hairy.
ImgLibStream wrapper around java.util.stream.Stream? With this we could add additional operations, for example the previously mentioned .map(DoubleType::copy) could be .materialize(). We could decorate operations like .distinct() and .sorted() to make sure that copies have been made before, etc.
Spliterators improve performance over iterators, for example for CellImg?
Views/Converters framework?
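As a rough illustration of the wrapper idea, the following JDK-only sketch shows how a thin wrapper could offer a materialize() step and make a stateful operation copy its elements first. ValueProxy, CopyingStream, and CopyingStreamDemo are hypothetical names chosen for this sketch; they are not ImgLib2 classes, and the API shape is an assumption rather than a proposal from the PR.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical stand-in for a reused proxy type (not an ImgLib2 class).
final class ValueProxy {
    double value;
    ValueProxy(double value) { this.value = value; }
    ValueProxy copy() { return new ValueProxy(value); }
}

// Hypothetical wrapper in the spirit of the ImgLibStream idea; the API is an assumption.
final class CopyingStream<T> {
    private final Stream<T> delegate;
    private final UnaryOperator<T> copier;

    CopyingStream(Stream<T> delegate, UnaryOperator<T> copier) {
        this.delegate = delegate;
        this.copier = copier;
    }

    // Stateless operations can work on the reused proxy directly.
    CopyingStream<T> filter(Predicate<? super T> predicate) {
        return new CopyingStream<>(delegate.filter(predicate), copier);
    }

    // Explicit copy step; the result is a plain Stream whose elements are safe to retain.
    Stream<T> materialize() {
        return delegate.map(copier);
    }

    // Stateful operation: buffers copies, never the shared proxy.
    Stream<T> sorted(Comparator<? super T> comparator) {
        return materialize().sorted(comparator);
    }
}

final class CopyingStreamDemo {
    // Sequential stream that re-points one shared proxy at each value in turn.
    static Stream<ValueProxy> proxies(double[] data) {
        final ValueProxy proxy = new ValueProxy(0);
        return IntStream.range(0, data.length).mapToObj(i -> {
            proxy.value = data[i];
            return proxy;
        });
    }

    // Keeps the positive values and returns them in ascending order.
    static List<Double> sortedPositiveValues(double[] data) {
        return new CopyingStream<>(proxies(data), ValueProxy::copy)
                .filter(p -> p.value > 0)
                .sorted(Comparator.comparingDouble(p -> p.value))
                .map(p -> p.value)
                .collect(Collectors.toList());
    }
}
```

The design choice sketched here is that stateless operations stay on the wrapper and keep using the shared proxy, while anything that buffers elements goes through the copier first, so callers do not have to remember the copy themselves.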