mkodekar / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0

BufferedIterator #318

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
As I mean it, a BufferedIterator is an Iterator whose next N elements
are eagerly fetched on a background thread.

The Guava interface might be:
Iterators.buffer(Iterator toBuffer, int bufferSize)
or
Iterators.buffer(Iterator toBuffer, int bufferSize, ExecutorService executorService)

See here for an implementation:
http://stackoverflow.com/questions/2149244/bufferediterator-implementation

Original issue reported on code.google.com by medo...@gmail.com on 28 Jan 2010 at 6:52

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 26 Jan 2011 at 10:39

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 13 Jul 2011 at 6:18

GoogleCodeExporter commented 9 years ago

Original comment by cpov...@google.com on 13 Jul 2011 at 8:40

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 10 Dec 2011 at 3:43

GoogleCodeExporter commented 9 years ago
You might already have this ready. And if you don't, Kevin's g+ points about 
the fruitlessness of providing an implementation surely apply :)

Still, I implemented this yesterday and you might be interested in comparing 
notes. I sure am.

- I catch any RuntimeException from the source iterator and pass it through the 
buffer to be thrown eventually. I let any Error percolate up to the user 
Thread's UncaughtExceptionHandler.
- If the internal thread is interrupted, it clears the buffer and signals end 
of data. If the consuming thread is interrupted, the internal thread is 
interrupted too.
- The internal thread is started on first use, or manually by the user to start 
buffering before then. It can be destroyed manually.
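
For comparing notes, here is a minimal sketch of the same general idea. It is not the implementation linked below and not Guava API; the class name `BufferedIteratorSketch` is made up for illustration. It takes an Executor rather than owning a thread, starts buffering immediately, assumes non-null elements, and (unlike the description above) also forwards an Error to the consumer so the consumer is never left waiting on an empty queue.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executor;

// Hypothetical sketch: a bounded queue filled by a background task.
final class BufferedIteratorSketch<T> implements Iterator<T> {
  private static final Object END = new Object(); // end-of-data marker

  // Marker carrying a failure from the source iterator to the consumer.
  private static final class Failure {
    final Throwable cause;
    Failure(Throwable cause) { this.cause = cause; }
  }

  private final BlockingQueue<Object> queue;
  private Object next; // null means nothing has been fetched from the queue yet

  BufferedIteratorSketch(Iterator<? extends T> source, int bufferSize, Executor executor) {
    BlockingQueue<Object> q = new ArrayBlockingQueue<>(bufferSize);
    this.queue = q;
    executor.execute(() -> {
      Object last = END;
      try {
        try {
          while (source.hasNext()) {
            q.put(source.next()); // blocks while the buffer is full
          }
        } catch (RuntimeException | Error e) {
          last = new Failure(e); // travels through the buffer behind earlier elements
        }
        q.put(last);
      } catch (InterruptedException e) {
        // Interrupted producer: drop buffered data and signal end of data.
        q.clear();
        q.offer(END);
        Thread.currentThread().interrupt();
      }
    });
  }

  @Override
  public boolean hasNext() {
    if (next == null) {
      try {
        next = queue.take();
      } catch (InterruptedException e) {
        // Simplification: an interrupted consumer just sees end of data.
        Thread.currentThread().interrupt();
        next = END;
      }
    }
    if (next instanceof Failure) {
      Throwable cause = ((Failure) next).cause;
      next = END;
      if (cause instanceof Error) {
        throw (Error) cause;
      }
      throw (RuntimeException) cause;
    }
    return next != END;
  }

  @Override
  @SuppressWarnings("unchecked")
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    T result = (T) next;
    next = null;
    return result;
  }
}
```

Because the producer is handed to an Executor, this sketch cannot interrupt the producing thread when the consumer is interrupted; an implementation that owns its thread can.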

I stuck mine here: 
http://code.google.com/p/brianfromoregon/source/browse/trunk/buffering/src/main/java/BufferedIterator.java
With a test here: 
http://code.google.com/p/brianfromoregon/source/browse/trunk/buffering/src/test/java/BufferedIteratorTest.java

Original comment by brianfromoregon on 11 Dec 2011 at 3:21

GoogleCodeExporter commented 9 years ago
I've toyed with similar things. For this one in particular, it's a little less 
clear what the point is. What I've done, for peers asking for this, is 
constructs like 'take an iterator, concurrently apply a transformation 
(potentially I/O, so there should be some potential concurrency), and expose 
the results as another iterator'. Thus, single producer, single consumer 
(otherwise iterators don't cut it), with a 'doParallel' transformation in the 
middle (and the resulting iterator is in an arbitrary order, of course). By the 
way, accepting an Executor is much preferable to starting a thread.
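
A rough sketch of that 'doParallel' shape, under my own assumptions: the names `ParallelTransformSketch`, `doParallel` and `maxInFlight` are made up, and ExecutorCompletionService is just one convenient way to get results back in completion order. The input iterator is only ever touched by the consuming thread.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: iterator in, concurrent transformation, iterator out.
final class ParallelTransformSketch {
  static <A, B> Iterator<B> doParallel(
      Iterator<A> input,
      Function<? super A, ? extends B> fn,
      Executor executor,
      int maxInFlight) {
    CompletionService<B> completion = new ExecutorCompletionService<>(executor);
    return new Iterator<B>() {
      private int inFlight = 0; // tasks submitted but not yet handed to the caller

      // Submit more work until maxInFlight tasks are pending or input runs out.
      private void fill() {
        while (inFlight < maxInFlight && input.hasNext()) {
          A item = input.next();
          completion.submit(() -> fn.apply(item));
          inFlight++;
        }
      }

      @Override
      public boolean hasNext() {
        fill();
        return inFlight > 0;
      }

      @Override
      public B next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        try {
          Future<B> done = completion.take(); // whichever task finishes first
          inFlight--;
          return done.get();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting for a result", e);
        } catch (ExecutionException e) {
          throw new IllegalStateException("transformation failed", e.getCause());
        }
      }
    };
  }
}
```

Results come out in completion order, not input order, which matches the 'arbitrary order' caveat. The returned iterator is meant for a single consumer.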

The interesting thing is how an iterator translates to a blocking queue with an 
end_of_stream marker, and that queue translates to an iterator which doesn't 
need the marker since it has hasNext().

Perhaps this is more useful if one exposes the queue instead of wrapping it as 
an iterator, since that allows multiple consumers downstream. And an 'end of 
stream' marker can be avoided by using some sort of counter (like the very 
nice, internal, IncrementableCountDownLatch) and exposing a thread-safe 
'producerIsDone()', perhaps in a BlockingQueue subtype.
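
To make that concrete, a minimal sketch with several assumptions of mine: it is a wrapper rather than a BlockingQueue subtype, it uses a plain AtomicInteger instead of the internal latch, and consumers poll with a short timeout. All names (`DrainableQueueSketch`, `producerFinished`, `takeOrNull`) are invented for illustration; only `producerIsDone` comes from the idea above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a shared queue with a "producers are done" counter
// instead of an end-of-stream marker, so several consumers can drain it.
final class DrainableQueueSketch<T> {
  private final BlockingQueue<T> queue;
  private final AtomicInteger activeProducers;

  DrainableQueueSketch(int capacity, int producers) {
    this.queue = new ArrayBlockingQueue<>(capacity);
    this.activeProducers = new AtomicInteger(producers);
  }

  void put(T item) throws InterruptedException {
    queue.put(item);
  }

  // Each producer calls this exactly once, after its last put.
  void producerFinished() {
    activeProducers.decrementAndGet();
  }

  boolean producerIsDone() {
    return activeProducers.get() == 0;
  }

  // Returns the next item, or null once every producer has finished and the
  // queue is drained. Also returns null if the calling thread is interrupted.
  T takeOrNull() {
    try {
      while (true) {
        T item = queue.poll(10, TimeUnit.MILLISECONDS);
        if (item != null) {
          return item;
        }
        // Producers finish only after their last put, so "done and empty"
        // means no further items can appear.
        if (producerIsDone() && queue.isEmpty()) {
          return null;
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }
}
```

The timed poll is the price of dropping the marker in this sketch; a latch-based version could block instead of polling.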

Just ideas for now.

Original comment by jim.andreou on 11 Dec 2011 at 6:35

GoogleCodeExporter commented 9 years ago
I've found the doParallel construct useful too. I'm using a half-baked impl 
over here: 
http://code.google.com/p/photomosaic/source/browse/trunk/photomosaic/src/main/java/net/bcharris/photomosaic/ThreadedIteratorProcessor.java
It's useful standing on its own, and a BufferedIterator would be too; it might 
be nice to layer them. My current use case is streaming the results of serial 
data store queries, the point being to prefetch results to avoid waiting on 
network I/O for each query. 

Other markers in the queue are exception and null_element; of the three, 
end_of_stream might be the easiest to fit into an exposed BlockingQueue. 

Original comment by brianfromoregon on 11 Dec 2011 at 10:40

GoogleCodeExporter commented 9 years ago
I didn't really pay attention to this issue (#318), since it explicitly 
describes a background thread, but even for one callback, an Executor still 
makes sense. 

Btw, regarding exceptions, a nice, decoupled way to tackle the issue is to just 
use Future<V> instead of V. E.g., if the user wants to deal with asynchronous 
exceptions, they would pass an Iterator<Future<V>> instead. And if you had 
something like what I described, a BlockingQueue of "processed tasks", its 
elements could be represented by Future<V>/ListenableFuture<V>, so there is no 
need for something like an "exception" marker either.
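
A small sketch of how that could look for the prefetching use case, assuming plain java.util.concurrent Futures rather than ListenableFuture. The names `FuturePrefetchSketch`, `prefetch` and `lookahead` are made up. Each element is handed out as a Future, so a failed task surfaces as an ExecutionException from get() and nothing special has to travel through the buffer.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch: keep up to `lookahead` tasks submitted ahead of the
// consumer and hand results out as Futures, in submission order.
final class FuturePrefetchSketch {
  static <V> Iterator<Future<V>> prefetch(
      Iterator<? extends Callable<V>> tasks, ExecutorService executor, int lookahead) {
    Deque<Future<V>> pending = new ArrayDeque<>();
    return new Iterator<Future<V>>() {
      // Submit tasks until `lookahead` are pending or the source runs out.
      private void fill() {
        while (pending.size() < lookahead && tasks.hasNext()) {
          pending.addLast(executor.submit(tasks.next()));
        }
      }

      @Override
      public boolean hasNext() {
        fill();
        return !pending.isEmpty();
      }

      @Override
      public Future<V> next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        Future<V> head = pending.removeFirst();
        fill(); // start the next task while the caller waits on this one
        return head;
      }
    };
  }
}
```

The caller decides per element how to treat a failure, for example by catching ExecutionException around get() and continuing with the next Future.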

(No time to delve into details right now, sorry)

Original comment by jim.andreou on 11 Dec 2011 at 11:36

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 16 Feb 2012 at 7:17

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 30 May 2012 at 7:43

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 22 Jun 2012 at 6:16

GoogleCodeExporter commented 9 years ago
@Jim, regarding our earlier discussion about parallel processing of an 
iterator: look at this awesome interface Doug put up last week (and how perfect 
is the name!!) 
http://gee.cs.oswego.edu/dl/jsr166/dist/jsr166edocs/jsr166e/ConcurrentHashMapV8.Spliterator.html

Original comment by brianfromoregon on 10 Jul 2012 at 10:41

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:16

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:19

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:10