basho / riak-java-client

The Riak client for Java.
Apache License 2.0
267 stars 158 forks source link

Streaming results for List Keys/Buckets, 2i, and MR #677

Closed alexmoore closed 8 years ago

alexmoore commented 8 years ago

This PR adds support for streaming results in:

This was a very interesting exercise in creating a new results return method from top-to-bottom, while not breaking/changing existing APIs.

jbrisbin commented 8 years ago

Looks like a a good first step. IMO it would be good to have as an overall goal to make more use of lambdas, not just as callbacks for asynchronous functions but as a tool in the quiver to reduce garbage. If you follow the code path through a request you see a lot of objects being created to support a single operation. Some of that can't be avoided but there are places I think a lambda-centric approach (versus creating a concrete implementation of some abstract class and instantiating it) would be more efficient.

That said, we're probably not at a level where heap usage, GC, and those other considerations are very high priority.

alexmoore commented 8 years ago

Indeed, I was lucky to get away with only 4 new class files, and I think I did a fairly good job about reuse, but there is always room to improve.

On Thu, Oct 20, 2016 at 9:18 AM, Jon Brisbin notifications@github.com wrote:

Looks like a a good first step. IMO it would be good to have as an overall goal to make more use of lambdas, not just as callbacks for asynchronous functions but as a tool in the quiver to reduce garbage. If you follow the code path through a request you see a lot of objects being created to support a single operation. Some of that can't be avoided but there are places I think a lambda-centric approach (versus creating a concrete implementation of some abstract class and instantiating it) would be more efficient.

That said, we're probably not at a level where heap usage, GC, and those other considerations are very high priority.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/basho/riak-java-client/pull/677#issuecomment-255103163, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAFnomhF4vXxMnhl1wJtVbqoswJi1u7ks5q12o8gaJpZM4KbNBC .

lukebakken commented 8 years ago

This all seems fine to a non-Java dev. While I know why the queue and .poll() is used, it seems hacky. Feeble searching for information about netty brought up this SO post which talks about methods to execute callbacks on threads that don't block the IO thread. Maybe something to look into for the future?

alexmoore commented 8 years ago

@jbrisbin @srgg One other thing I hoped you pick on was the unbounded queue we use, and the way I pass in the poll timeout from the outside. What do you think about these two designs?

If we used a bounded queue, it would limit Java memory usage if the user isn't consuming the iterator, but then the network buffers would bulge instead, since Riak just sends data like a firehose.

With passing the poll timeout, I was thinking we might also be able to do some sort of devolving timeout based on success rate of the last few polls and their timeouts, but not sure how much extra work that would be.

jbrisbin commented 8 years ago

@a1 We could have a variant where you pass in the Queue to use, which could be any implementation, including one with blocking publish semantics to back up the TCP end in case the queue is not being emptied fast enough.

The read timeout should still probably be configured externally since that's a consideration that takes place inside our code.

alexmoore commented 8 years ago

@jbrisbin Actually now I remember why I made it unbounded. If we block on the Netty thread for any reason it will throw a BlockingOperationException, which will kill the whole operation. Very bad for large streaming ops because it'll then retry, and fail, and retry and fail, and Riak will then have 3 listing operations running.

Also if we expose a "provide your own Queue", we'll need to expose the type info all the way up (unless it's only exposed at the Operation level, and not the Command level).

alexmoore commented 8 years ago

Ok, that should be all the gold plating I can think of.

alexmoore commented 8 years ago

@srgg FullBucketRead tests pass locally, +1 to 51bff20. Going to review the rest of it now to see if anything jumps out.

srgg commented 8 years ago

+1 to any changes I didn't do.

lukebakken commented 8 years ago

Looks like more code was deleted than written, always a good sign. Tests pass with Riak 2.2.0, 2.1.4, and 2.0.7 (except one YZ test there). 👍