mauricio / postgresql-async

Async, Netty-based database drivers for PostgreSQL and MySQL, written in Scala
Apache License 2.0

Reactive streams support for large datasets #146

Open antonzherdev opened 9 years ago

antonzherdev commented 9 years ago

If a query returns a large dataset, it would be nice to be able to process rows one by one. Reactive Streams seems like a really good way to do this: you could then use Akka Streams or any other Reactive Streams implementation. I would add functions like these to the Connection trait:

def streamQuery(query: String): Publisher[RowData]
def streamPreparedStatement(query: String, values: Seq[Any] = List()): Publisher[RowData]
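
For illustration, here is a minimal sketch of how the proposed streamQuery could be consumed with Akka Streams. streamQuery is the method proposed above, not an existing driver API, and the query text and column name are placeholders:

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import org.reactivestreams.Publisher
import com.github.mauricio.async.db.{Connection, RowData}

object StreamingExample {
  def consume(connection: Connection)(implicit system: ActorSystem): Unit = {
    implicit val materializer = ActorMaterializer()

    // streamQuery is the method proposed above; it does not exist in the driver yet.
    val rows: Publisher[RowData] = connection.streamQuery("SELECT * FROM events")

    // Wrap the Reactive Streams publisher in an Akka Streams Source and
    // process rows one at a time; downstream demand becomes back pressure.
    Source.fromPublisher(rows)
      .map(row => row("id"))
      .runWith(Sink.foreach(println))
  }
}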

I am not sure whether it is possible to implement back pressure and temporarily stop consuming data from the database. But even if that turns out to be impossible, this would still be valuable, because rows could be buffered on the driver side when back pressure occurs.
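
As a minimal sketch of that buffering fallback, assuming rows arrive on a single thread (for example the connection's Netty event loop): incoming rows are queued and emitted only as fast as the subscriber has requested. The class and method names here are hypothetical, and a real implementation would also need the rest of the Reactive Streams rules (completion, errors, cancellation, thread safety):

import scala.collection.mutable
import org.reactivestreams.Subscriber

// Buffers rows the server keeps sending and drains them according to
// downstream demand. Single-threaded by assumption; not a complete
// Reactive Streams implementation.
class BufferingEmitter[Row](downstream: Subscriber[_ >: Row]) {
  private val buffer = mutable.Queue.empty[Row]
  private var demand = 0L

  // Called by the network layer for every row the server pushes.
  def onRow(row: Row): Unit = { buffer.enqueue(row); drain() }

  // Called from Subscription.request(n).
  def moreRequested(n: Long): Unit = { demand += n; drain() }

  private def drain(): Unit =
    while (demand > 0 && buffer.nonEmpty) {
      downstream.onNext(buffer.dequeue())
      demand -= 1
    }
}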

I could implement this if you do not mind.

mauricio commented 9 years ago

@antonzherdev that would be awesome!

Both database protocols have a "send more" operation when pulling data, so this is definitely possible; a rough sketch of the mapping follows below. Let me know if you need any specific help or don't understand something in the code.
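
For PostgreSQL, the "send more" operation is the extended query protocol's Execute message, which carries a row limit: the server returns up to that many DataRow messages followed by PortalSuspended, and the portal can be executed again for the next batch; MySQL has a similar fetch command for server-side cursors. Here is a rough sketch of how Reactive Streams demand could map onto that mechanism, with PortalRunner and its methods invented purely for illustration:

import org.reactivestreams.{Publisher, Subscriber, Subscription}

// Hypothetical handle on a suspended server-side portal/cursor; the real
// driver internals differ, this only illustrates the mapping.
trait PortalRunner[Row] {
  def executePortal(maxRows: Int, downstream: Subscriber[_ >: Row]): Unit
  def closePortal(): Unit
}

// Each request(n) becomes the row limit of the next "send more" round
// trip. Spec details such as accumulating overlapping requests are omitted.
class PortalPublisher[Row](runner: PortalRunner[Row]) extends Publisher[Row] {
  def subscribe(s: Subscriber[_ >: Row]): Unit =
    s.onSubscribe(new Subscription {
      def request(n: Long): Unit =
        runner.executePortal(math.min(n, Int.MaxValue.toLong).toInt, s)
      def cancel(): Unit = runner.closePortal()
    })
}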

antonzherdev commented 9 years ago

I will start implementing this. I can only work on it in my free time, so it will take a while.

povder commented 8 years ago

Hi @antonzherdev! Any progress on this? Streaming rows would be awesome. Let me know if you're working on it; if not, maybe somebody else will take over (maybe even me).

antonzherdev commented 8 years ago

Hi @povder, I sent pull requests a year ago, but @mauricio has not found time to review them. You can find the implementation in the reactive-streams branch of my fork.