RxJava-JDBC-like composition

thomasnield commented 8 years ago

Hey @edvin,

Normally I use RxJava-JDBC to do my querying. But there are some situations where I would like to statefully but functionally compose my queries in a non-reactive manner (and not be constrained to monads). I think this library might have some potential if we leverage the Kotlin Sequence to pull items out of a ResultSet.

I think it might be cool if we were able to accomplish something like this, almost identical to RxJava-JDBC but using a Sequence instead:

val db = Database.from(url)
List<String> names = db.
        .select("select name from person where name > ? order by name")
        .parameter("ALEX")
        .getAs(String.class) //returns Sequence<String>
        .toList()

edvin commented 8 years ago

Hm.. that's nice :) By the way, I wrote kdbc in my first day of doing Kotlin or something like that and haven't looked at it since, so there are probably lots of stuff that should be reconsidered here. (Am actually using it in a couple of apps though).

I'll get back to you on this tomorrow, ok Thomas? :)

thomasnield commented 8 years ago

Sounds good! If you want to embark on this I'll definitely contribute.

thomasnield commented 8 years ago

I think we could do a lot with this, and I can take the lead if you are too busy. But just like RxJava-JDBC we could allow automatic database connection pooling with an optional HikariCP dependency, as well as streamline parameters and abstract away PreparesStatements and batching.

It's a medium-sized undertaking but I think it will be useful to us and many other people.

thomasnield commented 8 years ago

Okay, here is how it would be done simply with an extension function. The question is do we want this library to be lightweight extension functions only? Or do we want to do a more ambitious framework with connection pooling, builders, etc... basically re-invent RxJava-JDBC but for Kotlin sequences. I'm not sure.

fun ParameterizedStatement.asSequence(): Sequence<ResultSet> {
    return object: Iterator<ResultSet> {
        val rs = executeQuery()
        var hasMore = true
        init {
            hasMore = rs.next()
        }
        override fun next() = rs

        override fun hasNext():Boolean {
            hasMore = rs.next()
            if (!hasMore)
                rs.close()
            return hasMore
        }

    }.asSequence()
}

Usage

val connection =
    DriverManager.getConnection("jdbc:sqlite:C:/Users/Thomas/OneDrive/Data/finance_rx.db")

val items = connection.query("SELECT * FROM CATEGORY").asSequence()
    .map { it.getString("NAME") }
    .toList()

println(items)

edvin commented 8 years ago

Cool! I'm definitely interested, but I would like to focus on TornadoFX until we feel it is feature complete. I think that won't be long now, btw :) I have some ideas about the query interface etc that I would like to share with you before we start!

thomasnield commented 8 years ago

Yeah, I got to finish the guide too. I'd like to finish all builder chapters by this weekend. No rush, I just had the idea spark from a need I had today. It can wait since this extension will do the job for now.

Whenever you're ready lets resume this conversation later : )

edvin commented 8 years ago

Cool :) I'm also gonna finish that JSF-like thing I talked about. It's working 100% now, pretty decent performance as well - sub millisecond roundtrip from the client to the server including rendering.

thomasnield commented 8 years ago

I hate to distract from TornadoFX, but I've been developing this on my fork to use at work. I'm a little torn whether this should purely be extension functions or have custom domain classes that mimic RxJava-JDBC. I think a middle-of-the-road approach would be ideal since RxJava-JDBC is highly complex with its reactive scheduling, and that overhead is not needed here.

So we can get something to this effect: Put extension functions on Connection and return a SelectBuilder or UpdateBuilder that returns sequences off a PreparedStatement.

val conn: Connection = hikariDataSource.getConnection()

List<String> names = conn
    .select("select name from person where name > ? order by name")
        .parameter("ALEX")
        .getAs(String.class) //returns Sequence<String>
        .toList()

edvin commented 8 years ago

I will look more at this when I get back home, but I have some ideas I just want to get your opinion about.

DataSource extensions

We could support some extension functions on javax.sql.DataSource to borrow a connection and call close() on it after the code block ends. Most connection pools expose a DataSource, so we don't need to think of special connection pool support I think.

Inline Parameters

This following example might look like it's susceptible to SQL injection, but it would under the covers do exactly the same as your example above. It also showcases the borrow extension function on DataSource:

val name = "ALEX"

ds.borrow {
    select { "name from person where name > ${p(name)}" }
    getAs(String::class)
    toList()
}

The select function could optionally append SELECT unless the string starts with it, so you don't have to repeat yourself.

The extension function p would add a positional parameter and return a ?. If select returns some kind of builder object, the parameters could be added there. toList could be available on the PreparedStatement as well, so you don't need to add the getAs call unless you're going to treat it as a Sequence.

Domain Object Mapping

Furthermore, we could support automatic mapping to domain objects that has a constructor which takes a ResultSet, so:

val adults = select { "name * person where age > 18" }
    getAs(Person::class)
    toList()

Would result in Person objects being constructed by calling map { rs -> Person(rs)} for each row, unless you want to do it manually.

There would need to be a toSingle function or something, to return a single instance instead of a List.

SQL Keyword support

It would be nice to have some keywords that can help separate logical parts of the query for easier reading:

select { "name from person where name > ${p(name)}" }
orderBy("name")
limit(25)

They would do no magic, simply add "ORDER BY name" and "LIMIT 25" to the sql statement. It kind of helps with the readability because the query part stands out more. It would also be totally optional to use them.

I don't know if this syntax or these suggestions are preferable, just wanted to throw it out there since it's doable :)

edvin commented 8 years ago

Another inline example with a bit more context:

fun updatePerson(person: Person) = with(datasource) {
    update { "person set name = ${p(person.name)} where id = ${p(person.id)}"
}

Alternative with a bit of magic:

fun updatePerson(person: Person) = with(datasource) {
    update { "person set name = #{person.name} where id = #{person.id}"
}

Also notice that the datasource just got extension functions that borrow AND call update in one go :)

edvin commented 8 years ago

Heck, why repeat the person object all the time:

fun updatePerson(person: Person) = with(datasource) {
    context(person)
    update { "person set name = #{name} where id = #{id}" }
}

edvin commented 8 years ago

The context object could of course be configured with the datasource, so all operations on the connection that was borrowed can look up references in that context object:

fun updatePerson(person: Person) = datasource.with(person) {
    update { "person set name = #{name} where id = #{id}"
}

thomasnield commented 8 years ago

The SQL keyword support... that's some interesting syntactic sugar. Almost reminds of LINQ for some reason. My initial reaction is wary, but for more complicated SQL it actually could be helpful. My concern would be nuances with each SQL platform, but from your examples it looks like that level of detail won't be abstracted.

I've got something here... let me push it up in a moment and we'll merge ideas.

edvin commented 8 years ago

Yeah, I was thinking the same thing about the keywords. Not all databases has a limit keyword, for example, so using it would be just as wrong as putting the string "limit" inside the select statement. Don't think that would be a big problem though :)

thomasnield commented 8 years ago

I think I agree about building around DataSource... I'll switch up a few things with my proof of concept.

thomasnield commented 8 years ago

And the automap is good too. I don't use that feature as much in RxJava-JDBC but I know a lot of people that do.

thomasnield commented 8 years ago

Okay, check my fork. Still wip and my parameter injection is not working, but this is what I'm working towards. The Database class is an abstraction containing a possible pool of connections from a ConnectionProvider. Every time it runs a query it closes the Connection, giving it back to the pool.

This closely mirros David Moten's RxJava-JDBC setup and it is the most fluent pattern I have seen for database querying.

data class Account(val id: Int, val name: String, val accountType: String)

val db = Database.from("jdbc:sqlite:C:/Users/Thomas/OneDrive/Data/finance_rx.db")

db.select("SELECT * FROM ACCOUNT WHERE ID = :id")
    .param("id", 1)
    .get { Account(it.getInt("ID"), it.getString("NAME"), it.getString("ACCOUNT_TYPE")) }
    .forEach { println(it) }

thomasnield commented 8 years ago

This needs a ton of cleanup and possible consolidation, but let me know what you think of the use case above.

I'd also like to implement update operations that supports batch sizing as well. Another guy implemented this into RxJava-JDBC and I helped test it here. https://github.com/davidmoten/rxjava-jdbc/pull/30#issuecomment-204059335

thomasnield commented 8 years ago

Possible pattern for insertions using a Sequence for parameter inputs.

val conn = ConnectionProviderFromUrl("jdbc:sqlite:C:/git/rxjava-jdbc-batch-test/test.db").get().apply {
    autoCommit = false
}
val db =Database.from(conn)
val mySequence = (1..100000).asSequence()
data class Item(val id: Int, val value: Int)

fun main(args: Array<String>) {
   val insertKeys = db.update("INSERT INTO TEST_TABLE (VALUE) VALUES(?)")
        .parameters(mySequence)
        .batchSize(1000)
        .returnGeneratedKeys()
        .getAs(Int::class.java)
        .toList()
}

edvin commented 8 years ago

It would be unwise to not allow running multiple queries on a single connection. We might need to think about transaction demarcation here. Maybe you get passed a connection that you should do some work on, but not return it to the pool afterwards, because it's part of a larger set of operations that should be performed in the same transaction.

If that was possible, it would be easy to write adapters to support JavaEE/Spring down the line as well.

I think positional parameters is completely dead in 2016. If you have more than a couple, you're going to start making mistakes. My initial named parameter support is OK, but it's very convenient to write UPDATE person SET name = #{name} and not having to remember to call param("name", person.name) afterwards. What are your thoughts on named parameters?

edvin commented 8 years ago

Batch sizing seems nice :)

edvin commented 8 years ago

By the way, no need for select/update etc to take a lambda anymore, if the named parameter syntax uses #{}:

fun updatePerson(person: Person) = datasource.with(person) {
    update("person set name = #{name} where id = #{id}")
}

thomasnield commented 8 years ago

Okay, I think I see where you are going. Sorry I'm still thinking in Java... give me a moment.

thomasnield commented 8 years ago

Is this #{} syntax a Kotlin specification or something you are proposing...

edvin commented 8 years ago

It's something I'm proposing, possibly using Expression Language 3.0, which can now be used stand alone with a very slim dependency:

https://docs.oracle.com/javaee/7/api/javax/el/ELProcessor.html

Alternatively we could roll our own, what we need is so simple so that it might even be better. A proof of concept could however easily be implemented with EL.

edvin commented 8 years ago

EL is a 200kb dependency: http://mvnrepository.com/artifact/org.glassfish/javax.el/3.0.1-b08

edvin commented 8 years ago

I now know how to create an IntelliJ plugin that can give you content assist and check for invalid variables etc with this syntax :)

thomasnield commented 8 years ago

Interesting. I suppose nothing is stopping users from using the dollar sign ${ } but you lose the benefits of the PreparedStatement managing the parameters.

Overall though, I think you are right. Your proposal for named parameters is better.

thomasnield commented 8 years ago

Okay interesting, I was about to ask if there could be compile-time checking for the parameters.

edvin commented 8 years ago

If you use ${} you'll sooner or later shoot yourself in the sql-injection-foot :) Hehe

edvin commented 8 years ago

If you supply the parameters manually with param("key", "value") it's possible to forget a parameter. With this approach that's not possible.

thomasnield commented 8 years ago

haha, well said. I was thinking about your thoughts on connection management too and transactions. I think that makes sense. I'm having some difficulty translating reactive streams into imperative/functional hybrid Sequence operations. Maybe this is why we should allow having a select() and update() done on the Database and Connection levels...

thomasnield commented 8 years ago

If you call queries on the Database object, it automatically manages the connection. But if you call getConnection() from the Database object, you can run queries on that but are now responsible for giving the Connection back.

edvin commented 8 years ago

We should also support some conditional parts of the query. Haven't though this through, but consider:

/**
* Search for a person with an optional minimum age parameter
**/
fun searchPerson(minAge: Int? = null) = datasource.with {
    select("* from person")
    test(minAge != null) {
        "where age > #{minAge}"
    }
    to(Person::class)
    toList()
}

thomasnield commented 8 years ago

Are you sure that we are not going to run into platform nuances with this? Would test() be ran on the database end or JVM end?

edvin commented 8 years ago

test() has nothing to do with the database, it's appended to the query before it is executed :)

thomasnield commented 8 years ago

Would that be too hard to support with differences between platforms? For instance, what if I were to use regular expressions? I think each database platform executions regex functions differently...

thomasnield commented 8 years ago

And Kotlin's regular expression usage would probably not be easily translatable to the database platform's idea of regular expressions...

edvin commented 8 years ago

Only Kotlin's regex is used :) As I said, this is evaluated before the query is sent to the database. It's simply used to construct the SQL string.

thomasnield commented 8 years ago

Multi-targeted querying operations:

data class Account(val id: Int, val name: String, val accountType: String)

val db = Database.from("jdbc:sqlite:C:/Users/Thomas/OneDrive/Data/finance_rx.db")

//Automatic management of connection disposal
db.select("SELECT * FROM ACCOUNT WHERE ID = :id")
    .param("id", 1)
    .get { Account(it.getInt("ID"), it.getString("NAME"), it.getString("ACCOUNT_TYPE")) }
    .forEach { println(it) }

//Manual management of connection
val connection = db.getConnection()

connection.select("SELECT * FROM ACCOUNT WHERE ID = :id")
    .param("id", 1)
    .get { Account(it.getInt("ID"), it.getString("NAME"), it.getString("ACCOUNT_TYPE")) }
    .forEach { println(it) }

edvin commented 8 years ago

Yeah, that should be good for datasource vs connection :)

thomasnield commented 8 years ago

Okay, I think I'll scratch the Database class and use DataSource instead.

edvin commented 8 years ago

An interesting problem that arises from conditional query parts is "where to put the where keyword". To solve that, the framework could insert where and whitespace the first time any of statements inside where {} generates output:

/**
* Search for a person with optional minimum age and gender parameters
**/
fun searchPerson(minAge: Int? = null, gender: Gender? = null) = datasource.with {
    select("* from person")
    where {
        test(minAge != null) {
            "age > #{minAge}"
        }
        test(gender != null) {
            "gender = #{gender}"
        }
    }
    to(Person::class)
    toList()
}

This is crucial, because it's really messy to place the where keyword manually. Think about the ugly logic you'd have to write to keep track of this manually :)

thomasnield commented 8 years ago

That is true, okay I see how the test() works now. I misunderstood. I think this actually could be helpful when composing longer SQL statements, and having control over the conditional composition of it.

edvin commented 8 years ago

Exactly :) OK, gotta get to bed now, let's brainstorm more tomorrow, OK? :)

edvin commented 8 years ago

Alternative syntax:

select("* from person")
where {
    test(minAge != null, "age > #{minAge}")
    test(gender != null, "gender = #{gender}")
}

We could easily support both.

thomasnield commented 8 years ago

Yeah I'm going to go for a run. Takeaway is I think we have some great ideas and we can support multiple usage patterns, from functional Sequences to Groovy builders. I think we should continue brainstorming and figure out which paradigms we should support, and how well they will get along with each other.

thomasnield commented 8 years ago

Thanks @edvin have a good night! Maybe we should create a Slack channel for this later since I'm guessing we will have many involved discussions.

edvin commented 8 years ago

Yeah, and then maybe write down some examples for all the use cases and then study it a bit.

Have a great run :)

thomasnield commented 8 years ago

Thanks! Have a good night.

edvin / kdbc