Proposal: Unified SQL package and move drivers into nimble packages

theduke commented 8 years ago

I hereby propose a new sql package that provides a standardized access to varying SQL drivers.

The sql package would contain a Db type that is used for creating a database connection and executing queries. This type is driver independent. Different drivers register themselves to the sql package on import.

All drivers currently in the core (sqlite3, postgres, mysql, odbcsql) are moved out and into custom nimble packages.

The Db object uses the driver under the hood to create connections, execute the queries, and manages prepared statements. The Db object will, optionally, maintain a connection pool that allows concurrent/threaded code to painlessly work with the database. A pool of connections is maintained, and connections are re-used.

If the driver is capable of prepared statements, the Db object also maintains a mapping of which connections have which prepared statement available, and uses an appropriate connection if available, waits for one, or prepares the statement on a free/new connection.

The sql package provides two base types: Driver and Connection, which have base methods that must be implemented by a driver.

Some rough code to demonstrate. This is nothing too well thought out, just how I imagine it to be.

Connection

type Connection = ref object of RootObj
  discard

# Following methods must be implemented by the drivers.

method lastUsed(c: Connection)): float {.base.} =
  # Reports the time the connection was actively used, in epochTime.
  assert false, "not implemented.."

method isAlive(c: Connection)): bool {.base.} =
  # Checks the connection health.
  assert false, "not implemented"

method keepAlive(c: Connection)) {.base.}
  # Informs the connection that it should maintain it's open state.
  # The Db will trigger this regularily for connections in the pool that have not been used for a while.
  # This is only needed for drivers who do not maintain an active connection state themselves.
  # If they do, they can just provide an empty method.  

method close(c: Connection)) {.base.} =
  # Closes the connection
  assert false ...

# SQL methods.

method exec(c: Connection, statement: SqlStatement, args: varargs[string, `$`], prepared: bool = false) {.base.} =
  # Execute a sql statement. 
  # The prepared flag specifies wheter a prepared statement should be setupped / used.
  assert false,...

method insert(c: Connection, statement: SqlStatement, args: varargs[string, `$`], preparde: bool = false): SqlRow =
  # Executes an insert statement, and returns associated data.
  # For drivers which support lastInsertID, a single column with the insert id is returned.
  # For drivers like postgres, if a returning clause is present, it's columns will be returned.
  # A driver like postgres might decide to automatically add such a returning statement, but this is optional.
  ...

iterator query(c: Connection, statement: SqlStatement, args: varargs[string, `$`], prepared: bool = false): SqlRow =
  assert false,...

method queryRows(c: Connection, ..., prepared: bool = false): seq[SqlRow] =
 ...

Driver

type Driver = ref object of RootObj
  discard

method openConnection(options: DriverOptions): Connection =
  # Opens a new connection.
  ...

Db

# Global driver registry.
var drivers: Table[string, Driver]

type Db = ref object
  maxConnections: int # Maximum connections that may be opened.
  initialConnections: int # Amount of initial connections to create on open, defaults to 1.  

  connectionIdCounter: int
  connections: Table[int, Connection]
  activeConnections: Table[int, Connection]
  ...
  ...

proc registerDriver(name: string, driver: Driver, capabilities: DriverCapabilities) =
  # Registers a driver with the sql package.
  # The capabilites object informs sql pkg about driver capabilities.

proc open(driver: string, driverOptions: varargs[string]): Db =
  # Create a new Db object, initialize it, 
  ...

proc getConnection(db: Db): Connection =
  # Retrieve an available connection from the pool.
  # If none are available, open a new connection (unless maximum connections are reached).
  ...

proc releaseConnection(db: Db, c: Connection) =
  # Release a connection back to the pool.

proc close(db: Db) =
  # Close all connections in the pool and do a cleanup.

# All SQL methods from the connection are present on the Db object.

proc exec(db: Db, statement: SqlStatement, args: varargs[string, `$`], prepared: bool = true) =
  var con: Connection
  if prepared:
    con = db.getPreparedConnection(statement)
  else:
    con = db.getConnection()
  try:
    con.exec(stmt, args, prepared)
  except:
    db.releaseConnection(con)
    raise
  db.releaseConnection(con)

...
...
...

Let me know what you think!

Araq commented 8 years ago

The current db-* modules are not perfect, but what they get right is that they don't use subtyping and runtime polymorphism where it's not required. Your design throws away this advantage for no good reason. Stop thinking in Java please. ;-)

Varriount commented 8 years ago

What if one wants to support multiple database types at runtime?

theduke commented 8 years ago

@Araq

What they get wrong though, is that you can't switch between drivers at all without re-writing a lot of your code. Having similar interfaces for the drivers is pretty pointless when you can't switch between them.

For example, testing with sqllite, production with postgres. Which might be a bad example because it's not a good idea, but the general assumption holds. Or starting a project with SQLite, and if it gets big enough, switching to a proper database.

I've basically already written a light version of this for an ORM I'm working on. And the current design makes this really much more awkward than it needs to be.

@Varriount

That wouldn be a problem, one could just instantiate multiple Db objects for different drivers.

Also, the drivers could/should be written in a way that they can still be used standalone. So basically just the current versions of db_postgres etc.

Just with the added feature of using them wrapped with the SQL package if one cares about being able to switch the backend.

GO for example has a very similiar SQL package, and it's the defacto standard for using SQL with Go.

dom96 commented 8 years ago

If this is how you want the database modules to work then by all means go ahead and implement it that way. But for now as a Nimble package.

The standard library modules are here to stay for the foreseeable future, if you can create a Nimble package that is superior then we may adopt it into the standard library.

Araq commented 8 years ago

What they get wrong though, is that you can't switch between drivers at all without re-writing a lot of your code. Having similar interfaces for the drivers is pretty pointless when you can't switch between them.

It's not similar, it is supposed to be identical. And we ported the forum from postgres to sqlite with no issues and without code rewrites, so I wonder why you had different experiences.

jlp765 commented 8 years ago

@theduke standardized access to varying SQL drivers. As @Araq has said, the modules provide a consistent interface. It is true that the db connection identifier returned from open() is not common, but unique to each database. However, once you have that identifier, you use that dbConn to connect to a consistent set of procs. Long way of saying Why does the dbConn need to be unified?

Because all the db_modules expose the same procs, you just change the import module, tweak your queries, change the database/username/password for open(), and it works to a new database type.

@Varriount
What if one wants to support multiple database types at runtime? You can currently connect to two different database types at the same time.

import db_sqlite, db_mysql

var
  m = db_mysqlite.open(....)
  s = db_sqlite.open(.....)

m.exec(....)
s.exec(....)

PR #3614

This PR combines the db_sqlite, db_mysql and db_postres into one module. You call the initDb() with a database type to get your unique id (for mapping functions to the correct db_xxx database module), and then the procs are the same as the underlying db_xxxx modules. Maybe this accomplishes the same thing??

@theDuke
maintain a connection pool that allows concurrent/threaded code to painlessly work with the database. I don't know how the current db_xxxx modules work with the Nim thread/spawn/... Is this a real issue currently?

One improvement ?

Provide a base module or import file that defines the common things like exception classes, tags, etc, so that EDb exception is globally defined, rather than having db_sqlite.EDb different to db_mysql.EDb, and likewise for tags

dom96 commented 6 years ago

I suppose this never panned out. Looks like the OP went to the dark side. Closing in any case, if you're still interested in creating this then please go for it and post a link here to your awesome package :)

theduke commented 6 years ago

@dom96 haha I guess I went to the dark side, namely Rust. ;)

dom96 commented 6 years ago

indeed. You're always welcome back in the light side ;)

nim-lang / Nim