gocraft / dbr

Additions to Go's database/sql for super fast performance and convenience.
MIT License
1.84k stars 208 forks source link

Postgres support #5

Closed joeshaw closed 8 years ago

joeshaw commented 9 years ago

dbr looks very interesting, and I'd love if it supported Postgres.

My general plan was to run the test suite with DBR_TEST_DRIVER="postgres" and add the github.com/lib/pq import to dbr_test.go and work through until the tests pass. Do you feel that's a good approach? How comprehensive do you feel the tests are?

Right off the bat I am hitting SQL syntax problems in the installFixtures() function in dbr_test.go. Looking through the codebase it looks like it was mostly designed with a single driver in mind. How would you like to split out the driver-specific parts of the codebase?

joeshaw commented 9 years ago

Quick and dirty first pass (mostly just quote fixes) is in https://github.com/joeshaw/dbr/tree/postgres. At this point it's just a search and replace -- doesn't handle the differences between mysql and postgres quoting. Waiting for guidance on how best to do it.

Postgres doesn't support LastInsertId -- you have to add something like RETURNING id and use Query rather than Exec to extract it, so that'll require some additional tweaking of the tests.

joeshaw commented 9 years ago

Hmm, RETURNING might be a bigger chunk of work. You can use it for any of INSERT, UPDATE or DELETE in Postgres and in my own apps I do.

I wonder if moving some of the logic from select_load.go and select_return.go to a more generic interface that can be implemented by the InsertBuilder, UpdateBuilder, and DeleteBuilder makes sense.

joeshaw commented 9 years ago

Ok, my branch now passes the tests on Postgres, but still lots of grossness in there. The hacks around testing RETURNING are pretty gross and will need a nicer cross-driver solution.

tyler-smith commented 9 years ago

Awesome, glad somebody's putting some effort into getting postgres support working!

I'll try and check this out this weekend. Yeah we'll definitely need to make tests more driver-neutral to be friendly to a wide array of developers.

I see what you mean with regards to RETURNING and select_load.go. I'll have to take a look into how we could make non-SELECT statements support data loading.

GeertJohan commented 9 years ago

:+1: I would definitely like to see Postgres support here!

mgutz commented 9 years ago

You guys might want to look at my postgres fork of this project. To handle RETURNING use QueryScalar or QueryStruct which were previously named LoadValue and LoadStruct. See insert tests

I replaced ? placeholders with ordinal placeholders, ie $1, $2 ... $n, which are better for reducing duplicate args in non-trivial queries and debugging.

My intention was to do a PR but I'm doing heavy refactoring to better separate concerns between builder, runner and logger (EventReceiver).

pkieltyka commented 9 years ago

+1

mgutz commented 9 years ago

I started with a fork and refactored heavily and the result is: https://github.com/mgutz/dat. It's fast.

pkieltyka commented 9 years ago

this is cool, but, why not fork dbr and submit back to it..? I feel it fragments the effort.. and is so typical of so many open source projects

pkieltyka commented 9 years ago

dbr or dat, they should be easy to support mysql and postgres .. and maybe sqlite3

tyler-smith commented 9 years ago

Hey guys, glad to see so much interest in adding postgres support. I know it's popular and there are so many people wanting to use dbr but can't and that sucks. I am working on fixing that!

I agree with @pkieltyka that fragmentation is not ideal. We want dbr to embrace and grow with the needs of the community instead of pushing people off toward yet another new package.

@mgutz The postgres support for dat looks cool. It seems the goals of dat and dbr are strongly aligned. I would love to speak about combining our efforts here. I'll shoot you an email if you don't mind so we can discuss a little more about your vision for dat.

dbr is definitely going to have postgres support, and we're committed to getting it out soon. If Mario wants to work together with us here that'd be awesome, but if not I will get a postgres development branch up in the next couple of days.

mgutz commented 9 years ago

In reality, dbr and dat are just query builders with some helpers to execute queries. I refactored dbr so much that it did not make sense to submit a pull request. Why? There isn't much use for the Session in dbr since it's just a passthrough to Connection. A session in dat is a transaction (due to database/sql pooling logic) and requires an AutoCommit or AutoRollback. Most users think of a session as a single connection. The one-off queries that were executed on Session in dbr are executed on Connection directly in dat.

Moreover, sqlx with interpolation is as fast dbr in my benchmarks and handles edge cases better (pointers). That is why I chose sqlx instead of continuing with the custom database/sql execer in dbr. It's probably because the dbr folks are focused with MySQL that they have not run into my issues. Dat may not fully interpolate if it is more advantageous to pass through arguments to the underlying driver.

Dat has clear separation of concerns and allows for different execers like sqlx, sql, pgx. The execer interface is not tied to database/sql at all. Exec, for example, does not return sql.Result. There is also Dialect interface in place for adding other databases just in case I or others want to add support for MySQL and SQLite.

pkieltyka commented 9 years ago

yea, I agree that dbr and dat are primarily just query builders. Maybe you guys should consider that, build the queries, then use github.com/jmoiron/sqlx under that. Sqlx is very robust and has been designed over a long period of time. The entire thing is < 2000 LOC and its main job I'd say is creating a good abstraction among mysql, postgres, sqlite3 with support of scanning types and mapping to structs.

..just offering my 2 cents as someone seeking for a proper data querying + mapping library. I appreciate the work you guys have put into dbr + dat.

pkieltyka commented 9 years ago

@mgutz ah nice, you are indeed using sqlx! very cool.

tyler-smith commented 9 years ago

Sessions in dbr combine a long-lived Connection with a short-lived, session-dependent, EventReceiver.

The reason this is so powerful is that logging/instrumentation that occurs in dbr can be scoped to a specific unit of execution like a web request or a background job.

If you use an EventReceiver compatible package like gocraft/health then all errors, events, and timings can be sent to Statsd, Bugsnag, etc by simply setting it in the health config. By using dbr's Sessions these will now be segmented by the particular task that was being executed which provides a ton of insight into whats actually going on in your application.

For example, you could have a web server that creates a Connection on startup, and for every request you create a health job (which is just an EventReceiver) and a dbr Session for it. Your health job can be tagged with all request-specific details. If an error or performance issue occur you will be able to instantly see who it happened to, where it the code it was, etc. Simply using a global Connection leaves no context as to why things are happening.

It is largely our fault for not illustrating this story a little better. We are going to work on getting health ready for others to understand and use. (It is mostly finished, we mainly just need documentation)

tyler-smith commented 9 years ago

Using sqlx to manage the data loading would make it much more difficult to get the same level of instrumentation that dbr currently affords, however it does no harm to look into how we could leverage what it already gives us.

Regardless of the solution dbr will be providing postgres support and that will almost definitely come in the form of a generic interface that would allow any arbitrary driver support.

pkieltyka commented 9 years ago

@tyler-smith I hear that. I am sure jmoiron (author of sqlx) would be open to adding an interface to plug in the instrumentation stuff.

mgutz commented 9 years ago

@tyler-smith Good point. I think session, more specifically the runner interface, just needs an ID() which returns a UUID. When I used to do enterprise scale C# apps, I would either use an ID in thread-local storage or a session ID to log units of work. Existing logging frameworks provide analytics if fed structured data.

I removed EventsReceiver in favor of structured logging to use a standard logging interface. The ideas are roughly the same

// runner can be connection, session or transaction interface
log.Info("updating payment", "groupId", runner.ID(), "duration", time.Since(startTime).Nanoseconds())

logxi, logrus and log15 provide structured logging

tyler-smith commented 9 years ago

@mgutz Looking at logxi, and they way that dat uses it, I think the biggest difference is that logxi seems to be primary for logging (i.e. a stream of diagnostic messages) as opposed to potential EventReceiver implementations which could more as an instrumentation package (like gocraft/health).

Using EventReceiver/health you easily send all your data to statsd where you could perform all sorts of queries on that data. For example you can easily find the 90th percentile of slow queries, or see how many times a given statement is executed each minute. That's because instead of "severity"-based logging methods the EventReciever methods have statistical semantics attached. Using a different EventReceiver method allows you to say what kind of data your streaming. Is it a simple count? A timer? An error? A gauge?

I would say think of EventReceiver and health more like New Relic style monitoring systems rather than a basic structured logging package. Of course, the reason for the EventReceiver abstraction is so that you can decide exactly how you want this to act so that you can fit it to your application's needs. A super basic web app is probably fine to use a simple structured logging system like logxi, but once it starts growing up I think a robust instrumentation package is called for. Having the EventReceiver interface allows the developer to control how he wants to handle it instead of forcing them to use a particular package.

But we shouldn't let this issue get any more off-topic. If you or anybody else would like to discuss differences between dbr and dat, or EventReceiver and logxi, feel free to contact me by email or open a separate issue.

mgutz commented 9 years ago

Six of one, half a dozen of the other. logxi is JSON underneath as is logrus and log15. Most of us understand log.Info(msg, "key1", value, "key2", value) which is the interface logxi and log15 implement. In short, it is structured like EventKv, EventErrKv, TimingKv. If a hint is really needed, then add a "kind" key to specify event, timing, etc.

Collectors can forward metrics based on simple rules. Let machines parse your logs

if entry.name == "dat:sqlx" and entry.duration > then it's timing
else if entry.name == "dat:sqlx" and entry.err != "" then it's an error event
else if entry.name == "dat:sqlx" then it's just an event

Sorry, I just had to get that in and will stop beating this issue.

pkieltyka commented 9 years ago

hmm.. logging metrics isn't even close instrumentation. Either way, it would be nice to have logging and instrumentation support (as separate parts..).

pkieltyka commented 9 years ago

@tyler-smith btw, for some ideas on other instrumentation interfaces, have a look at: https://github.com/heroku/instruments its pretty well designed IMO

taylorchu commented 9 years ago

dbr now has postgres branch. I separated query building from other parts of the library. Please try it out, and give some suggestions. Thank you.

noonien commented 8 years ago

postgres requires enclosing the elements in curly braces. dbr currently encloses array elements in parenthesis

taylorchu commented 8 years ago

go slice is translated to tuple like IN (1, 2, 3) instead of array.

noonien commented 8 years ago

I apologize, what I mean to say is that postgres requires array elements to be enclosed in curly braces. This is needed for array columns.

tyler-smith commented 8 years ago

@taylorchu Given that the main issue here, "Postgres support", is in master I'm going to close this issue. If you disagree feel free to re-open.

@noonien I have pulled your comment into a separate Github issue: https://github.com/gocraft/dbr/issues/53