golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.92k stars 17.52k forks source link

database/sql: add option to customize Begin statement #19981

Open zombiezen opened 7 years ago

zombiezen commented 7 years ago

Currently, the *DB.BeginTx method looks like this:

func (db *DB) BeginTx(ctx context.Context, opts *TxOptions) (*Tx, error)

type TxOptions struct {
    // Isolation is the transaction isolation level.
    // If zero, the driver or database's default level is used.
    Isolation IsolationLevel
    ReadOnly  bool
}

However, there are transaction options across drivers that don't fit into these fields. For instance:

It would be nice to have another option in TxOptions that allows you to change which statement is used. As an idea:

type TxOptions struct {
    // Isolation is the transaction isolation level.
    // If zero, the driver or database's default level is used.
    Isolation IsolationLevel
    ReadOnly  bool

    // Modifiers is a partial SQL statement that specifies database-specific transaction options.
    // The database will prefix "BEGIN " or "START TRANSACTION " to form a full SQL statement, whichever is appropriate.
    // The string may or may not have a trailing semicolon.
    // If a database does not support transaction options, then Modifiers is ignored.
    // If not empty, Isolation and ReadOnly are ignored.
    Modifiers string
}

While this approach is convenient and ensures that the statement is a BEGIN or START TRANSACTION statement, it's not particularly performant, as it requires the statement to be prepared on each BeginTx. However, you could imagine adding a method to DB like:

// PrepareBegin prepares a SQL statement to begin a transaction.
// modifiers has the same meaning as in TxOptions.
func (db *DB) PrepareBegin(ctx context.Context, modifiers string) (*Stmt, error)

type TxOptions struct {
    // ...

    // Stmt is used to start the transaction if non-nil.
    // It must be the result of a call to PrepareBegin.
    // If non-nil, all other fields are ignored.
    Stmt *Stmt
}
kardianos commented 7 years ago

@zombiezen I agree with your basic premise that the TxOptions could have more fields. In some of the initial debate going into this it did (or could be extended by each sql driver). I'm not fond of the modifier string approach and if we did allow drivers to customize it, I would probably lean to more of a key value pair where the driver had setters and getters for those values.

As far as addding Stmt to the TxOptions, am I wrong to think you could use Tx.StmtContext? I'm not really happy with how Stmt works in the sql package today. It adds huge complication to the code and ends up having a sub-pool of connections it can use. If you really want a high performance loop, it would seem better to pull a Conn from the DB, prepare a stmt on that, then just exec that in a Tx repeatedly. Unless I'm missing the point.

If you created a strongly typed values for TxOptions for each of these cases, what would TxOptions look like? Not that we would go that route, but It might layout some options.

zombiezen commented 7 years ago

Tx.StmtContext would be too late: the transaction would already be created by that point. You don't specifically need to introduce a function, but it is a way of ensuring that the statement created is one that is intended to be used for a BEGIN, since DB.PrepareContext could be used to prepare an arbitrary statement.

Here's what it would look like with just more options:

type TxOptions struct {
  // ...

  // SQLiteLock is an enum specifying deferred (default), immediate, or exclusive.
  SQLiteLock SQLiteLock

  // PostgresDeferrable is set to true to make a DEFERRABLE transaction.
  PostgresDeferrable PostgresDeferrableOption

  // MySQLConsistentSnapshot is set to true to make a WITH CONSISTENT SNAPSHOT transaction.
  MySQLConsistentSnapshot bool
}

type SQLiteLock int

// SQLite transaction locks.
const (
  SQLiteDeferred SQLiteLock = iota
  SQLiteImmediate
  SQLiteExclusive
)

type PostgresDeferrableOption int

// PostgreSQL deferrable transaction options.
const (
  PostgresDefaultDeferrable PostgresDeferState = 0
  PostgresDeferrable PostgresDeferState = 1
  PostgresNotDeferrable PostgresDeferState = -1
)
kardianos commented 7 years ago

@zombiezen Regarding Stmt issue, would the following be correct? When a Stmt is provided to TxOptions (1) The list of connections that the Stmt is already prepared on would be preferred over a connection where Stmt was not prepared on and (2) that if no available connection had the Stmt prepared on it then it would get any available connection, prepare the Stmt on it, then begin a transaction.

What if TxOptions had the following two methods:

package sql

// WithLock stores the value under the key value.
func (opts *TxOptions) WithValue (key interface{}, value interface{})

// Value returns the value stored under key.
// If the key is not found nil is returned.
func (opts *TxOptions) Value(key interface{}) interface{}

package sqlite

type lockKey struct{}

type Lock int

const (
  LockDeferred Lock = iota
  LockImmediate
  LockExclusive
)

func WithLock(opts *sql.TxOptions, lock Lock) {
    opts.WithValue(lockKey, lock)
}

func LockValue(opts *sql.TxOptions) Lock {
    v, ok := opts.Value(lockKey).(Lock)
    if !ok { return LockDeferred }
    return v
}

package usercode

opts := &TxOptions{ReadOnly: true}
sqlite.WithLock(opts, sqlite.LockExclusive)

db.BeginTx(ctx, opts)
zombiezen commented 7 years ago

When a Stmt is provided to TxOptions (1) The list of connections that the Stmt is already prepared on would be preferred over a connection where Stmt was not prepared on and (2) that if no available connection had the Stmt prepared on it then it would get any available connection, prepare the Stmt on it, then begin a transaction.

That sounds reasonable.

What if TxOptions had the following two methods:

That could work, although AFAICT it does push more work and API surface onto database driver authors. Having it go through the statement mechanism is an operation that would work for every database driver I'm aware of, and would avoid the untyped bag of driver-specific Tx values.

kardianos commented 7 years ago

I'm not sure I understand how providing a Stmt would solve the driver specific TxOption values?

zombiezen commented 7 years ago

Ah, now I understand where we're talking past each other.

The assumption I'm making is that these database-specific transaction options (and potential future ones) can all be described in terms of SQL statements, nothing out-of-band sent on the wire. I'm asserting that ReadOnly and Isolation are hints that the application (via database/sql) sends to the specific database driver as to how to construct the statement that begins a transaction. I am then concluding that instead of trying to come up with an API that gives driver-specific ways of constructing the statement, that an alternative with less moving parts would be to allow specifying the begin transaction statement in SQL syntax.

kardianos commented 7 years ago

Ah. I'm not convinced that would work for all drivers. Thinking of sql server and Oracle in perticular.

On Mon, Sep 11, 2017, 13:07 Ross Light notifications@github.com wrote:

Ah, now I understand where we're talking past each other.

The assumption I'm making is that these database-specific transaction options (and potential future ones) can all be described in terms of SQL statements, nothing out-of-band sent on the wire. I'm asserting that ReadOnly and Isolation are hints that the application (via database/sql) sends to the specific database driver as to how to construct the statement that begins a transaction. I am then concluding that instead of trying to come up with an API that gives driver-specific ways of constructing the statement, that an alternative with less moving parts would be to allow specifying the begin transaction statement in SQL syntax.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/golang/go/issues/19981#issuecomment-328643180, or mute the thread https://github.com/notifications/unsubscribe-auth/AAuFsWbJDbrNf56i5I6EiLdpusFmRCfQks5shZMXgaJpZM4M9_m9 .

zombiezen commented 6 years ago

@kardianos Can you point me to some docs on how those work? SQL Server looks like it has some options that could be set using normal SQL syntax, and I'd imagine the Stmt option could be ignored on Oracle.

kardianos commented 6 years ago

The stmt option wouldn't work for any protocol or implementation that relied on setting some binary (non-sql text) field or implementation that worked through a higher level API.

But put this another way: couldn't you think of the Stmt that you are thinking of as a driver defined option that that could get passed to the driver easily? like you suggested before, it could be a single string. Also, if you have multiple different options this would explode out. Let's say a driver has two independent options, it would need to expose three different statements. If any of the options took names, or identities you would end up with Stmt builders in the driver.

I'm really not liking the Stmt in TxOptions.

I would spring for a DriverOptions string parameter that can be set manually or through a driver builder before doing a Stmt I think.

kardianos commented 6 years ago

@zombiezen Ross, what's the application motivation for requesting these Tx options?

zombiezen commented 6 years ago

The SQLite case is the motivation, since the file-locking behavior can only be changed when starting the transaction.

kardianos commented 6 years ago

I would be okay with the WithValue example in https://github.com/golang/go/issues/19981#issuecomment-328284833 . It would have very little API addition, it would be easy for drivers to use, and the pattern would be familiar based on context.

Of course, the other option to implement this would be to have the drivers expect the values to come through the context value bag. That is what the vitesse database/sql driver does today. Would that be too much of a hack for SQLite?

zombiezen commented 6 years ago

Yes, an untyped bag would work for this case (although I have qualms, since this will be covered by the Go 1 compatibility promise and I worry a bit about API evolution). If we do go this route, I think adding it to TxOptions is better than a Context Value, since a Context Value could be propagated farther than intended.

kardianos commented 6 years ago

@tgulacsi @mattn @mjibson @jackc I'd like some feedback on this proposal of adding driver specific options to TxOptions.

I don't care for Ross' proposal to add a Stmt to TxOptions that would get run to start a Tx. I could maybe see adding some type of bag of values, similar in API to the context value. When we start looking at driver specific transaction options, there are named transactions. I'd also like to be able to nicely handle Tx Options that might themselves be wrapping another API and as such might not issue BEGIN TRAN statements themselves.

Even if it isn't nice, I'd like to consider using Ctx as an existing escape hatch for passing values. What is the frequency an application will end up needing this?

This may be even worse, but we could add a Driver interface{} field to TxOptions and let the driver define any misc options they want to set.

tgulacsi commented 6 years ago

I don't know any such requirement in Oracle - transaction isolation mode can be set anytime, and no other knobs exists for SET TRANSACTION. (Maybe the one ROLLBACK SEGMENT, but that can be specified with "SET TRANSACTION ROLLBACK SEGMENT" - the programmer knows what she wants, doesn't she?

The Context allows passing ANYTHING to the underlying driver - so a driver-specific option can be passed, too.

jackc commented 6 years ago

All transaction options in PostgreSQL can be set with SET TRANSACTION so this is not a requirement for PostgreSQL. That said, it is convenient and I expose that functionality in pgx's native BeginEx. I think I prefer adding a driver specific field to TxOptions over using Context.

kardianos commented 6 years ago

@zombiezen I don't approve of the embedded Stmt in TxOptions from a overall SQL perspective, but about any other option I'm fine with. After that if and what is included is more of an overall Go API decision.

@ianlancetaylor Ian, do you or someone else on the Go core team have an opinion about a possible API addition to TxOptions for driver specific transaction options? The options that I'm aware of on the table:

I don't personally mind (ab)using context to store a custom value. I understand other will be highly adverse to it. To me DriverOptions interface{} seems reasonable and flexible, it just doesn't say much. DriverOptions string seems like asking for trouble, similar to how Open(dsn string) has been a bit too inflexible, and would require some type of parsing or verification probably (unless I'm missing something). Adding get/set value on TxOptions seems really redundant to context value bag. Adding *sql.Stmt assumes that starting a Tx will be a textual command, when it may not (esp if driver wraps another (C) API and blocks out protocols that have protocol level Tx options, though maybe that is too exotic.

Advice on an API esp in Go std lib would be appreciated.

ianlancetaylor commented 6 years ago

I haven't thought much about it. It's an interesting question. The whole point of the database/sql package is to provide a generic API that works with a bunch of different databases. But now @zombiezen is saying that the generic API is insufficient. For most cases we in effect hide database differences because the program has to pass an explicit SQL string, and we don't try to generate that for them. But for starting a transaction we do generate "BEGIN" and let the driver build options. So one approach is to let the user tell us the exact database-specific command to generate to start a transaction. That is kind of what @zombiezen was initially suggesting, but I'm suggesting a bit further: add a new method BeginCmd(ctx context.Context, cmd string) so that the user has to explicitly say db.BeginCmd(ctx, "BEGIN TRANSACTION WITH MUSTARD") or whatever works for their specific database. Or since this is moderately unusual perhaps we should provide only BeginStmt(ctx context.Context, stmt *Stmt).

You suggest that this isn't great because starting a transaction is not always a text command, but that's OK because even it's not the driver could still interpret additional options out of the string according to some driver-documented mechanism.

Normally I wouldn't suggest a string based approach, but this package is already based around a string-based approach: that's inherent in the Query method.

But there is a lot I don't know about this.

kardianos commented 6 years ago

@ianlancetaylor Thank you for the feedback; I'll need to think about it for a bit.

One correction however is that the sql package doesn't generate any "BEGIN" at all (the exact accepted text phrase is different from DB to DB), *sql.DB calls the driver.Conn.Begin method and lets the driver do whatever it needs to do.

kardianos commented 6 years ago

@zombiezen I'd be okay with adding an DriverOptions interface{} field to TxOptions and let the driver specify something special. Would that work for you? I could get a CL for that.

zombiezen commented 6 years ago

I would favor @ianlancetaylor's approach, because options are usually given as SQL anyway, and the fallback for a database that doesn't support these would just be to ignore the string or parse its own format out of it. This has the advantage that if you move your code from one SQL driver to another (assuming the same underlying database) that it is likely to be interpreted the same.

(The solution you propose works, but I'd rather wait for the right approach than try to rush a wrong one, since it's the standard library.)

kardianos commented 6 years ago

@zombiezen I don't see how @ianlancetaylor approach would work. The sql package isn't generating the BEGIN statement in the first place and would be pointless in a driver that issued commands to an external API as a wrapper. But feedback appreciated.

EDIT: I guess having a string options would be workable. It would either need to be a BEGIN type statement the driver could run, or it could be a set of string options. I don't really buy the portability PRO though, they may have different ways of interpreting it. Also, if we do ever add nested transactions as an API, those would need to be emulated with savepoints and rollback tos in many databases, so a direct execution of BEGIN might be a bad thing.