go-sql-driver / mysql

Go MySQL Driver is a MySQL driver for Go's (golang) database/sql package
https://pkg.go.dev/github.com/go-sql-driver/mysql
Mozilla Public License 2.0
14.56k stars 2.32k forks source link

[api design] Encoding values for LOAD DATA LOCAL INFILE #1416

Open Jille opened 1 year ago

Jille commented 1 year ago

Hi. I'm working on a feature for sqlc to do a bulk insert. sqlc generates a struct with the columns of the table, and I want to do a bulk insert of those.

Now my challenge is how to encode those values into a TSV as supported by https://dev.mysql.com/doc/refman/8.0/en/load-data.html#load-data-field-line-handling.

I wrote a small library for this (https://github.com/hexon/mysqltsv/blob/main/mysqltsv.go) but to correctly encode time.Time I need to know the Location from https://pkg.go.dev/github.com/go-sql-driver/mysql#Config.

What would be the cleanest way to support this?

I'd love to hear it if you any other suggestions.

methane commented 1 year ago

Most simple and quick way is just user pass the location to sqlc. User knows dsn or mysql.Config so that user knows the correct loc.

For future, my idea is:

// in mysql driver
type MySQLConn interface {
  private()  // Other module can not implement this interface. We can add more APIs in the future.

  Location() *time.Location  // Get Config.Location of this connection. (This may differ from `time_zone` connection variable.)
  LoadFromReader(r *io.Reader) error // Use this instead of RegisterReaderHandler.
}

// in user code.
var f io.Reader // some data.
err := conn.Raw(func (dc any) error {
  mc := dc.(MySQLConn)
  return mc.LoadFromReader(f)
})
Jille commented 1 year ago

I feel that would be a burden on the user. It requires them to plumb through the connection settings to every place that needs it.

I like that interface. (Though I'm not entirely sure how LoadFromReader would work. Would that run a LOAD DATA INFILE query itself?) (And it wouldn't be perfect for sqlc, because it would require implementing the Raw method for any middleware.

How do you feel about #1419? That'd allow me to encode correctly.

methane commented 1 year ago

I find my old idea:

https://github.com/go-sql-driver/mysql/pull/1060#issuecomment-585574043

methane commented 1 year ago

My old idea doesn't require new interface.

Jille commented 1 year ago

While I like the idea of avoiding a register+unregister and passing the data to the call, it does have a few downsides:

methane commented 1 year ago

I feel that would be a burden on the user. It requires them to plumb through the connection settings to every place that needs it.

I feel it is far better than #1419. #1419 add ugly API only for getting Config.Loc.

Jille commented 1 year ago
  • Config.Clone() is bit heavy. Why need to clone all config only for a *Location?

I don't mind changing that. With public libraries I tend to try to make them user-proof and make sure they can't break things by changing an internal value.

  • Config contains password. Why need to add surface to leak password only for a *Location?

Fair enough.

  • Many users (including me) just use only UTC. No need to pass location around.

Me too, sadly sqlc can't assume all users do that.

  • Some users using Location just need to pass it to bulk insert. It is not big burden.

    • If it is burden, user can store it in Context.

I'd totally do that in my own project, but it will break sqlc users that switch from simple INSERT INTOs to a LOAD DATA INFILE and suddenly correctly encoding time.Time's depends on a context variable.

I've updated #1419 with a new proposal, to only pass a struct with the *Location. (A struct so we can later extend it without breaking compatibility.) What do you think?

Alternatively, what do you think of adding an API that handles all the TSV encoding? That API would be really useful for other go-sql-driver/mysql users.

methane commented 1 year ago

I've updated #1419 with a new proposal, to only pass a struct with the *Location. (A struct so we can later extend it without breaking compatibility.) What do you think?

I still hate it. It increase public APIs only for your use case.

Alternatively, what do you think of adding an API that handles all the TSV encoding? That API would be really useful for other go-sql-driver/mysql users.

API compatible to pg or pgx is better than registry.

Jille commented 1 year ago

Sorry, I'm not sure what you mean with your last sentence. (Particularly, what do you mean with "registry"?)

methane commented 1 year ago

I meant RegisterXxx() APIs. I don't like them and avoid adding more such APIs.

Jille commented 1 year ago

Ok, I've thought of a way to make your proposed API work with sqlc, so I've sent a PR for that: https://github.com/go-sql-driver/mysql/pull/1454