golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: database/sql: add methods to scan an entire row into one value #61637

Open jba opened 1 year ago

jba commented 1 year ago

Edited: struct field names are matched to columns case-sensitively. Edited: untouched storage is ignored rather than zeroed.


I propose adding the method

ScanRow(dest any) error

to the Row and Rows types. ScanRow attempts to populate dest with all the columns of the current row. dest can be a pointer to an array, slice, map or struct.

Motivation

ScanRow makes it more convenient to populate a struct from a row. Evidence that this is a desirable feature comes from the github.com/jmoiron/sqlx package, which has 9,800 importers, about 1,000 forks and about 14,000 GitHub stars. (To be fair, sqlx provides several other enhancements to database/sql as well.) The ScanRow method brings database/sql closer to parity with encoding/json, whose ability to unmarshal into structs and other data types is widely used.

Another motivation comes from the likely addition of iterators to the language. Without something like ScanRow, an iterator over a DB query would have to return a Row or Rows, since the Scan method is variadic. An iterator like that still improves on using a Rows directly because it makes error-handling more explicit and always calls Close. But we could do better with ScanRow, because the iterator could deliver a single value holding the entire row:

type Product struct {
    Name     string
    Quantity int
}

func processProducts() {
    for p, err := range sql.Query[Product](ctx, db, "SELECT * FROM products") {
       if err != nil {...}
       // use p
    }
}

This proposal doesn't include that iterator; I show it merely for motivation.

Details

ScanRow acts as if each part of its argument were passed to Scan.

If the value is an array pointer, successive array elements are scanned from the corresponding columns. Excess columns are dropped. Excess array elements are left untouched.

If the value is a slice pointer, the slice is resized to the number of columns and slice elements are scanned from the corresponding columns.
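The array and slice rules above can be modeled with plain assignment. The helpers scanIntoArray and scanIntoSlice below are hypothetical, not part of this proposal: a toy sketch where the real ScanRow would route each element through Scan's conversion rules instead of assigning directly.

```go
package main

import "fmt"

// scanIntoArray sketches the array-pointer rule: elements are filled from
// the corresponding columns, excess columns are dropped, and excess array
// elements are left untouched. (Toy model: direct assignment instead of
// Scan's per-element conversions.)
func scanIntoArray(dest []any, cols []any) {
	n := len(dest)
	if len(cols) < n {
		n = len(cols)
	}
	copy(dest[:n], cols[:n])
}

// scanIntoSlice sketches the slice-pointer rule: the slice is resized to
// the number of columns, then each element is scanned from its column.
func scanIntoSlice(dest *[]any, cols []any) {
	*dest = make([]any, len(cols))
	copy(*dest, cols)
}

func main() {
	row := []any{int64(1), "gopher", 3.14} // values as a driver might return them

	var arr [2]any // fewer elements than columns: excess columns dropped
	scanIntoArray(arr[:], row)
	fmt.Println(arr)

	var s []any // resized to match the number of columns
	scanIntoSlice(&s, row)
	fmt.Println(len(s), s)
}
```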

If the value is a map pointer, the underlying key type must be string. For each column, a map entry is created whose key is the column name and whose value is scanned from the column. Unnamed columns are assigned unique keys of the form _%d for integers 0, 1, .... Other entries in the map are left untouched.
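The map rule can be sketched similarly. scanIntoMap below is hypothetical; it assumes sequential numbering of unnamed columns (the exact indexing of the _%d keys is not pinned down above) and assigns values directly rather than through Scan.

```go
package main

import "fmt"

// scanIntoMap sketches the map-pointer rule: one entry per column, keyed by
// the column name, with unnamed columns getting synthetic keys _0, _1, ....
// Entries for other keys already in the map are left untouched.
func scanIntoMap(dest map[string]any, names []string, vals []any) {
	unnamed := 0
	for i, name := range names {
		if name == "" {
			name = fmt.Sprintf("_%d", unnamed)
			unnamed++
		}
		dest[name] = vals[i]
	}
}

func main() {
	m := map[string]any{"keep": true} // pre-existing entry: should survive
	scanIntoMap(m,
		[]string{"id", "", "name"},
		[]any{int64(7), 1.5, "gopher"})
	fmt.Println(m["id"], m["_0"], m["name"], m["keep"])
}
```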

If the value is a struct pointer, its exported visible fields are matched by name to the column names and the field values are scanned from the corresponding columns. The name matching is done as follows:

  1. If the field has a struct tag with key "sql", its value is the column name.
  2. Otherwise, the column name is matched to the field name case-sensitively.

Unassigned columns are dropped. Unassigned fields are left untouched.
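The struct-matching rules can be sketched with reflection. scanIntoStruct below is hypothetical and deliberately cut down: it assigns values directly instead of applying Scan's conversions, and it ignores embedded fields; it exists only to illustrate the tag-then-case-sensitive-name matching and the "unassigned fields are left untouched" behavior.

```go
package main

import (
	"fmt"
	"reflect"
)

// scanIntoStruct sketches the struct-pointer rules: an `sql` struct tag
// wins; otherwise the exported field name must equal the column name
// case-sensitively. Unassigned columns are dropped and unassigned fields
// are left untouched.
func scanIntoStruct(dest any, names []string, vals []any) {
	v := reflect.ValueOf(dest).Elem()
	t := v.Type()
	// Index this row's values by column name.
	byName := make(map[string]any, len(names))
	for i, n := range names {
		byName[n] = vals[i]
	}
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue
		}
		name := f.Name
		if tag, ok := f.Tag.Lookup("sql"); ok {
			name = tag // rule 1: the tag value is the column name
		}
		if val, ok := byName[name]; ok {
			v.Field(i).Set(reflect.ValueOf(val)) // toy model: no conversions
		}
	}
}

type Product struct {
	Name     string
	Quantity int    `sql:"qty"`
	Note     string // no matching column: left untouched
}

func main() {
	p := Product{Note: "keep"}
	scanIntoStruct(&p,
		[]string{"Name", "qty", "extra"},
		[]any{"widget", 3, "dropped"})
	fmt.Println(p.Name, p.Quantity, p.Note)
}
```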

dsnet commented 1 year ago

My apologies for pinning this. I'm not sure how I did that.

benhoyt commented 9 months ago

For what it's worth, here is a cut-down version of ScanRow implemented as a function that takes an *sql.Rows. It only handles dest as a struct, and it doesn't support embedded fields (probably not too hard to add). Even so, it has already proved useful for me and saved a bunch of error-prone boilerplate, for example on a query with 28 fields.

flibustenet commented 8 months ago

@jba should we wake this proposal up for 1.23?

jba commented 8 months ago

Sure.

benhoyt commented 6 months ago

Could this proposal be added to the list of active proposals for review? It's well thought out and has lots of upvotes.

flibustenet commented 5 months ago

Link to "what would a Go 2 version of database/sql look like?" #22697

achille-roussel commented 5 months ago

@jba the iterator-based model you showed as an example looks really similar to what I put together in this package a few months ago: https://github.com/achille-roussel/sqlrange

In particular, the sqlrange.Scan function seems to combine what you are presenting with sql.ScanRow and iterators. The range function form addresses a lot of the performance concerns because the initialization cost can be amortized over all the rows being scanned. The added safety you mentioned of automatically closing the *sql.Rows is also a net benefit in my opinion.

I'm just sharing here because it seemed relevant to the discussion.

Thanks for driving this proposal!