Closed 0xjac closed 2 years ago
Great write up! Imports have been one of the more difficult things to get right. That and composite types.
I can repro. I’ll take a look in the next couple days.
As a side note, (tho this might be worth a separate issue if you think it's worth it)
Yep, please open another issue so it’s easy for folks to vote on. If you can, add some context on the use case where it’s helpful to have a query specific types. I place a high bar on additional options.
@jschaf thanks for the reply. I started writing a tool similar to pggen
a long time ago but never got around to have a working mvp. In any case imports are tricky and there is one more issue with them which I failed to showcase in my example and it is that of packages whose package name is not matching the end of the module import path. This happens mostly in two cases:
pgx
which is imported as github.com/jackc/pgx/v4
but the package name is still pgx
, not v4
.go-prefix
or a vanity URL, like github.com/pelletier/go-toml
which has as package name toml
or gopkg.in/yaml.v2
which is named yaml
I am not sure how to resolve the package name from the file, there might be some way finding the install path of the package on the local machine, finding the go files and parsing the package
line on the source files. But that's complicated...
Here is the alternative I came up with at the time, which is to specify both the import path and package names as separate values. Adapted to the way arguments are mapped in pggen
, it could look something like this:
package main
import (
"fmt"
"regexp"
)
const (
// Group 0 is the entire match (i.e. the entire string)
groupSQLType = 1 // Name of the SQL type as detected by pggen.
// Group 2 is the import group with the trailing comma to separate it from the package and type.
groupImport = 3 // Import line required by the mapped Go type.
groupGoType = 4 // Go type with, including the slice/array/pointer marker and package name.
// Group 5 is the slice/array/pointer marker.
groupPkgName = 6 // Go package name with a trailing dot.
// Group 7 is the type name without the package.
)
const (
// sqlTypeRE matches an SQL identifier as describe in
// https://www.postgresql.org/docs/8.1/sql-syntax.html#SQL-SYNTAX-IDENTIFIERS
sqlTypeRE = `^(?P<sqlType>[A-Za-z_][\w\$]*)`
// importRE matches an import spec which can be any unicode char
// See: https://golang.org/ref/spec#Import_declarations
importRE = `(?P<import>.+)`
// markerRE matches the "marker" of either:
// - a slice ([])
// - an array with a size as number literal ([<number>])
// - a pointer (*)
// - a slice of pointers ([]*)
// - an array with a size as number literal of pointers ([<number>]*).
markerRE = `(\[\d*\]\*|\[\d*\]|\*)?`
// pkgRE matches a Go pkg name as defined in https://golang.org/ref/spec#Package_clause.
pkgRE = `([\pL_][\pL\pN_]*\.)?`
// gType matches an optional slice and/or pointer marker, an optional pkg name, and a
goTypeRE = `(?P<gotype>` + markerRE + pkgRE + `([\pL_][\pL\pN_]*))`
)
var argRE = regexp.MustCompile(sqlTypeRE + `=(` + importRE + `,)?` + goTypeRE)
func main() {
args := []string{
// base type
// (backwards compatible with --go-type in pggen version 2021-09-07, commit 7f87888)
"text=string", // plain type
"text=*string", // pointer
"_text=[]string", // slice
"_text=[42]string", // array
"_text=[]*string", // slice of b pointer
"_text=[42]*string", // array of pointer
// type from stdlib
"timestamp=time,time.Time", // plain type
"timestamp=time,*time.Time", // pointer
"_timestamp=time,[]time.Time", // slice
"_timestamp=time,[42]time.Time", // array
"_timestamp=time,[]*time.Time", // slice of b pointer
"_timestamp=time,[42]*time.Time", // array of pointer
// local package type
"myclass=local/util/pkg,pkg.Class", // plain type
"myclass=local/util/pkg,*pkg.Class", // pointer
"_myclass=local/util/pkg,[]pkg.Class", // slice
"_myclass=local/util/pkg,[42]pkg.Class", // array
"_myclass=local/util/pkg,[]*pkg.Class", // slice of b pointer
"_myclass=local/util/pkg,[42]*pkg.Class", // array of pointer
// module type
"text=github.com/jschaf/pggen,pggen.Lang", // plain type
"text=github.com/jschaf/pggen,*pggen.Lang", // pointer
"_text=github.com/jschaf/pggen,[]pggen.Lang", // slice
"_text=github.com/jschaf/pggen,[42]pggen.Lang", // array
"_text=github.com/jschaf/pggen,[]*pggen.Lang", // slice of b pointer
"_text=github.com/jschaf/pggen,[42]*pggen.Lang", // array of pointer
// module type with version above 1
"row=github.com/jackc/pgx/v4,pgx.Row", // plain type
"row=github.com/jackc/pgx/v4,*pgx.Row", // pointer
"_row=github.com/jackc/pgx/v4,[]pgx.Row", // slice
"_row=github.com/jackc/pgx/v4,[42]pgx.Row", // array
"_row=github.com/jackc/pgx/v4,[]*pgx.Row", // slice of b pointer
"_row=github.com/jackc/pgx/v4,[42]*pgx.Row", // array of pointer
// module/pkg names mismatch
"tomlpos=github.com/pelletier/go-toml,toml.Position", // plain type
"tomlpos=github.com/pelletier/go-toml,*toml.Position", // pointer
"_tomlpos=github.com/pelletier/go-toml,[]toml.Position", // slice
"_tomlpos=github.com/pelletier/go-toml,[42]toml.Position", // array
"_tomlpos=github.com/pelletier/go-toml,[]*toml.Position", // slice of b pointer
"_tomlpos=github.com/pelletier/go-toml,[42]*toml.Position", // array of pointer
// type from stdlib
"timestamp=time,time.Time", // plain type
"timestamp=time,*time.Time", // pointer
"_timestamp=time,[]time.Time", // slice
"_timestamp=time,[42]time.Time", // array
"_timestamp=time,[]*time.Time", // slice of b pointer
"_timestamp=time,[42]*time.Time", // array of pointer
// invalid import for base type
// (backwards compatible with --go-type in pggen version 2021-09-07, commit 7f87888)
"text=strings,string", // plain type
"text=strings,*string", // pointer
"_text=strings,[]string", // slice
"_text=strings,[42]string", // array
"_text=strings,[]*string", // slice of b pointer
"_text=strings,[42]*string", // array of pointer
// missing import
// local package type
"myclass=pkg.Class", // plain type
"myclass=*pkg.Class", // pointer
"_myclass=[]pkg.Class", // slice
"_myclass=[42]pkg.Class", // array
"_myclass=[]*pkg.Class", // slice of b pointer
"_myclass=[42]*pkg.Class", // array of pointer
// module type
"text=pggen.Lang", // plain type
"text=*pggen.Lang", // pointer
"_text=[]pggen.Lang", // slice
"_text=[42]pggen.Lang", // array
"_text=[]*pggen.Lang", // slice of b pointer
"_text=[42]*pggen.Lang", // array of pointer
// module type with version above 1
"row=pgx.Row", // plain type
"row=*pgx.Row", // pointer
"_row=[]pgx.Row", // slice
"_row=[42]pgx.Row", // array
"_row=[]*pgx.Row", // slice of b pointer
"_row=[42]*pgx.Row", // array of pointer
// module/pkg names mismatch
"tomlpos=toml.Position", // plain type
"tomlpos=*toml.Position", // pointer
"_tomlpos=[]toml.Position", // slice
"_tomlpos=[42]toml.Position", // array
"_tomlpos=[]*toml.Position", // slice of b pointer
"_tomlpos=[42]*toml.Position", // array of pointer
}
var (
matches []string
sqlType, importLine, pkgName, goType string
)
for _, arg := range args {
matches = argRE.FindStringSubmatch(arg)
if len(matches) != 8 {
fmt.Printf("invalid go type mapping: %s\n", arg)
continue
}
sqlType = matches[groupSQLType]
importLine = matches[groupImport]
pkgName = matches[groupPkgName]
goType = matches[groupGoType]
if pkgName == "" && importLine != "" {
fmt.Printf("invalid import: %s for type %s (%s)\n", importLine, goType, arg)
continue
}
// Detect missing import statements for a given package.
if pkgName != "" && importLine == "" {
// Print [:len(pkgName)-1] of the package name since the last character is always the
// period (.) between the package name and value (if the package is present).
fmt.Printf("missing import for package: %s (%s)\n", pkgName[:len(pkgName)-1], arg)
continue
// Note, one alternative is to build a map of package name to import line.
// For any package name as the key, set the import line as the value.
// - If the package name is not in the mapping, add it, even if the import line is
// missing (empty string).
// - If the package name is already present with a missing import line,
// update the import line
// - If the package name is already present, but the import lines differ, return an
// error since there are two identical package names. (The only way to handle this
// would be to support specifying aliases for package names, potentially by modifying
// the regexp).
}
fmt.Printf("%s -> %s", sqlType, goType)
if importLine != "" {
fmt.Printf(" (import %s)", importLine)
}
fmt.Println()
}
}
This is a bit long because of the test cases, but the interesting bits are the regexp at the top and how it is used at the bottom.
The regexp builds on the existing format of pggen
, namely: <sql-type>=<go-type>
to support an optional import line and markers for the slice/array/pointers. It looks like <sql-type>=[<import>,]<go-type>
The go-type
is further split into:
[<marker>][<package>.]<declared-go-type>
. I wrote it this way because I did not managed to have optional groups without actually having groups, and to allow to extract the package
and check that if an import is specified, the type comes from a package and vice-versa.
Some drawbacks of this approach:
map
as explained in the snippet above. This remove the repetition but if the type mapping including the import is removed, the generation will/should fail.text
to int
, the query might actually fail at runtime!).The other solution I thought of at the time was to only specify the go type with the package and then provide a separate list of required packages to import. However I decided against it as I felt it was an inferior approach for the following reasons:
A bit of a mouthful, but since I actually spent some time on this, I hope it can help save you some time :wink:
I'll open a separate issue for the type mapping per query. And the suggest approach and regex is valid and can be reused for the type mapping per query parameter by reusing "(" + importRE + ",)?" + goTypeRE
:wink:
@jschaf another issue, specific to pggen, which I just noticed is that the file containing the Querier
interface must import all modules even those of queries in other files since the Querier
interface contains the signature of all queries.
Hopefully it should be as simple as going over the entire map and adding all the imports for that file, but it's something to keep in mind :wink:
@jschaf Thank you very much for the fix and for releasing it in 2022-05-22
. I'm currently updating the [AUR package]. However I noticed there are no assets (except the source code) in the the latest release.
Could I trouble you to release the binaries with the fix? Thanks a lot.
The action releaser is all screwed up but I released manually and it worked.
I saw that, thanks I'll test it tonight.
Mapping an SQL type to a pointer and/or a slice of a go type which must be imported results in invalid generated go code as
pggen
is confused by the slice and pointer markers ([]
and*
).Minimally reproducible example
pggen
VersionGen Command
Queries
Generated File
Expected
The two main issues are the miss-handling of the
[]
and*
markers and the failure to correctly map the return values for:"_timestamptz=[]time.Time"
(GetTimes
)"_interval=[]util/custom/times.Interval"
(GetCustomIntervals
)"_date=util/custom/[]times.Date"
(GetCustomDates
)As a side note, (tho this might be worth a separate issue if you think it's worth it) I used different SQL types for each queries as example since I cannot map a single SQL type to multiple Go types. It would be nice to be able to further specify a type mapping per query argument and for the return value. I.e. I can map overall
timestamp=time.Time
but for a query I could add a comment to the query to map a specific argument to a different type, like: