veqryn opened 2 months ago
Right now the default for Jet is to generate postgres arrays as golang strings. You can override the generated types, but then you'd also have to implement your own scan/value functions too.
Check https://github.com/go-jet/jet/pull/380, this PR should add full array support.
Other scanning libraries such as scany and ksql provide an adapter that lets their library use both the standard library interface and also PGX directly.
I don't see how it is possible to have an adapter that converts pgx calls to standard database/sql calls. For instance, the standard database connection has this method:

QueryContext(ctx context.Context, query string, args ...interface{}) (*sql.Rows, error)

All sql.Rows fields are private and there is no public constructor, so no adapter is possible. The only way is to add a new set of methods just for pgx.
Also, note that pgx already provides the standard database interface: "github.com/jackc/pgx/v4/stdlib".
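For reference, a minimal sketch of that stdlib route (the connection string is a placeholder; v5's stdlib package is analogous): the package registers a database/sql driver named "pgx", and the resulting *sql.DB can be handed to jet as-is.

package main

import (
    "database/sql"
    "log"

    _ "github.com/jackc/pgx/v4/stdlib" // registers the "pgx" driver
)

func main() {
    db, err := sql.Open("pgx", "postgres://user:pass@localhost:5432/mydb")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // db is a plain *sql.DB, so jet's stmt.Query / stmt.QueryContext accept it.
}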
You'd have to look at scany and ksql to see how they do it. It might not be an adapter; it might be a copy of the implementation or something else.
At least for scany, they simply re-implement the whole scanning layer, so it's duplicated to some degree: they define their own Rows interface and then have one implementation backed by sql.Rows and another backed by pgx.Rows.
The interface itself is generic and has no reference to sql.Rows, which I don't believe is the case for Jet.
qrm.Queryable returns *sql.Rows, which would have to be modified to return a more generic object/iterator rather than sql.Rows.
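For reference, the driver-neutral rows abstraction scany uses looks roughly like this (paraphrased, so treat the exact method set as an approximation). *sql.Rows satisfies it directly, while pgx.Rows needs a thin adapter, since pgx exposes FieldDescriptions() rather than Columns():

// Approximate shape of scany's dbscan.Rows interface.
type Rows interface {
    Close() error
    Err() error
    Next() bool
    Columns() ([]string, error)
    Scan(dest ...any) error
}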
@go-jet Is this possible in any capacity? We need it to get rid of database/sql.
Scany seems the way to go, but I love this library and want to see it flourish.
@DevX86 Why do you want to get rid of database/sql?
Massive speed improvements: even selecting 15,000 records I saw a 3x improvement due to the binary transfer and native bindings.
I'm trying to refactor to scany; my friends, we were spoiled by go-jet. I can't live without it, especially when it comes to complex queries. @go-jet @houten11 Is there honestly nothing we can do for native pgx (no database/sql)?
You actually can use them together: use go-jet to generate the SQL and pgx to execute it. Here's a simple example:
selectStmt := SELECT(Table.AllColumns).FROM(Table).WHERE(Table.ID.EQ(Int(1))).LIMIT(1)

// query executes a jet statement on a pgx pool and scans one row into T.
func query[T any](ctx context.Context, db *pgxpool.Pool, stmt Statement) (*T, error) {
    sql, args := stmt.Sql()

    rows, err := db.Query(ctx, sql, args...)
    if err != nil {
        return nil, err
    }

    return pgx.CollectOneRow(rows, pgx.RowToAddrOfStructByPos[T])
}
And just use it like this:
data, err := query[Table](ctx, pool, selectStmt)
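For multiple rows, the same idea extends naturally (queryAll is a made-up name; CollectRows and RowToStructByPos are real pgx v5 helpers):

// queryAll runs a jet statement on a pgx pool and scans every row into a []T.
func queryAll[T any](ctx context.Context, db *pgxpool.Pool, stmt Statement) ([]T, error) {
    sql, args := stmt.Sql()

    rows, err := db.Query(ctx, sql, args...)
    if err != nil {
        return nil, err
    }

    return pgx.CollectRows(rows, pgx.RowToStructByPos[T])
}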
@body20002 I need something for complex structs like:
type A struct {
    B struct {
        C
        D E `db:"F"`
        G H `db:"I"`
        J []struct {
            K
            L struct {
                M
                N struct {
                    O `alias:"P"`
                    Q []struct {
                        R `alias:"S"`
                    }
                }
            } `db:"T"`
            U []V `db:"W"`
        } `db:"X"`
        Y Z `db:"AA"`
        AB AC `db:"AD"`
    }
}
Worked on something like that for hours last night, combining scany with go-jet, and I don't like the results. I flatten the struct, then parse it like this:
var resultViews []views.ResultView
for _, tempGroup := range groupedTemps {
    firstTemp := tempGroup[0]

    // Extract embedded structs
    modelA := utils.ExtractEmbeddedStruct[models.ModelA](firstTemp)
    modelB := utils.ExtractEmbeddedStruct[models.ModelB](firstTemp)
    modelC := utils.ExtractEmbeddedStruct[models.ModelC](firstTemp)
    modelD := utils.ExtractEmbeddedStruct[models.ModelD](firstTemp)
    modelE := utils.ExtractEmbeddedStruct[models.ModelE](firstTemp).Resolve()

    // Collect items and build views
    modelFItems := utils.ExtractEmbeddedSlice[Temp, models.ModelF](tempGroup)
    modelGItems := utils.ExtractEmbeddedSlice[Temp, models.ModelG](tempGroup)
    modelHItems := utils.ExtractEmbeddedSlice[Temp, models.ModelH](tempGroup)
    modelIItems := utils.ExtractEmbeddedSlice[Temp, models.ModelI](tempGroup)

    // Continue processing with the extracted data.
    // For example, building a result view:
    resultView := views.NewResultView(
        &modelA,
        modelFItems,
        &modelB,
        &modelC,
        modelE,
        // ... other parameters
    )
    resultViews = append(resultViews, *resultView)
}
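For context, utils.ExtractEmbeddedStruct isn't shown in this thread; a hypothetical reflection-based sketch of what such a helper might look like (assumes the standard reflect package is imported):

// ExtractEmbeddedStruct (hypothetical): find the field of type K in a flat row
// struct and return it, or the zero value if no such field exists.
func ExtractEmbeddedStruct[K any](item any) K {
    v := reflect.ValueOf(item)
    want := reflect.TypeOf((*K)(nil)).Elem()

    for i := 0; i < v.NumField(); i++ {
        if v.Type().Field(i).Type == want {
            return v.Field(i).Interface().(K)
        }
    }

    var zero K
    return zero
}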
And this one, which has a struct plus a slice. But we have structs in slices in structs, etc., so this won't scale:
type ProjectData struct {
    models.Project `db:"projects"`
    ProjectRecords []models.ProjectRecord `db:"project_records" json:"project_records"`
}

func (r *ResearchRepository) GetFilteredProjectData(ctx context.Context, researcherID uuid.UUID, projectID *uuid.UUID) ([]views.ProjectData, error) {
    ctx, span := researchTracer.Start(ctx, "ResearchRepository.GetFilteredProjectData")
    defer span.End()

    // Build the full WHERE condition up front, rather than chaining a second WHERE.
    condition := researcherTable.ID.EQ(sqlbuilder.UUID(researcherID))
    if projectID != nil {
        condition = condition.AND(projectTable.ID.EQ(sqlbuilder.UUID(*projectID)))
    }

    // SQL statement builder
    stmt := sqlbuilder.
        SELECT(
            projectTable.AllColumns,
            projectRecordTable.AllColumns).
        FROM(researcherTable.
            INNER_JOIN(projectTable, projectTable.ResearcherID.EQ(researcherTable.ID)).
            INNER_JOIN(projectDetailTable, projectDetailTable.ProjectID.EQ(projectTable.ID)).
            INNER_JOIN(projectRecordTable, projectRecordTable.ProjectID.EQ(projectTable.ID))).
        WHERE(condition)

    fmt.Println(stmt.DebugSql())

    // Define a temporary structure to hold the flat rows
    type Temp struct {
        models.Project       `db:"projects"`
        models.ProjectRecord `db:"project_records"`
    }
    var temps []Temp

    query, args := stmt.Sql() // parameterized SQL, rather than DebugSql() with inlined values
    if err := pgxscan.Select(ctx, r.db, &temps, query, args...); err != nil {
        return nil, utils.SpanErr(span, err)
    }

    groupedTemps, err := utils.GroupByID(temps)
    if err != nil {
        return nil, err
    }

    var projectDataList []views.ProjectData
    for _, tempGroup := range groupedTemps {
        firstTemp := tempGroup[0]
        projectData := views.ProjectData{
            Project:        utils.ExtractEmbeddedStruct[models.Project](firstTemp),
            ProjectRecords: utils.ExtractEmbeddedSlice[Temp, models.ProjectRecord](tempGroup),
        }
        projectDataList = append(projectDataList, projectData)
    }
    return projectDataList, nil
}
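Likewise, utils.GroupByID isn't shown in the thread; a hypothetical sketch of the grouping step, assuming models.Project carries a uuid.UUID ID field:

// GroupByID (hypothetical): bucket the flat rows by the embedded Project's ID,
// preserving first-seen order, so each group collapses into one ProjectData.
func GroupByID(temps []Temp) ([][]Temp, error) {
    index := make(map[uuid.UUID]int)
    var groups [][]Temp

    for _, t := range temps {
        id := t.Project.ID
        if i, ok := index[id]; ok {
            groups[i] = append(groups[i], t)
        } else {
            index[id] = len(groups)
            groups = append(groups, []Temp{t})
        }
    }
    return groups, nil
}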
@go-jet For reference, scany can't handle maps or nested structs, only flat structs and flat slices.
@DevX86 Do you really need scany? As @body20002 pointed out in their example, since v5 pgx has nice support for scanning on its own (CollectRows, CollectOneRow, RowToStructByName, RowToStructByNameLax, etc.).
reference: https://ectobit.com/blog/pgx-v5-3/
I used to use scany, and just converted to using plain pgx after v5 came out.
@DevX86 I haven't tried it on complex queries, and certainly not a query that complex, but it should work with nested structs, given that the struct corresponds to the query's return values. I think the pgx docs say so.
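As far as I can tell, pgx v5's by-name mapping does flatten embedded structs; a minimal sketch (the types and columns are placeholders):

type Audit struct {
    CreatedBy string `db:"created_by"`
    UpdatedBy string `db:"updated_by"`
}

type Post struct {
    ID    int64  `db:"id"`
    Title string `db:"title"`
    Audit        // embedded struct; its fields join the by-name match
}

rows, err := pool.Query(ctx, `SELECT id, title, created_by, updated_by FROM posts`)
if err != nil {
    // handle err
}
posts, err := pgx.CollectRows(rows, pgx.RowToStructByName[Post])

Slices of child rows are another matter, though; that's exactly the grouping problem discussed above.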
@dropwhile @body20002 Hate to ping you guys, but I'm in the middle of a major refactor of our database layer. As long as it's pgx-native I don't mind which I use (or something new); my main concern is these complex queries. Can you post an example of a nested struct with slices on it?
@DevX86 I'll look into pgx when I find some time. My intuition tells me that even if pgx is 3x faster than pq with database/sql, in a real production environment it will never make a significant difference, because latency between user and server, and between server and database, is measured in ms, while scanning into Go structures is measured in ns. For example, whether a scan takes 200 ns or 600 ns doesn't matter much when the entire response to the user takes more than 10 ms (~200,000 times the difference).
Now, this is assuming that the database returns a reasonable amount of rows. What is a reasonable amount of rows depends on each setup, but a rough estimate is no more than 1000, preferably less than 100, for no more than 10 columns. If your queries return 15000 rows, it will take time to group that into the result regardless of the database driver used. By joining multiple tables in a single query, you can easily end up with a huge number of rows, in which case you'll need to consider pagination or even split the query into two or more.
@go-jet Thanks for the response; my normal queries, including server response time, are 2-3x faster (some around 10-30%, but those add up). I used 15k rows as an initial test. The joins are for our deeply nested data types; otherwise it ends up being a ton of requests to the backend and constructing nested objects out of many flat ones.
Also, we get some cool things like better connection pooling. I can't decide if I should wait on the ideas ahead or continue with the refactor using the odd bridge I built; it's going to take a lot of man-hours and I'm not sure about the maintainability for other developers. For integration simplicity I also had to take the fields directly off the generated struct, so that's another pitfall to make this work 'seamlessly' with scany. I'm sure there's a way to patch it, but I've put much headache and many hours into this; I'm just glad it's working at all.
We have ~1,000 queries in go-jet. Since day 1 of discovering it I've loved this library, and it's one of the few allowed into the codebase. If it's feasible, please make this happen, @go-jet.
I guess the bridge isn't horrendous with some renaming.
And changes like this, for usage consistency:
// ExtractStruct extracts the first struct of type K from a slice of T.
func ExtractStruct[T any, K any](items []T) K {
    if len(items) == 0 {
        var zero K
        return zero
    }

    // Extract the first item
    firstItem := items[0]

    // Use reflection to extract the embedded struct of type K
    return ExtractEmbeddedStruct[K](firstItem)
}
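Usage then mirrors the ExtractEmbeddedSlice call shape, e.g.:

project := utils.ExtractStruct[Temp, models.Project](tempGroup)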
The main difficulty I'm seeing with the bridge implementation is the mental coalescing of flat entities into a struct, rather than thinking in terms of the end goal (here's the struct I need). If anyone is interested, I could do more battle testing on this and create a utility library for bridging go-jet and scany.
Here's the solution for the scany implementation. It works with encapsulated structs; if needed, modify your generator to include this:
// Tables.
UseTable(func(table metadata.Table) template.TableModel {
    return template.DefaultTableModel(table).
        UseField(func(columnMetaData metadata.Column) template.TableModelField {
            defaultTableModelField := template.DefaultTableModelField(columnMetaData)
            dbTag := columnMetaData.Name

            return defaultTableModelField.UseTags(
                fmt.Sprintf(`db:"%s"`, dbTag),
            )
        })
}).
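For orientation, here's roughly where that fragment plugs in, assuming jet v2's generator-customization API (imports aliased as in the jet wiki, e.g. postgres2 "github.com/go-jet/jet/v2/postgres"; the destination path and connection are placeholders):

err := postgres.Generate(
    "./.gen",
    dbConnection, // a generator/postgres.DBConnection with your credentials
    template.Default(postgres2.Dialect).
        UseSchema(func(schema metadata.Schema) template.Schema {
            return template.DefaultSchema(schema).
                UseModel(template.DefaultModel().
                    UseTable(func(table metadata.Table) template.TableModel {
                        // ...the UseField customization shown above goes here...
                        return template.DefaultTableModel(table)
                    }))
        }),
)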
Due to the issue below, a bridge is less ideal, as we would have to make duplicate structs with nullable fields and some sort of Resolve() function, with the jet model as an embedded struct, that either returns the normal struct or nil. Currently doing that, but at scale this won't be fun. I'm considering having go-jet generate all tables again with all fields as pointers, as NullTableName, but that's a lot of duplication.
As nice as Scany and Ksql are, it would be much better if Jet supported PGX natively. Scany and Ksql have their own expectations of what the struct is supposed to look like. Jet creates its own struct table models, and it would be best placed to be the library scanning rows into them. It sounds like internally using your own interface, then accepting both sql.Rows and pgx.Rows, etc, would make them both work.
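For contrast, this is the shape jet's current database/sql path already handles in one call. A sketch, assuming generated Project/ProjectRecord model types and a *sql.DB (e.g. one opened through pgx's stdlib package):

var dest []struct {
    model.Project

    ProjectRecords []model.ProjectRecord
}

stmt := SELECT(
    Project.AllColumns,
    ProjectRecord.AllColumns,
).FROM(
    Project.INNER_JOIN(ProjectRecord, ProjectRecord.ProjectID.EQ(Project.ID)),
)

err := stmt.Query(db, &dest) // qrm groups the child rows into the nested slice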
@veqryn Yeah, scany's concept of scanning just isn't built for complexity like go-jet is. I just found a solution for (hopefully) the meantime, which isn't so bad once you migrate a few queries. Native support would be awesome for this project, and it honestly may attract some new people who are looking for an alternative solution or need to handle more complex structs.
So, I'm digging into the pgx vs database/sql API differences. There are a few stdlib helpers that pgx exposes. I'm assuming that both

stdlib.OpenDBFromPool(pgxpool.Pool)
sqlx.NewDb(stdlib.OpenDBFromPool(instance.dbPool), "pgx") // returns a DB that extends sql.DB

are not sufficient for your use case? I think both of these return an sql.DB equivalent that can be used with Jet.
@safaci2000 Admittedly I'm not smart, are you saying the native version of pgx can be used with go-jet? They have a database/sql version and a native version which is much more performant.
I'm not sure. My point is that there seems to be a pattern you can use that may use the pgx driver. Since you seem very motivated to use that driver, I was pointing out something to try, to see if you get the performance you need.
If you just need arrays, then this hack below might work for now. Note that this hinges on using pgx under the hood, but doing it via the database/sql interface.
// Array represents a postgres array containing elements of type T.
// We need the Scan implementation below because pgx is kinda awkward when used via database/sql; see e.g.
// https://github.com/jackc/pgx/issues/1458
// https://github.com/jackc/pgx/issues/1662
// https://github.com/jackc/pgx/issues/1779
type Array[T any] []T

func (a *Array[T]) Scan(src any) error {
    // implementation from https://github.com/jackc/pgx/issues/1779#issuecomment-1787125955
    m := pgtype.NewMap()
    v := (*[]T)(a)

    t, ok := m.TypeForValue(v)
    if !ok {
        return fmt.Errorf("cannot convert to sql.Scanner: cannot find registered type for %T", a)
    }

    var bufSrc []byte
    if src != nil {
        switch src := src.(type) {
        case string:
            bufSrc = []byte(src)
        case []byte:
            bufSrc = src
        default:
            bufSrc = []byte(fmt.Sprint(src))
        }
    }

    return m.Scan(t.OID, pgtype.TextFormatCode, bufSrc, v)
}
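A quick usage sketch, assuming db is a *sql.DB opened through pgx's stdlib driver:

var ids Array[int64]
err := db.QueryRowContext(ctx, `SELECT ARRAY[1, 2, 3]`).Scan(&ids)
// ids now holds Array[int64]{1, 2, 3}, scanned through the database/sql interface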
In future versions of golang, this should no longer be necessary because this proposal was accepted (although the implementation hasn't been merged yet): https://github.com/golang/go/issues/67546
@go-jet Is there anything I can do to help with this feature even if it's just research?
Have you tried the pattern I gave you last week? You seem concerned about performance; try using jet with those patterns and see if the performance is comparable to your expectations for pure PGX.
Not sure how to approach that, can you explain in more detail?
Well, to quote an earlier message from you:
Massive speed improvements: even selecting 15,000 records I saw a 3x improvement due to the binary transfer and native bindings.
Re-run that test and see if you can get the same performance.
Just have something along these lines in the code.
dbPool, err := pgxpool.New(ctx, dbURI)
if err != nil {
    log.Fatal().Err(err).Msgf("Unable to create a pgx pool")
}

db := stdlib.OpenDBFromPool(dbPool)

stmt.Query(db, &captureModel) // use jet to invoke the query
Obviously you should save the dbPool and not create a new one on every call.
I was excited for this solution, but it's half the speed of native pgx (the response took 2x as long).
@go-jet Any findings on this?
No, I haven’t had the opportunity to look into this yet.
@DevX86 In the meantime you can check Prepared Statements Caching, it might give you some performance improvements.
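If that suggestion refers to caching on the pgx side, here is a sketch of pinning it explicitly, assuming pgx v5 and pgxpool (v5 already caches prepared statements by default):

cfg, err := pgxpool.ParseConfig(dbURI)
if err != nil {
    log.Fatal(err)
}
cfg.ConnConfig.DefaultQueryExecMode = pgx.QueryExecModeCacheStatement

pool, err := pgxpool.NewWithConfig(ctx, cfg)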
I never saw this; it's pretty cool. I've migrated over 100 queries to go-jet + pgx + scany. Using this for anything unmigrated, thank you @go-jet!
Is your feature request related to a problem? Please describe.
Right now the default for Jet is to let postgres arrays get generated into golang strings. You can override the generated types, but then you'd also have to implement your own scan/value functions too. Using PGX directly (github.com/jackc/pgx/v5) lets us do things like scanning postgres arrays directly to a slice ([]int64), and many other great features. Unfortunately, Jet only accepts the standard library database/sql interface, which forces us to use github.com/jackc/pgx/v5/stdlib.

Describe the solution you'd like
Other scanning libraries such as scany and ksql provide an adapter that lets their library use both the standard library interface and also PGX directly. It would be great if Jet provided similar.