sqlc-dev / sqlc

Generate type-safe code from SQL
https://sqlc.dev
MIT License
12.28k stars 778 forks source link

sqlc.embed() generates code that cannot handle NULL items #2348

Open PatrLind opened 1 year ago

PatrLind commented 1 year ago

Version

1.18.0

What happened?

I was testing to implement sqlc.embed() in order to simplify my conversions routines from sqlc to protobuf.

I have this query:

-- name: GetChannelLayers :many
SELECT sqlc.embed(t)
FROM unnest(sqlc.narg('channel_layer_ids')::TEXT[]) WITH ORDINALITY o(channel_layer_id, ord)
LEFT JOIN channel_layer AS t ON t.channel_layer_id = o.channel_layer_id::UUID AND t.tenant_id = $1;

The input is a list of IDs that i expect to get from the DB. If the ID is not found I expect to get a NULL row. Unfortunately it seems like the code for the new sqlc.embed() feature doesn't handle this case very well since I get a scan error if I try to query for a non-existent ID.

pgx.ScanArgError: can't scan into dest[0]: cannot scan null into *string

The generated output code:


const getChannelLayers = `-- name: GetChannelLayers :many
SELECT t.tenant_id, t.channel_id, t.channel_layer_id, t.name, t.description, t.created, t.updated, t.removed, t.disabled, t.condition_ref, t.z_index, t.width, t.height, t.pos_x, t.pos_y, t.alignment, t.content_ref
FROM unnest($2::TEXT[]) WITH ORDINALITY o(channel_layer_id, ord)
LEFT JOIN channel_layer AS t ON t.channel_layer_id = o.channel_layer_id::UUID AND t.tenant_id = $1
`

type GetChannelLayersParams struct {
    TenantID        string
    ChannelLayerIds []string
}

type ChannelLayer struct {
    TenantID       string
    ChannelID      uuid.UUID
    ChannelLayerID uuid.UUID
    Name           string
    Description    sql.NullString
    Created        time.Time
    Updated        time.Time
    Removed        sql.NullTime
    Disabled       sql.NullTime
    ConditionRef   sql.NullString
    ZIndex         sql.NullInt32
    Width          sql.NullInt32
    Height         sql.NullInt32
    PosX           sql.NullInt32
    PosY           sql.NullInt32
    Alignment      ChannelLayerAlignment
    ContentRef     sql.NullString
}

type GetChannelLayersRow struct {
    ChannelLayer ChannelLayer
}

func (q *Queries) GetChannelLayers(ctx context.Context, arg GetChannelLayersParams) ([]*GetChannelLayersRow, error) {
    rows, err := q.db.Query(ctx, getChannelLayers, arg.TenantID, arg.ChannelLayerIds)
    if err != nil {
        return nil, err
    }
    defer rows.Close()
    var items []*GetChannelLayersRow
    for rows.Next() {
        var i GetChannelLayersRow
        if err := rows.Scan(
            &i.ChannelLayer.TenantID,
            &i.ChannelLayer.ChannelID,
            &i.ChannelLayer.ChannelLayerID,
            &i.ChannelLayer.Name,
            &i.ChannelLayer.Description,
            &i.ChannelLayer.Created,
            &i.ChannelLayer.Updated,
            &i.ChannelLayer.Removed,
            &i.ChannelLayer.Disabled,
            &i.ChannelLayer.ConditionRef,
            &i.ChannelLayer.ZIndex,
            &i.ChannelLayer.Width,
            &i.ChannelLayer.Height,
            &i.ChannelLayer.PosX,
            &i.ChannelLayer.PosY,
            &i.ChannelLayer.Alignment,
            &i.ChannelLayer.ContentRef,
        ); err != nil {
            return nil, err
        }
        items = append(items, &i)
    }
    if err := rows.Err(); err != nil {
        return nil, err
    }
    return items, nil
}

I'm actually not sure how to solve this. I need the LEFT JOIN because reasons... Perhaps it's impossible to support this case?

Perhaps sqlc can create a temp/hidden type where all fields are NULL-able. Scan the fields using this type and then check if the PRIMARY KEY fields are Valid. If they are valid, convert the type with the NULL-able fields into the normal type. If the fields are not valid then return a nil object pointer for the embedded field.

@nickjackson, any input?

What operating system are you using?

Linux

What database engines are you using?

PostgreSQL

What type of code are you generating?

Go

richchurcher commented 1 year ago

Playground reproduction: https://play.sqlc.dev/p/f7baf03ee43839b8063555ea33f6d94c2f9f8545e14cb7f41406f880eea432d7

I may actually try to fix this, as it also affects my planned use for embed. Looking at the source, I'm still not quite sure how much work is involved making it respect nullable columns. I'll have a think about it.

nickjackson commented 1 year ago

Hi Both,

Left joins were left out of the implementation because the scanning logic is not so straightforward. Passing the additional context through about a left-join column is also tricky.

Perhaps sqlc can create a temp/hidden type where all fields are NULL-able. Scan the fields using this type and then check if the PRIMARY KEY fields are Valid. If they are valid, convert the type with the NULL-able fields into the normal type. If the fields are not valid then return a nil object pointer for the embedded field.

I think most of this idea has legs (temp type with nullable fields, nil object pointer, seem 👍 ), but there are two issues I can see.

  1. While I'm sure 99% of the time the PRIMARY KEY is good enough, what about the cases where a user wants to check a different field?
  2. What if the user is using a custom scanning type for the primary key field that doesn't have a Valid attribute or none at all? If done incorrectly, the user will end up with a broken package.

Both of those options are solvable, perhaps with config overrides, but it's worth pointing out. This feature requires sqlc to make business logic decisions which it typically stays out of.

I would love to see this implemented, just to make sqlc.embed feature complete. I might even use it, but right now I just work around this by making an additional query in parallel.

I hope this helps.

richchurcher commented 1 year ago

Yeah, sometimes it's a question of "where do you stop?" with a feature like this I guess. :thinking:

It'd be tempting to explicitly provide the support if the primary key is available, and bail out to the standard behaviour if not. This would surely increase its utility quite a bit. However, I do wonder if there's an opportunity for the generator logic to be a bit more well-informed about which joins could result in a null embed even if we're not joining on a primary key (for example, always using a nil pointer for a left join).

Alternatively, could there be room for a sqlc.embed(foo, sqlc.nullable) variant? Which would make it declarative, and put the onus on the developer to request it.

PatrLind commented 1 year ago

Good points nickjackson. I also think richchurcher has a point on the ability to configure the embed() command with parameters.

I think a good start could be to write the code such that it is able to detect the cases it cannot support. Right now the code generated doesn't really work if you happen to do unsupported things. If it is possible to detect all the cases it doesn't work, I think it would be easier to extend it to actually make it work, since then we know all about what type of join it is and so forth. I took a look at the code, and sqlc doesn't seem to know very much about the join itself, just that there is some kind of relationship to another table.

nickjackson commented 1 year ago

Yeah, I have to be honest the implementation (and the whole of sqlc to some extent) is very naive about what a join is - it is just sugar on column selections.

Happy to answer questions and help out where I can, but I don't have the bandwidth to do this myself, nor do I have any authority to suggest the right course of action. (I'm not a maintainer)

richchurcher commented 1 year ago

I'll have a quiet look at what it would take to go down the parameter path... it feels like it would just be a question of choosing a pointer type, but it can't possibly be that simple :laughing:

richchurcher commented 1 year ago

I think, having done more poking around, that even if piggybacking on embed might be slightly easier, introducing nembed (nullable embed) is more in line with the existence of arg/narg. From the looks of it, it'll require a bit of a refactor but maybe I can learn from @skabbes' hard work in #1536 .

richchurcher commented 1 year ago

@nickjackson If you did happen to have a moment, could you glance at #2472 and see if you think a more polished version of this might work?

andrewmbenton commented 1 year ago

I think I understand the motivation for this after reading, but based on the current PR it's going to require a lot of changes including changes to templates so I'm not sure if it's likely to get merged.

Just to clarify, would you say the "workaround" for this is to abandon use of sqlc.embed()? I'm not even sure that a SELECT * ... would work in this case...

richchurcher commented 1 year ago

Oh I don't think it'll get merged either, but I'm glad to have talked about the issue more :) At the moment, this is my workaround for nullable joins with embed:

-- name: GetUserRelationships :one
select
  sqlc.embed(u),
  sqlc.embed(b),
  i.id as img_id,
  i.object_key as img_object_key, 
  i.created_at as img_created_at,
  l.id as loc_id,
  l.code as loc_code,
  l.english_name as loc_english_name,
  l.te_reo_name as loc_te_reo_name
  from users u
  left join buckets b on u.bucket_id = b.id
  left join images i on u.avatar_id = i.id
  left join locations l on u.location_id = l.id
  where u.deleted_at is null
    and u.id = $1
  limit 1;

Being consistent with that naming scheme lets me copy the values into structs in a relatively pain-free fashion. It's work, but it's not too much work, at least for now.

KevenGoncalves commented 1 year ago

Oh I don't think it'll get merged either, but I'm glad to have talked about the issue more :) At the moment, this is my workaround for nullable joins with embed:

-- name: GetUserRelationships :one
select
  sqlc.embed(u),
  sqlc.embed(b),
  i.id as img_id,
  i.object_key as img_object_key, 
  i.created_at as img_created_at,
  l.id as loc_id,
  l.code as loc_code,
  l.english_name as loc_english_name,
  l.te_reo_name as loc_te_reo_name
  from users u
  left join buckets b on u.bucket_id = b.id
  left join images i on u.avatar_id = i.id
  left join locations l on u.location_id = l.id
  where u.deleted_at is null
    and u.id = $1
  limit 1;

Being consistent with that naming scheme lets me copy the values into structs in a relatively pain-free fashion. It's work, but it's not too much work, at least for now.

Unfortunately even with that I still get the Scan error, so I was forced to write two queries, one with the possible null relations and other without it, but is quite painful

-- name: ListChargePaginate :many
SELECT
  sqlc.embed(c),
  sqlc.embed(cl),
  sqlc.embed(t),
  sqlc.embed(e),
  sqlc.embed(ctype),
  sqlc.embed(cstate),
  sqlc.embed(ctime),
  sqlc.embed(p)
FROM charge c
LEFT JOIN client cl ON c.client_id = cl.id
LEFT JOIN transport t ON c.transport_id = t.id
LEFT JOIN employee e ON c.manager_id = e.id
LEFT JOIN charge_type ctype ON c.charge_type_id = ctype.id
LEFT JOIN charge_state cstate ON c.state_id = cstate.id
LEFT JOIN charge_time ctime ON c.type_id = ctime.id
LEFT JOIN payment_method p ON c.payment_method_id = p.id
WHERE c.deleted_at IS NULL
ORDER BY c.created_at DESC
OFFSET $1 LIMIT $2;

In this case the relations that can be null is transport and manager

richchurcher commented 1 year ago

Yeah, you won't be able to use embed for t and e, you need to handle 'em as columns and do some post-processing after the query.

hectorj-thetreep commented 11 months ago

Workaround: if you add a view with just your LEFT JOIN, you can JOIN it and sqlc will generate the model with all nullable fields. See https://play.sqlc.dev/p/1f63b8bc9aac38f8be54ce6b4842b693ed259cf7a4807feaf55aebd1cc8c886a

CREATE VIEW authorimages AS (
  SELECT images.* FROM authors LEFT JOIN images ON authors.id = images.id
);

-- name: GetAuthor :one
SELECT sqlc.embed(authors), sqlc.embed(authorimages) FROM authors
JOIN authorimages on authors.id = authorimages.id
WHERE authors.id = $1 LIMIT 1;
type Authorimage struct {
    ID  sql.NullInt64
    Url sql.NullString
}
kevlarr commented 10 months ago

Workaround: if you add a view with just your LEFT JOIN, you can JOIN it and sqlc will generate the model with all nullable fields.

You don't even need to join inside that view, do you? Ie. create view images_vw as (select * from images)?

Either way that's a clever workaround that I'm probably going to rely on for now, I just wish it didn't introduce a model with all nullable fields in our domain that only technically accommodates for only null or a model with all non-nullable fields. Dreaming of sqlc.nembed.

KevenGoncalves commented 10 months ago

Workaround: if you add a view with just your LEFT JOIN, you can JOIN it and sqlc will generate the model with all nullable fields. See https://play.sqlc.dev/p/1f63b8bc9aac38f8be54ce6b4842b693ed259cf7a4807feaf55aebd1cc8c886a

CREATE VIEW authorimages AS (
  SELECT images.* FROM authors LEFT JOIN images ON authors.id = images.id
);

-- name: GetAuthor :one
SELECT sqlc.embed(authors), sqlc.embed(authorimages) FROM authors
JOIN authorimages on authors.id = authorimages.id
WHERE authors.id = $1 LIMIT 1;
type Authorimage struct {
  ID  sql.NullInt64
  Url sql.NullString
}

Thanks Man, I never thought about that, I didn’t know that a view generate Null fields.

hungtcs commented 2 months ago

Hey, I'm stumped by this. In my code, I have to use left join to query the owner(nullable) of the resource, In order to avoid errors, I now have to write out all columns in the select with owner_ prefix. but that's a lot, the owner table could have hundreds of columns.

Do you have a better way? Thanks.

KevenGoncalves commented 2 months ago

Hey, I'm stumped by this. In my code, I have to use left join to query the owner(nullable) of the resource, In order to avoid errors, I now have to write out all columns in the select with owner_ prefix. but that's a lot, the owner table could have hundreds of columns.

Do you have a better way? Thanks.

You can use views to handle that. see #2997

hungtcs commented 2 months ago

@KevenGoncalves I just saw this answer but didn't understand the principle, now I understand, it looks quite feasible, I just need to check whether the primary key is nil or not. Thanks a lot.

sekulicd commented 1 month ago

@KevenGoncalves does workaround with views makes sense with multiple tables?

This is my query:

-- name: SelectRoundWithRoundId :many
SELECT sqlc.embed(round), sqlc.embed(payment), sqlc.embed(tx), sqlc.embed(receiver), sqlc.embed(vtxo)
FROM round
     LEFT OUTER JOIN payment ON round.id=payment.round_id
     LEFT OUTER JOIN tx ON round.id=tx.round_id
     LEFT OUTER JOIN receiver ON payment.id=receiver.payment_id
     LEFT OUTER JOIN vtxo ON payment.id=vtxo.payment_id
WHERE round.id = ?;

Here is schema:

CREATE TABLE IF NOT EXISTS round (
    id TEXT PRIMARY KEY,
    starting_timestamp INTEGER NOT NULL,
    ending_timestamp INTEGER NOT NULL,
    ended BOOLEAN NOT NULL,
    failed BOOLEAN NOT NULL,
    stage_code INTEGER NOT NULL,
    txid TEXT NOT NULL,
    unsigned_tx TEXT NOT NULL,
    connector_address TEXT NOT NULL,
    dust_amount INTEGER NOT NULL,
    version INTEGER NOT NULL,
    swept BOOLEAN NOT NULL
);

CREATE TABLE IF NOT EXISTS payment (
    id TEXT PRIMARY KEY,
    round_id TEXT NOT NULL,
    FOREIGN KEY (round_id) REFERENCES round(id)
);

CREATE TABLE IF NOT EXISTS receiver (
    payment_id TEXT NOT NULL,
    pubkey TEXT NOT NULL,
    amount INTEGER NOT NULL,
    onchain_address TEXT NOT NULL,
    FOREIGN KEY (payment_id) REFERENCES payment(id),
    PRIMARY KEY (payment_id, pubkey)
);

CREATE TABLE IF NOT EXISTS tx (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    tx TEXT NOT NULL,
    round_id TEXT NOT NULL,
    type TEXT NOT NULL,
    position INTEGER NOT NULL,
    txid TEXT,
    tree_level INTEGER,
    parent_txid TEXT,
    is_leaf BOOLEAN,
    FOREIGN KEY (round_id) REFERENCES round(id)
);

CREATE TABLE IF NOT EXISTS vtxo (
    txid TEXT NOT NULL PRIMARY KEY,
    vout INTEGER NOT NULL,
    pubkey TEXT NOT NULL,
    amount INTEGER NOT NULL,
    pool_tx TEXT NOT NULL,
    spent_by TEXT NOT NULL,
    spent BOOLEAN NOT NULL,
    redeemed BOOLEAN NOT NULL,
    swept BOOLEAN NOT NULL,
    expire_at INTEGER NOT NULL,
    payment_id TEXT,
    FOREIGN KEY (payment_id) REFERENCES payment(id)
);
sekulicd commented 1 month ago

@KevenGoncalves does workaround with views makes sense with multiple tables?

Update: Bellow worked for me:

CREATE VIEW round_payment_vw AS SELECT payment.*
FROM round
LEFT OUTER JOIN payment
ON round.id=payment.round_id;

CREATE VIEW round_tx_vw AS SELECT tx.*
FROM round
LEFT OUTER JOIN tx
ON round.id=tx.round_id;

CREATE VIEW payment_receiver_vw AS SELECT receiver.*
FROM payment
LEFT OUTER JOIN receiver
ON payment.id=receiver.payment_id;

CREATE VIEW payment_vtxo_vw AS SELECT vtxo.*
FROM payment
LEFT OUTER JOIN vtxo
ON payment.id=vtxo.payment_id;

SELECT sqlc.embed(round),
       sqlc.embed(round_payment_vw),
       sqlc.embed(round_tx_vw),
       sqlc.embed(payment_receiver_vw),
       sqlc.embed(payment_vtxo_vw)
FROM round
         LEFT OUTER JOIN round_payment_vw ON round.id=round_payment_vw.round_id
         LEFT OUTER JOIN round_tx_vw ON round.id=round_tx_vw.round_id
         LEFT OUTER JOIN payment_receiver_vw ON round_payment_vw.id=payment_receiver_vw.payment_id
         LEFT OUTER JOIN payment_vtxo_vw ON round_payment_vw.id=payment_vtxo_vw.payment_id
WHERE round.id = ?;