full_db_type in postgres

vincentserpoul commented 5 years ago

First of all, let me say I love sqlboiler!

Then, I've been using it for a few projects with MySQL and great results. I've decided to see if I can switch to postgres and then came an issue.

At generation, I'm using the full_db_type as a type.match, and it allows me to be quite precise on which field should be casted to what. For example, I always use github.com/rs/xid as a primary key (id) in most of my tables. I store this either as BINARY(12) or CHAR(20) in mysql, and the match is easy to do with full_db_type char(20) > xid.ID

When I switched to postgresql, I realized the full_db_type is always empty, and the db_type doesn't include the max length of the character, and then it's impossible to differentiate the id char(20) from the other string ids (for example another country code table, with id as string).

I tried to use the default value as type.match, but it's not taken into account. I also tried to generate a alternative "full_db_type" with udt_name and character_max_length but at this point of time, before submitting any PR, I'd like to have an opinion.

Specifically for psql: would you accept a PR to match on the column default value? would you accept a full_db_type with udt_name and char length?

@aarondl

aarondl commented 5 years ago

I would not accept a PR to match on default values because they're too insane. Database default values can go so far as to be function calls and I think it's just bad form.

The changes you're suggesting to full_db_type and udt_name are okay though. Can you elaborate how you might change each one?

vincentserpoul commented 5 years ago

After looking more into the source code, I'm also thinking of a third way: add a specific field CharacterMaximumLength to the Column struct, only used in postgres and allowing matching on it.

For the FullDBType solution, I would change the query to :

create table testch(
  id char(20),
  notid integer
);

select
        c.column_name,
        column_type,
        (
            case when c.character_maximum_length != 0
            then
            (
                column_type || '(' || c.character_maximum_length || ')'
            )
            else ''
            end
        ) as column_full_type,
        c.udt_name,
        e.data_type as array_type,
        c.column_default,
        c.is_nullable = 'YES' as is_nullable,
        (case
            when (select
            case
                when column_name = 'is_identity' then (select c.is_identity = 'YES' as is_identity)
            else
                false
            end as is_identity from information_schema.columns
            WHERE table_schema='information_schema' and table_name='columns' and column_name='is_identity') IS NULL then 'NO' else is_identity end) = 'YES' as is_identity,
        (select exists(
            select 1
            from information_schema.table_constraints tc
            inner join information_schema.constraint_column_usage as ccu on tc.constraint_name = ccu.constraint_name
            where tc.table_schema = 'public' and tc.constraint_type = 'UNIQUE' and ccu.constraint_schema = 'public' and ccu.table_name = c.table_name and ccu.column_name = c.column_name and
                (select count(*) from information_schema.constraint_column_usage where constraint_schema = 'public' and constraint_name = tc.constraint_name) = 1
        )) OR
        (select exists(
            select 1
            from pg_indexes pgix
            inner join pg_class pgc on pgix.indexname = pgc.relname and pgc.relkind = 'i' and pgc.relnatts = 1
            inner join pg_index pgi on pgi.indexrelid = pgc.oid
            inner join pg_attribute pga on pga.attrelid = pgi.indrelid and pga.attnum = ANY(pgi.indkey)
            where
                pgix.schemaname = 'public' and pgix.tablename = c.table_name and pga.attname = c.column_name and pgi.indisunique = true
        )) as is_unique

        from information_schema.columns as c
        inner join pg_namespace as pgn on pgn.nspname = c.udt_schema
        left join pg_type pgt on c.data_type = 'USER-DEFINED' and pgn.oid = pgt.typnamespace and c.udt_name = pgt.typname
        left join information_schema.element_types e
            on ((c.table_catalog, c.table_schema, c.table_name, 'TABLE', c.dtd_identifier)
            = (e.object_catalog, e.object_schema, e.object_name, e.object_type, e.collection_type_identifier)),
        lateral (select
                (
                    case when pgt.typtype = 'e'
                    then
                    (
                        select 'enum.' || c.udt_name || '(''' || string_agg(labels.label, ''',''') || ''')'
                        from (
                            select pg_enum.enumlabel as label
                            from pg_enum
                            where pg_enum.enumtypid =
                            (
                                select typelem
                                from pg_type
                                where pg_type.typtype = 'b' and pg_type.typname = ('_' || c.udt_name)
                                limit 1
                            )
                            order by pg_enum.enumsortorder
                        ) as labels
                    )
                    else c.data_type
                    end
                ) as column_type
            ) ct
        where c.table_name = 'testch' and c.table_schema = 'public'

the changes vs the existing one are: Use lateral to be able to re-use column type Add a 3rd column merging udt_name and character_maximum_length when it's available.

aarondl commented 5 years ago

I prefer not adding additional fields in the structs here. It's already to our detriment that mysql/sqlite/psql use different subsets of the fields available, let's not make it worse by adding more fields.

udt_name is used in certain cases. You'll have to be sure this won't overwrite those use cases.

vincentserpoul commented 5 years ago

So the only solution that is potentially acceptable is re-using the full_db_type column for psql to set it with (same as above):

        (
            case when c.character_maximum_length != 0
            then
            (
                column_type || '(' || c.character_maximum_length || ')'
            )
            else ''
            end
        ) as column_full_type,

Would that work?

aarondl commented 5 years ago

Yeah, that seems fine. I like the consistency it brings, even if it is to the lowest common denominator (thanks for nothing mysql).

vincentserpoul commented 5 years ago

great! I'll start work asap

volatiletech / sqlboiler

full_db_type in postgres #494