markporoshin / dbt-greenplum

Adaptation postgres adapter for Greenplum
32 stars 20 forks source link

Create row-oriented Model with no Compression #12

Closed Sonofendor closed 1 year ago

Sonofendor commented 1 year ago

Hello

So I'm trying to create a table with row orientation and no compression. That table also should be not appendonly. My .sql file has following configuration:

{{
    config(
        orientation='row',
        materialized = 'table',
        unique_key = ['id', 'ts'],
        appendonly=false
    )
}}

select
    floor(random()*(1000))+10 as id,
    current_timestamp as ts

model compiles into following script:

create  table "aplus_core_data"."data_vault_dev"."test_datamart"
      with (
            appendonly=false,
            blocksize=32768,
            orientation=row,
            compresstype=ZSTD,
            compresslevel=4
      )
      as (

select
    floor(random()*(1000))+10 as id,
    current_timestamp as ts
      )

      DISTRIBUTED RANDOMLY

and I get error invalid option "blocksize" for base relation, HINT: "blocksize" is only valid for Append Only relations, create an AO relation to use "blocksize". When I try to set blocksize=None orblocksize=null I get error invalid value for integer option "blocksize": none.

To create not AO table in Greenplum you should not specify blocksize parameter at all. But adapter seems to be using default value for blocksize all the time and specifing it in CREATE TABLE statement.

I also tried to specify compresstype='NONE' and compresslevel=0 to disable compression on table but I have no idea if that works since I can't get past errors mentioned above. But I suspect that those parameters would have same issue as blocksize: I don't need to specify them at all but adapter would specify them with default values of with values from config anyway.

So my question is: is it possible not AO tables with dbt-greenplum? Or is it suitable only for creating compresssed AO column-oriented tables?

markporoshin commented 1 year ago

Hi! I'm going to check this problem on this weekend. I assume adapter allow all possible configurations of table, if don't it is a bug

markporoshin commented 1 year ago

As I can see, you can't create row-oriented table for appendonly=false table. Probably you want to create heap based table?

markporoshin commented 1 year ago
{{
    config(
        orientation='row',
        materialized='table',
        compresstype='NONE',
        appendonly=true,
        schema='test',
    )
}}

works for me

Sonofendor commented 1 year ago

Thanks for your answer!

As far as I understand from this page heap storage is the opposite to appendonly. So it is possible with dbt connector to create row-oriented and also uncompressed tables but heap tables are impossible to create.

The problem with AO tables is that they can (and they do in our case) take up space on disk that is not utiliazied. We handle this by monitoring share of hidden tuples and performing VACUUM from time to time. With incremental models we also sometimes create tables "manually" so dbt generates only INSERT statements. However it would be great to be able to create not AO tables wih dbt.

I will create new issue related to AO specifically since title of this one is "Create row-oriented Model with no Compression" and config you provided works fine for that. Also it would be good to mention in readme that creating not AO tables is not possible with current version of dbt-greenplum