getindata / dbt-flink-adapter

Adapter for dbt that executes dbt pipelines on Apache Flink
Apache License 2.0

Add support for computed and metadata columns #31

Closed gliter closed 1 year ago

gliter commented 1 year ago

Flink allows computed and metadata columns to be defined when creating a source table: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#columns

The ideal solution would be to declare them under the columns config, like:

- name: clickstream
  config:
    type: streaming
    connector_properties:
      ...
    watermark:
      column: event_timestamp
      strategy: event_timestamp
  columns:
    - name: event_timestamp
      data_type: TIMESTAMP(3)
    - name: cost
      computed_as: price * quantity
    - name: record_time
      data_type: TIMESTAMP_LTZ(3)
      metadata_from: timestamp
      ...

Which would resolve to:

create table clickstream (
  event_timestamp TIMESTAMP(3),
  cost AS price * quantity,
  record_time TIMESTAMP_LTZ(3) METADATA FROM 'timestamp'
)
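
To make the mapping concrete, here is a minimal Python sketch of how column entries in that shape could be rendered into DDL. It is only an illustration, not the adapter's actual implementation: `render_column` and `render_create_table` are hypothetical names, and it assumes the keys are spelled `computed_as` and `metadata_from`.

```python
from typing import Dict, List


def render_column(col: Dict[str, str]) -> str:
    """Render one column entry as a Flink DDL column definition (hypothetical helper)."""
    name = col["name"]
    if "computed_as" in col:
        # Computed column: <name> AS <expression>
        return f"{name} AS {col['computed_as']}"
    if "metadata_from" in col:
        # Metadata column: <name> <type> METADATA FROM '<key>'
        return f"{name} {col['data_type']} METADATA FROM '{col['metadata_from']}'"
    # Physical column: <name> <type>
    return f"{name} {col['data_type']}"


def render_create_table(table: str, columns: List[Dict[str, str]]) -> str:
    """Join the rendered columns into a CREATE TABLE statement (no WITH clause here)."""
    body = ",\n".join(f"  {render_column(c)}" for c in columns)
    return f"create table {table} (\n{body}\n)"


columns = [
    {"name": "event_timestamp", "data_type": "TIMESTAMP(3)"},
    {"name": "cost", "computed_as": "price * quantity"},
    {"name": "record_time", "data_type": "TIMESTAMP_LTZ(3)", "metadata_from": "timestamp"},
]
ddl = render_create_table("clickstream", columns)
print(ddl)
```

The sketch leaves out connector properties and the watermark clause, since those are already handled by the existing source config.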

Alternatively, we could put them under config. My initial idea is:

- name: clickstream
  config:
    type: streaming
    connector_properties:
      ...
    watermark:
      column: event_timestamp
      strategy: event_timestamp
    computed_columns: # I am happy to discuss this name
      - name: cost
        computed_as: price * quantity
      - name: record_time
        data_type: TIMESTAMP_LTZ(3)
        metadata_from: timestamp
  columns:
    - name: event_timestamp
      data_type: TIMESTAMP(3)
      ...
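
With this variant, the config-level entries would presumably have to be merged with the regular columns before the DDL is generated. A rough sketch of that step, assuming the parsed source is a plain dict; `merge_columns` is a hypothetical name, `computed_columns` is the placeholder key from the proposal, and the duplicate-name check is my own assumption rather than anything Flink or dbt requires at this stage:

```python
from typing import Any, Dict, List


def merge_columns(source: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return physical columns followed by the config-level computed/metadata columns."""
    physical = list(source.get("columns", []))
    extra = list(source.get("config", {}).get("computed_columns", []))
    seen = {col["name"] for col in physical}
    for col in extra:
        # Assumption: a name declared in both places is treated as a config error.
        if col["name"] in seen:
            raise ValueError(f"duplicate column: {col['name']}")
        seen.add(col["name"])
    return physical + extra


source = {
    "name": "clickstream",
    "config": {
        "type": "streaming",
        "computed_columns": [
            {"name": "cost", "computed_as": "price * quantity"},
            {"name": "record_time", "data_type": "TIMESTAMP_LTZ(3)", "metadata_from": "timestamp"},
        ],
    },
    "columns": [{"name": "event_timestamp", "data_type": "TIMESTAMP(3)"}],
}
merged = merge_columns(source)
print([col["name"] for col in merged])
```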

zqWu commented 1 year ago

I prefer solution 1, under columns. For metadata columns, the FROM expression is optional.
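
A small sketch of what the optional FROM could look like when rendering, assuming the key is spelled `metadata_from` and may simply be omitted. In Flink, leaving out FROM means the column name itself is used as the metadata key. `render_metadata_column` is a hypothetical helper, not part of the adapter, and the column name is backtick-quoted because names such as `timestamp` are reserved words.

```python
from typing import Optional


def render_metadata_column(name: str, data_type: str, metadata_from: Optional[str] = None) -> str:
    """Render a Flink metadata column; the FROM clause is emitted only when a key is given."""
    ddl = f"`{name}` {data_type} METADATA"
    if metadata_from is not None:
        # Explicit metadata key, written as a string literal.
        ddl += f" FROM '{metadata_from}'"
    return ddl


with_from = render_metadata_column("record_time", "TIMESTAMP_LTZ(3)", "timestamp")
without_from = render_metadata_column("timestamp", "TIMESTAMP_LTZ(3)")
print(with_from)
print(without_from)
```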