airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
16.04k stars 4.11k forks source link

Destination Redshift: Incremental dedupe + history mode fails when cursor field is `SUPER` #16582

Open olivermeyer opened 2 years ago

olivermeyer commented 2 years ago

Environment

Current Behavior

We recently upgraded Airbyte to 0.39.36 and had a few issues after some fields were converted to SUPER. Most issues were fixed by a full refresh, but we're left with one issue which we cannot solve: the normalization fails when using the incremental dedupe + history mode on a stream which uses a SUPER cursor field.

Specifically, the AccountHistory stream in the Salesforce uses createddate as cursor, but this field seems to be cast to SUPER somewhere along the way and I'm getting this error:

2022-09-12 07:04:08 normalization > 07:04:06.464534 [error] [MainThread]: Database Error in model salesforce__accounthistory_scd (models/generated/airbyte_incremental/scd/tgtg_raw/salesforce__accounthistory_scd.sql)
2022-09-12 07:04:08 normalization > 07:04:06.464882 [error] [MainThread]:   function lag(super) does not exist
2022-09-12 07:04:08 normalization > 07:04:06.465214 [error] [MainThread]:   HINT:  No function matches the given name and argument types. You may need to add explicit type casts.

Expected Behavior

The sync should succeed.

Steps to Reproduce

  1. Set up a Salesforce source and a Redshift destination
  2. Set up a connection between the two, selecting the AccountHistory stream in incremental dedupe + history mode
  3. Run the connection

Are you willing to submit a PR?

With support to understand the normalization code, maybe.

salima-airbyte commented 2 years ago

Connector team is appropriately tagged. Removing team/oss