airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com
Other
15.54k stars 4k forks source link

[source-postgres] has 2-dimension arrays that are populated as 2-dimension array with null values in Snowflake #37399

Open Wamolambo opened 5 months ago

Wamolambo commented 5 months ago

Airbyte version

0.57.1

Connector name and version

source-postgres: 3.3.26 destination-snowflake: 3.7.0

What step the error happened?

The error occurs during sync from a Postgres [source] to a Snowflake [destination].

Relevant information

I have an Airbyte job that syncs a table from Postgres [source] to Snowflake [destination]. The Postgres [source] table consists of two columns. The first column (named state) consists of a 1-dimension array and the second column (named covariance) consists of a 2-dimension array.

The table is created with the following script:

create table matrix( state double precision [], covariance double precision[][] );

Records are inserted in the table using the following script:

insert into matrix (state,covariance) values (Array[0.23,0.45,0.29],Array[Array[0.98,0.67,0],Array[0.67,0.41,0.59]]);

The souce table looks as follows:

state | covariance ------------------+---------------------------------- {0.23,0.45,0.29} | {{0.98,0.67,0},{0.67,0.41,0.59}} (1 row)

On Airbyte, the Postgres source, Snowflake destination and connection objects were configured as follows: Postgres [source]

Connection

The result of the manual sync loads the 1-dimension array column correctly in the destination. However, the 2-dimension array is loaded as a 1 dimension array with NULL values as shown below.

state | covariance --------------------+-------------- [0.23, 0.45, 0.29] | [null, null] (1 row) From our observation, we noticed that the number of null values corrosponds with the number of inner arrays. For example, if you have {{1,2}, {7,8}} at the source than you will get {null, null} at the destination.

The sync history does not show any errors.

Note: This problem also occurs when syncing from Postgres [source] to Postgres [destination].

Relevant log output

No response

Contribute

marcosmarxm commented 4 months ago

Hello @Wamolambo I added the issue to the connector team backlog for further discussion in next sprint.

Wamolambo commented 4 months ago

@marcosmarxm Thank you for the response. I have added more details on the ticket so the issue can be replicated.