transferwise / pipelinewise-target-redshift

Singer.io Target for Amazon Redshift - PipelineWise compatible
https://transferwise.github.io/pipelinewise/
Other
12 stars 65 forks source link

Support reserved words as table and column names #49

Closed koszti closed 4 years ago

koszti commented 4 years ago

Problem

Sometimes the stream/table name is a reserved word in Redshift and the current implementation doesn't support that so we end up getting a sql error. This PR is the equivalent of https://github.com/transferwise/pipelinewise-target-snowflake/pull/65 but for Amazon Redshift.

This PR also sync up with target-snowflake. The two codebase as very similar and the common things potentially will be moved into a shared module in the future.

Solution

Wrap the table name in double quotes to make it safe to be in a Redshift sql query.

Example

The streams with one schema where the table is order:

{"type": "STATE", "value": {"currently_syncing": "my_db-order"}}
{"type": "SCHEMA", "stream": "my_db-order", "schema": {"properties": {"data": {"inclusion": "available", "format": "binary", "type": ["null", "string"]}, "id": {"inclusion": "automatic", "format": "binary", "type": ["null", "string"]}, "created_at": {"inclusion": "available", "format": "date-time", "type": ["null", "string"]}}, "type": "object"}, "key_properties": ["id"]}
{"type": "ACTIVATE_VERSION", "stream": "my_db-order", "version": 1576670613163}
{"type": "RECORD", "stream": "my_db-order", "record": {"data": "6461746132", "id": "706b32", "created_at": "2019-12-17T16:02:55+00:00"}, "version": 1576670613163, "time_extracted": "2019-12-18T12:03:33.174343Z"}
{"type": "RECORD", "stream": "my_db-order", "record": {"data": "64617461313030", "id": "706b33", "created_at": "2019-12-18T11:46:38+00:00"}, "version": 1576670613163, "time_extracted": "2019-12-18T12:03:33.174343Z"}
{"type": "RECORD", "stream": "my_db-order", "record": {"data": "6461746134", "id": "706b34", "created_at": "2019-12-17T16:32:22+00:00"}, "version": 1576670613163, "time_extracted": "2019-12-18T12:03:33.174343Z"}
{"type": "STATE", "value": {"currently_syncing": "my_db-order", "bookmarks": {"my_db-order": {"version": 1576670613163}}}}
{"type": "ACTIVATE_VERSION", "stream": "my_db-order", "version": 1576670613163}
{"type": "STATE", "value": {"currently_syncing": null, "bookmarks": {"my_db-order": {"version": 1576670613163, "log_file": "mysql-bin.000004", "log_pos": 945}}}}
{"type": "STATE", "value": {"currently_syncing": null, "bookmarks": {"my_db-order": {"version": 1576670613163, "log_file": "mysql-bin.000004", "log_pos": 945}}}}
{"type": "SCHEMA", "stream": "my_db-order", "schema": {"properties": {"data": {"inclusion": "available", "format": "binary", "type": ["null", "string"]}, "created_at": {"inclusion": "available", "format": "date-time", "type": ["null", "string"]}, "id": {"inclusion": "automatic", "format": "binary", "type": ["null", "string"]}}, "type": "object"}, "key_properties": ["id"]}
{"type": "RECORD", "stream": "my_db-order", "record": {"id": "706b35", "data": "6461746135", "created_at": "2019-12-18T13:19:20+00:00"}, "version": 1576670613163, "time_extracted": "2019-12-18T13:24:31.441849Z"}
{"type": "RECORD", "stream": "my_db-order", "record": {"id": "706b35", "data": "64617461313030", "created_at": "2019-12-18T13:19:35+00:00"}, "version": 1576670613163, "time_extracted": "2019-12-18T13:24:31.441849Z"}
{"type": "RECORD", "stream": "my_db-order", "record": {"id": "706b33", "data": "64617461313030", "created_at": "2019-12-18T11:46:38+00:00", "_sdc_deleted_at": "2019-12-18T13:19:44+00:00+00:00"}, "version": 1576670613163, "time_extracted": "2019-12-18T13:24:31.441849Z"}
{"type": "RECORD", "stream": "my_db-order", "record": {"id": "706b35", "data": "64617461313030", "created_at": "2019-12-18T13:19:35+00:00", "_sdc_deleted_at": "2019-12-18T13:19:44+00:00+00:00"}, "version": 1576670613163, "time_extracted": "2019-12-18T13:24:31.441849Z"}
{"type": "STATE", "value": {"currently_syncing": null, "bookmarks": {"my_db-order": {"version": 1576670613163, "log_file": "mysql-bin.000004", "log_pos": 1867}}}}

Table in Redshift with a reserved word: order:

Screenshot 2020-04-16 at 18 42 23

Column in Redshift with a reserved word: new:

Screenshot 2020-04-16 at 18 42 43

The query select * from local_dev_1."order" outputs:

Screenshot 2020-04-16 at 18 43 00

Tests

Added a new e2e test to test this scenario