embulk / embulk-output-jdbc

MySQL, PostgreSQL, Redshift and generic JDBC output plugins for Embulk

embulk-output-redshift is not compatible with SUPER type #332

Open hibira opened 8 months ago

embulk-output-redshift does not support SUPER type.

Currently, only VARBYTE can be used to send data to Redshift. However, VARBYTE has an upper limit of 65535 bytes, so strings longer than that cannot be migrated.

By supporting Redshift's SUPER type, longer strings can be migrated.
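
To make the limit concrete, here is a minimal sketch for finding the values that would not fit before a load is attempted. It assumes the limit is counted in UTF-8 bytes rather than characters, and it takes the 65535 figure from this issue as given. The function names `exceeds_limit` and `oversized_rows` are hypothetical and are not part of Embulk or the plugin.

```python
# Limit quoted in this issue; taken as given here, not checked against Redshift.
LIMIT_BYTES = 65535


def exceeds_limit(value: str, limit: int = LIMIT_BYTES) -> bool:
    """Return True if the UTF-8 encoding of value is longer than limit bytes."""
    return len(value.encode("utf-8")) > limit


def oversized_rows(rows, column, limit: int = LIMIT_BYTES):
    """Yield (row_index, byte_length) for rows whose column value is too long."""
    for index, row in enumerate(rows):
        size = len(row[column].encode("utf-8"))
        if size > limit:
            yield index, size
```

Multi-byte characters reach the limit sooner than the character count suggests, so a check like this is best run on the encoded bytes, as above.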

The following settings work as a workaround, but I would like to see official support for the SUPER type:

in:
  type: file
  path_prefix: ./input.csv
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ","
    quote: "'"
    escape: "'"
    null_string: "NULL"
    skip_header_lines: 1
    columns:
      - { name: col1, type: long }
      - { name: col2, type: string }
      - { name: col3, type: json }
out:
  type: redshift
  host: xxxxxx
  user: xxxxxx
  password: xxxxxx
  database: dev
  table: sample_table
  aws_access_key_id: xxxxxx
  aws_secret_access_key: xxxxxx
  iam_user_name: xxxxxx
  s3_bucket: xxxxxx
  s3_key_prefix: xxxxxx
  mode: insert
  column_options:
    col1: { type: "INTEGER" }
    col2: { type: "VARCHAR(255)" }
    col3: { type: "SUPER", value_type: json }