bryan-lott / mysqldump-to-redshift

Convert a standard mysqldump into a Redshift-importable format.
Eclipse Public License 1.0
5 stars 0 forks source link

mysqldump-to-redshift 1.0.1

A simple script to convert from a mysqldump file of a single table to an "importable" Redshift data file.

Installation

Make sure to have at least Java 1.7 on the target machine

Build the project

$ lein uberjar

Now copy the standalone jar file to the target machine.

Run the Tests

$ lein tests

These currently include tests for fix-format and extract-values. In development these were the most likely places to break by far.

Prebuilt

Copy the prebuilt jar file in target to the target machine and execute according to the usage below.

Usage

$ java -jar mysqldump-to-redshift.jar [mysql dump file] [output_file]

Please note: This will overwrite output_file.

Once the script finishes, use s3cmd to copy the file to an S3 bucket (if the file is large, it is recommended to gzip the file.

$ s3cmd put output_file s3://your-bucket

Once that's done, use a version of the following command to load it into your table.

COPY [table name] from 's3://your-bucket/output_file' credentials 'aws_access_key_id=[key];aws_secret_access_key=[secret key]' delimiter '\t' EMPTYASNULL;

Remember to add the GZIP parameter if you've gzipped the file before loading to s3.

Options

Examples

$ java -jar mysqldump-to-redshift.jar mysqldump_table.sql redshift_import.tsv

$ java -jar mysqldump-to-redshift.jar mysqldump_table.sql redshift_import.tsv && gzip redshift_import.tsv && s3cmd put redshift_import.tsv s3://your-bucket

$ java -jar mysqldump-to-redshift.jar mysqldump_table.sql redshift_import.tsv && gzip redshift_import.tsv && gzip redshift_import.tsv && s3cmd put redshift_import.tsv.gz s3://your-bucket

Warnings

Roadmap

License

Copyright © 2016 Bryan Lott

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.