HearthSim / docker-pgredshift

Redshift docker image based on postgres
https://hub.docker.com/repository/docker/hearthsim/pgredshift
MIT License

docker-pgredshift

A docker image based on Debian with PostgreSQL which simulates an AWS Redshift instance.

Why?

Amazon Redshift is close enough to, and compatible enough with, Postgres that you can use a lot of Postgres tooling and queries with it transparently. But some of its features, and some of its slight differences from Postgres, can be harder to work around.

Amazon does not make a local instance of Redshift available, nor is Redshift open source. This is especially annoying if you are writing tests against code which has to run queries containing Redshift-specific syntax. Postgres will normally reject such queries unless you mock the features in some way.

That's what this project is. It's not meant to run in production, but it is meant to help mock Redshift's features for testing purposes.

PLEASE NOTE: As of July 2018, very little is implemented. PRs welcome.

Key differences

The ultimate goal of pgredshift is to be as close as possible to the real Redshift in terms of feature parity. However, some key differences will remain:

Features

The pgredshift image is built on top of Debian "Buster".

plpythonu

The image is built with plpythonu (Python 2.7) language support. More information: https://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html

The following packages are installed:

The image also includes pip, setuptools and wheel for Python 2.7.

plpython3u

The image is built with plpython3u (Python 3.6) language support. Although Redshift does not support Python 3, you may use this to help ensure compatibility of UDFs across Python 2 and 3.
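
As a minimal sketch of what compatibility across Python 2 and 3 can look like, the function below is written to give the same result under Python 2.7 and Python 3.6. The name average_rating and its arguments are hypothetical and are not part of this image. In a real UDF, only the statements inside the function would form the body of a CREATE FUNCTION definition using LANGUAGE plpythonu or plpython3u.

```python
def average_rating(total_points, num_votes):
    """Hypothetical UDF logic, written to behave the same on Python 2.7 and 3.6."""
    # SQL NULL arrives as None in PL/Python; propagate it instead of raising.
    if total_points is None or num_votes is None:
        return None
    # Avoid ZeroDivisionError; returning None maps back to SQL NULL.
    if num_votes == 0:
        return None
    # "/" between two ints floors on Python 2 but not on Python 3,
    # so convert explicitly to get true division on both.
    return float(total_points) / num_votes


# The same call yields the same value under either interpreter.
print(average_rating(7, 2))
```

Developing the logic as a plain function like this makes it easy to run the same test file under both interpreters before pasting the body into a UDF.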

The image includes pip, setuptools and wheel for Python 3.6.

Postgres extensions

SET query_group

The query_group extension adds support for the SET query_group to ... command. Postgres rejects SET on variables it does not recognize, so including this extension prevents an error when the command is issued. Note that the value is ignored, as query groups themselves are not implemented.

Reference: https://docs.aws.amazon.com/redshift/latest/dg/r_query_group.html
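
The sketch below shows how application code might issue the command through a Python DB-API cursor. It assumes a driver that interpolates parameters on the client side with %s placeholders, as psycopg2 does. The helper set_query_group, the group name and the RecordingCursor stand-in are hypothetical; the stand-in only exists so the example runs without a database.

```python
class RecordingCursor(object):
    """Stand-in for a DB-API cursor so the sketch runs without a database."""

    def __init__(self):
        self.statements = []

    def execute(self, sql, params=None):
        # Record what would have been sent to the server.
        self.statements.append((sql, params))


def set_query_group(cursor, group_name):
    """Issue SET query_group through a DB-API cursor (hypothetical helper)."""
    # On pgredshift the value is accepted and then ignored.
    cursor.execute("SET query_group TO %s", (group_name,))


cursor = RecordingCursor()
set_query_group(cursor, "nightly_reports")
print(cursor.statements)
```

Against a real connection, the same helper would be called with a cursor obtained from the driver instead of RecordingCursor.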

Additional tables

Redshift system tables are implemented in 00_stl_tables.sql and 00_stv_tables.sql. Expect them to be empty or to contain garbage data; they are there so that SELECTs against them won't necessarily fail, not to provide meaningful results.
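
Because the tables may be empty, code that reads them should treat an empty result as normal. The sketch below illustrates this with a hypothetical helper, recent_query_ids. It assumes that a table named stl_query exists with query and starttime columns, following Redshift's documented STL_QUERY layout; whether this image defines that table with those columns is not confirmed here. EmptyTableCursor is a stand-in so the example runs without a database.

```python
class EmptyTableCursor(object):
    """Stand-in cursor that behaves like a query against an empty table."""

    def execute(self, sql, params=None):
        # Keep the statement around for inspection; nothing is sent anywhere.
        self.last_sql = sql

    def fetchall(self):
        return []


def recent_query_ids(cursor, limit=10):
    """Return query ids from stl_query, newest first (hypothetical helper)."""
    # Assumed table and column names; see the note above.
    cursor.execute(
        "SELECT query FROM stl_query ORDER BY starttime DESC LIMIT %s",
        (limit,),
    )
    # An empty list is a valid outcome on pgredshift, not an error.
    return [row[0] for row in cursor.fetchall()]


ids = recent_query_ids(EmptyTableCursor())
print(len(ids))
```

Tests written against the image should therefore check that such queries run, rather than asserting on the rows they return.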

Additional functions

Additional functions are implemented as Python or SQL UDFs. For a list, see sql/01_functions.sql.

License

This project is dual-licensed under the MIT license and the PostgreSQL license. You may choose whichever license suits your purpose best. The full license texts are available in the LICENSE (MIT) and LICENSE.PostgreSQL (PostgreSQL License) files.