jakob / Postico

Public issue tracking for Postico
https://eggerapps.at/postico/
475 stars 9 forks source link

Add support for Redshift Spectrum External Tables #382

Open andrewgross opened 7 years ago

andrewgross commented 7 years ago

Redshift recently launched Spectrum which allows querying data stored in S3 (similar to Athena, but different implementation.

When you create your external schema, it properly shows up in the sidebar, however it does not populate any table info, making it tricky to query the table properly. It would be nice if Postico could list the tables and their schemas.

To list tables: select * from svv_external_tables

This returns the following columns:

schemaname
tablename
location
input_format
output_format
serialization_lib
serde_parameters
compressed
parameters

To list table columns (and tables incidentally): select * from svv_external_columns

This returns the following columns:

schemaname
tablename
columnname
external_type
columnnum
part_key

Probably simplest to just use svv_external_columns to get the data. It would be nice if the schema showed with a different color on the sidebar to indicate that it was external as well.

jakob commented 7 years ago

I've started to work on adding support for Redshift spectrum tables. Postico can now display external tables.

Support is very basic. Column names and types are displayed in structure view, and the table view treats all columns like text columns (eg. when using filters). I still need to fix this.

I'd welcome your input! Please try the latest build from https://eggerapps.at/postico/builds/

There is a bug where the structure view incorrectly thinks that changes have been made -- I still need to fix this too.

andrewgross commented 7 years ago

Excellent, we will check it out!

andrewgross commented 7 years ago

So far this is working great, thank you very much for adding this.

andrewgross commented 7 years ago

Been using this for a while, no real issues. To reduce confusion you can eliminate a few options from the right click menus:

You can remove Truncate and the entire Maintenance menu. Additionally, the export feature can probably be removed. It would be possible to generate s3 links to the underlying data, but that seems like overkill.

jakob commented 7 years ago

Is there a reason why you’d remove the export function? Is it broken, or do you just think it’s not useful in practice?

andrewgross commented 7 years ago

Just seems incongruous with how the data is stored in s3. It can be partitioned, multiple large files, non-csv format etc. There might be some way of doing where it streams the data down from S3 to the redshift and exports it normally? Spectrum is pushed as being for huge datasets that don't well in Redshift, so exporting a whole table seems to go against that.

andrewgross commented 7 years ago

I spent some time talking with the AWS engineers and they recommended using svv_tables and svv_columns because they have a merged view of local and external tables. Unsure if this will help clean up the code but I figured I should mention it.