Redshift is a Cloud Data Warehouse. It is a managed service, with a multi-node MPP (Massively Parallel Processing) architecture, and capable of scaling to many nodes.
Comparable Cloud Data Warehouses are SnowFlake, and Azure SQL Data Warehouse.
You can use pip to install cloudmesh-redshift.
$ pip install cloudmesh-cmd5
$ pip install cloudmesh-sys
$ pip install cloudmesh-cloud
$ pip install cloudmesh-redshift
$ git clone https://github.com/cloudmesh/cloudmesh-redshift.git
$ cd cloudmesh-redshift
$ pip install -e .
The external libraries needed are
To create an AWS account refer to this page from the manual
To configure Cloudmesh for the AWS account, here are the keys to edit. Configuration via "~/.cloudmesh/cloudmesh.yaml"
MongoDB is used to store the collections that allow for status passing, results and so on. So, MongoDB will be needed to be started.
$ cms admin mongo start
Here is how to install MongoDB using the cloudmesh shell.
$ cms admin mongo install
This set of interfaces allows you to use AWS Redshift in both admin mode and regular user mode.
Admin usage allows you to administer the AWS Redshift cluster
A single-node or a multi-node cluster may be created.
Networking changes (like allowing port access) are done, to enable access to the AWS Redshift cluster from external programming tools like Python.
Change the number of nodes in the cluster.
Alter the sizes of the individual nodes to other sizes - dc2.large, ds2.xlarge, dc2.8xlarge
Change the cluster id for the cluster.
Change the master password of the cluster
These are power users who can usage allows for usage of the AWS Redshift cluster.
DDL Statements (like CREATE TABLE, ALTER TABLE, DROP TABLE) can be run from a file.
DML Statements (like INSERT, UPDATE, DELETE) can be run from a file to insert or alter data in tables in the database.
User level usage allows for usage of the AWS Redshift cluster.
SELECT queries to retrieve data from the cluster database can be run from the command line to retrieve data.
cms redshift runquery db awsuser AWSPass321 cl3.ced9iqbk50ks.us-west-2.redshift.amazonaws.com 5439 --querytext='"select empname from emp where empid=20;"'
Here is an example of a single-node cluster
cms redshift create my-cl1 db1 awsuser AWSPassword321 --nodetype=dc2.large --type=single-node
Similarly, for multi-node
cms redshift create my-cl2 db1 awsuser AWSPassword321 --nodetype=dc2.large --type=multi-node --nodes=3
To view all existing clusters
cms redshift describe
To view details of a specific cluster
cms redshift describe my-cl3
cms redshift modify cl11 --newpass MyPassword321
cms redshift modify cl11 --newid cl12
cms redshift resize my-cl11 --type='multi-node' --nodes=4
cms redshift resize my-cl21 --nodetype='ds2.xlarge' --nodes=2
cms redshift delete my-cl3
cms redshift runddl db1 awsuser AWSPass321 cl6.ced9iqbk50ks.us-west-2.redshift.amazonaws.com 5439 --ddlfile=./redshiftddlfile.sql
cms redshift rundml db1 awsuser AWSPass321 cl6.ced9iqbk50ks.us-west-2.redshift.amazonaws.com 5439 --dmlfile=./redshiftdmlfile.sql
cms redshift demoschema db awsuser AWSPass321 cl3.xxxxxx.us-west-2.redshift.amazonaws.com 5439 --createschema
To query the demo schema table EMP,
cms redshift runquery db awsuser AWSPass321 cl3.xxxxxx.us-west-2.redshift.amazonaws.com 5439 --empcount
cms redshift demoschema db awsuser AWSPass321 cl3.xxxxxx.us-west-2.redshift.amazonaws.com 5439 --deleteschema
Here is an example of a single-node cluster
curl -X POST "http://localhost:8080/api/redshift/v1/cluster/cl123?dbName=db1&masterUserName=awsuser1&passWord=AWSPassWord1&nodeType=dc2.large&clusterType=single-node" -H "accept: application/json"
Similarly, for multi-node
curl -X POST "http://localhost:8080/api/redshift/v1/cluster/cl123?dbName=db1&masterUserName=awsuser1&passWord=AWSPassWord1&nodeType=dc2.large&clusterType=multi-node&nodeCount=2" -H "accept: application/json"
To view all existing clusters
curl -X GET "http://localhost:8080/api/redshift/v1/clusters" -H "accept: application/json"
To view details of a specific cluster
curl -X GET "http://localhost:8080/api/redshift/v1/cluster/123" -H "accept: application/json"
curl -X PATCH "http://localhost:8080/api/redshift/v1/cluster/cl123/changepassword?newPass=PassAWSword321" -H "accept: application/json"
curl -X PATCH "http://localhost:8080/api/redshift/v1/cluster/cl123/rename?newId=cl456" -H "accept: application/json"
curl -X PATCH "http://localhost:8080/api/redshift/v1/cluster/cl123/resize?clusterType=multi-node&nodeCount=3&nodeType=dc2.large" -H "accept: application/json"
curl -X PATCH "http://localhost:8080/api/redshift/v1/cluster/cl123/changenodetype?clusterType=multi-node&nodeType=ds2.xlarge" -H "accept: application/json"
curl -X DELETE "http://localhost:8080/api/redshift/v1/cluster/cl123" -H "accept: application/json"
curl -X PATCH "http://localhost:8080/cloudmesh/redshift/v1/cluster/cl123/runDDL?dbName=db1&host=cl8.ced9iqbk50ks.us-west-2.redshift.amazonaws.com&port=5439&userName=awsuser&passWord=AWSPass123&sql_file_contents=Q1JFQVRFIFRBQkxFIEVNUChFTVBfSUQgSU5ULCBFTVBfTkFNRSBWQVJDSEFSKDEyMCkpOwpDUkVBVEUgVEFCTEUgREVQVCAoREVQVF9JRCBJTlQsIEROQU1FIFZBUkNIQVIoODApKTsKQ1JFQVRFIFRBQkxFIEFTU0lHTiAoRU1QSUQgSU5ULCBERVBUX0lEIElOVCk7Cg%3D%3D"
curl -X PATCH "http://localhost:8080/cloudmesh/redshift/v1/cluster/cl123/runDML?dbName=db1&host=cl8.ced9iqbk50ks.us-west-2.redshift.amazonaws.com&port=5439&userName=awsuser&passWord=AWSPass123&sql_file_contents=SU5TRVJUIElOVE8gRU1QIFZBTFVFUyAoMTAsICdzbWl0aCcpOwpJTlNFUlQgSU5UTyBFTVAgVkFMVUVTICgyMCwgJ2pvbmVzJyk7CklOU0VSVCBJTlRPIEVNUCBWQUxVRVMgKDMwLCAnc2NvdHQnKTsKCg%3D%3D"
To query ,
curl -X PATCH "http://localhost:8080/cloudmesh/redshift/v1/cluster/cl8/runQuery?dbName=db1&host=cl8.ced9iqbk50ks.us-west-2.redshift.amazonaws.com&port=5439&userName=awsuser&passWord=AWSPass123&queryText=SELECT%20count(*)%20from%20EMP%3B" -H "accept: application/json"
As mentioned above, there are typically 3 levels to the interface
Security for these APIs can be granted accordingly.
For most of the Cluster operations and DDL (Schema operations), it is suggested that the swagger API would have some "admin" level security.
For example
security:
- OAuth2: [admin] # Use OAuth with a different scope
responses:
'200':
description: OK
'401':
description: Not authenticated
'403':
description: Access token does not have the required scope
Each admin command in redshift (create, modify, delete) takes upwards of 2 minutes to execute. Hence the tests are NOT listed as test, but instead of t1. It is recommended to run pytest test methods individually, and not one after another. The waiter method of AWS RedShift was attempted, but its behavior was not found to be predictable.
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift create my-cl1 db1 awsuser | 3.72 | ('USL012826',) | Darwin | 10.14.4 | |
| AWSPass321 --nodetype=dc2.large | | | | | |
| --type=single-node | | | | | |
+------------------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift create my-cl2 db1 awsuser | 3.78 | ('USL012826',) | Darwin | 10.14.4 | |
| AWSPass321 --nodetype=dc2.large | | | | | |
| --type=multi-node --nodes=2 | | | | | |
+------------------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+-------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+-------------------+------+----------------+--------+-------------+-------------+
| describe clusters | 1.08 | ('USL012826',) | Darwin | 10.14.4 | |
+-------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+---------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+---------------------------+------+----------------+--------+-------------+-------------+
| cms redshift describe cl9 | 3.22 | ('USL012826',) | Darwin | 10.14.4 | |
+---------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift allowaccess cl8 | 4.83 | ('USL012826',) | Darwin | 10.14.4 | |
+------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------+------+----------------+--------+-------------+-------------+
| resize cluster to multi node | 1.38 | ('USL012826',) | Darwin | 10.14.4 | |
+------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift modify cl9 --newpass | 3.73 | ('USL012826',) | Darwin | 10.14.4 | |
| MyNewPass321 | | | | | |
+------------------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+-------------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+-------------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift modify cl8 --newid cl9 | 3.34 | ('USL012826',) | Darwin | 10.14.4 | |
+-------------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift runddl db1 awsuser | 4.21 | ('USL012826',) | Darwin | 10.14.4 | |
| AWSPass321 cl8.ced9iqbk50ks.us- | | | | | |
| west-2.redshift.amazonaws.com 5439 | | | | | |
| --ddlfile=./redshiftddlfile.sql | | | | | |
+------------------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift rundml db1 awsuser | 3.19 | ('USL012826',) | Darwin | 10.14.4 | |
| AWSPass321 cl8.ced9iqbk50ks.us- | | | | | |
| west-2.redshift.amazonaws.com 5439 | | | | | |
| --dmlfile=./redshiftdmlfile.sql | | | | | |
+------------------------------------------+------+----------------+--------+-------------+-------------+
+------------------+--------------------------------------------------------------------------------------------------+
| Machine Arribute | Time/s |
+------------------+--------------------------------------------------------------------------------------------------+
| mac_version | 10.14.4 |
| machine | ('x86_64',) |
| node | ('USL012826',) |
| platform | Darwin-18.5.0-x86_64-i386-64bit |
| processor | ('i386',) |
| processors | Darwin |
| python | 3.7.2 (default, Apr 11 2019, 21:32:35) |
| | [Clang 10.0.1 (clang-1001.0.46.3)] |
| release | ('18.5.0',) |
| sys | darwin |
| system | Darwin |
| user | shirish_joshi |
| version | Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT 2019; root:xnu-4903.251.3~3/RELEASE_X86_64 |
| win_version | |
+------------------+--------------------------------------------------------------------------------------------------+
+------------------------------------------+------+----------------+--------+-------------+-------------+
| timer | time | node | system | mac_version | win_version |
+------------------------------------------+------+----------------+--------+-------------+-------------+
| cms redshift runquery db1 awsuser | 0.04 | ('USL012826',) | Darwin | 10.14.4 | |
| AWSPass321 cl8.ced9iqbk50ks.us- | | | | | |
| west-2.redshift.amazonaws.com 5439 | | | | | |
| --querytext='"select empname from emp whe| | | | | |
| re empid=20;"' | | | | | |
+------------------------------------------+------+----------------+--------+-------------+-------------+
Note: I wasn't able to capture these, as the swagger server got generated with "do some magic".