Updates Pipfile.lock file s3_to_redshift/Pipfile.lock with "pipenv install" , which locks the exact versions of dependencies, ensuring reproducible builds with Python 3.8.
Testing intructions
For all scripts to be tested, connect to microservicessm ec2 node and ensure that you are on the testing branch **'GDXDSD-6711_python_3.8_s3_to_redshift_testbranch'** for all scripts. Testing branch has the necessary testing setup.
Run the script with Python v3.8 and confirm that the objects are ingested from S3 to Redshift
/home/microservice/.local/bin/pipenv run python3.8 s3_to_redshift.py config.d/GDXDSD-6711_test.microservice_iam_key_rotation.json
$ /home/microservice/.local/bin/pipenv run python3.8 s3_to_redshift.py config.d/GDXDSD-6711_test.microservice_iam_key_rotation.json
Report: s3_to_redshift.py
Config: config.d/GDXDSD-6711_test.microservice_iam_key_rotation.json
Microservice started at: 2024-07-19 16:12:41-0700 (PDT), ended at: 2024-07-19 16:13:01-0700 (PDT), elapsing: 0:00:19.800622.
Objects to process: 1
Objects successfully processed: 1
Objects that failed to process: 0
Objects output to 'processed/good': 1
Objects output to 'processed/bad': 0
Objects loaded to Redshift: 1
Empty Objects: 0
List of objects successfully fully ingested from S3, processed, loaded to S3 ('good'), and copied to Redshift:
1: client/test_microservice-iam-key/GDXDSD-6711-manual-v1.csv
Check the table value for the verification_phrase ' phrase_20240719
microservice_rsSELECT * FROM test.microservice_iam_key_rotation LIMIT 10;\q
$ microservice_rs
SET
psql (9.2.24, server 8.0.2)
WARNING: psql version 9.2, server version 8.0.
Some psql features might not work.
SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.
snowplow=> SELECT * FROM test.microservice_iam_key_rotation LIMIT 10;
verification_phrase
---------------------
phrase_20240719
(1 row)
snowplow=> \q
$ microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_groups.sql
SET
CREATE TABLE
GRANT
$ microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_user_access.sql
SET
CREATE TABLE
GRANT
$ microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_user_activity.sql
SET
CREATE TABLE
GRANT
$ microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_user_groups.sql
SET
CREATE TABLE
GRANT
Run the script with Python v3.8 and confirm that the objects are ingested from S3 to Redshift
/home/microservice/.local/bin/pipenv run python3.8 cmslite_user_data_to_redshift.py config.d/GDXDSD-6711_cmslite_user_group_metadata.json
Report cmslite_user_data_to_redshift.py
$ /home/microservice/.local/bin/pipenv run python3.8 cmslite_user_data_to_redshift.py config.d/GDXDSD-6711_cmslite_user_group_metadata.json
Report cmslite_user_data_to_redshift.py:
Config: config.d/GDXDSD-6711_cmslite_user_group_metadata.json
Microservice started at: 2024-07-23 09:20:57-0700 (PDT), ended at: 2024-07-23 09:22:51-0700 (PDT), elapsing: 0:01:54.406056.
Objects to process: 1
Objects successfully processed: 1
Objects that failed to process: 0
Objects output to 'processed/good': 1
Objects output to 'processed/bad': 0
Tables loaded to Redshift: 4/4
List of objects successfully fully ingested from S3, processed, loaded to S3 ('good'), and copied to Redshift:
client/oz_test/GDXDSD-6711/cmslite_gdx/cms-analytics-csv.2024-05-01.tgz
List of tables that were successfully loaded into Redshift:
test.gdxdsd_6711_cms_group
test.gdxdsd_6711_user_activity
test.gdxdsd_6711_user_group
test.gdxdsd_6711_user_status
$ microservice_rs < ddl/GDXDSD-6711-test.derived.asset_downloads_derived.sql
SET
CREATE TABLE
GRANT
Run the script with Python v3.8 and confirm that the objects are ingested from S3 to Redshift
/home/microservice/.local/bin/pipenv run python3.8 build_derived_assets.py config.d/GDXDSD-6711_intranet_assets.json
$ /home/microservice/.local/bin/pipenv run python3.8 build_derived_assets.py config.d/GDXDSD-6711_intranet_assets.json
Report build_derived_assets.py:
Config: config.d/GDXDSD-6711_intranet_assets.json
Microservice started at: 2024-07-23 11:19:05-0700 (PDT), ended at: 2024-07-23 11:19:32-0700 (PDT), elapsing: 0:00:27.133444.
Objects to process: 1
Objects that failed to process: 0
Objects loaded to Redshift: 1
List of tables successfully parsed, and copied to the derived table in Redshift:
test.gdxdsd_6711_asset_downloads
$ microservice_rs
SET
psql (9.2.24, server 8.0.2)
WARNING: psql version 9.2, server version 8.0.
Some psql features might not work.
SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256)
Type "help" for help.
snowplow=> TRUNCATE TABLE test.microservice_iam_key_rotation;
TRUNCATE TABLE and COMMIT TRANSACTION
snowplow=> SELECT * FROM test.microservice_iam_key_rotation LIMIT 10;
verification_phrase
---------------------
(0 rows)
cmslite_user_data_to_redshift.py
DROP TABLE IF EXISTS test.gdxdsd_6711_cms_group;DROP TABLE IF EXISTS test.gdxdsd_6711_user_activity;DROP TABLE IF EXISTS test.gdxdsd_6711_user_group;DROP TABLE IF EXISTS test.gdxdsd_6711_user_status;
snowplow=> DROP TABLE IF EXISTS test.gdxdsd_6711_cms_group;
DROP TABLE
snowplow=> DROP TABLE IF EXISTS test.gdxdsd_6711_user_activity;
DROP TABLE
snowplow=> DROP TABLE IF EXISTS test.gdxdsd_6711_user_group;
DROP TABLE
snowplow=> DROP TABLE IF EXISTS test.gdxdsd_6711_user_status;
DROP TABLE
DROP TABLE IF EXISTS test.gdxdsd_6711_asset_downloads;DROP TABLE IF EXISTS test.gdxdsd_6711_asset_downloads_derived;
snowplow=> DROP TABLE IF EXISTS test.gdxdsd_6711_asset_downloads;
DROP TABLE
snowplow=> DROP TABLE IF EXISTS test.gdxdsd_6711_asset_downloads_derived;
DROP TABLE
This PR does the following:
Testing intructions
For all scripts to be tested, connect to microservicessm ec2 node and ensure that you are on the testing branch **'GDXDSD-6711_python_3.8_s3_to_redshift_testbranch'** for all scripts. Testing branch has the necessary testing setup.
microservice_ssm
cd /home/microservice/branch/GDXDSD-6711_python_3.8_s3_to_redshift/s3_to_redshift
git switch GDXDSD-6711_python_3.8_s3_to_redshift_test_branch
s3_to_redshift.py
Run the script with Python v3.8 and confirm that the objects are ingested from S3 to Redshift
/home/microservice/.local/bin/pipenv run python3.8 s3_to_redshift.py config.d/GDXDSD-6711_test.microservice_iam_key_rotation.json
Check the table value for the verification_phrase ' phrase_20240719
microservice_rs
SELECT * FROM test.microservice_iam_key_rotation LIMIT 10;
\q
cmslite_user_data_to_redshift.py
Create test tables
microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_groups.sql
microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_user_access.sql
microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_user_activity.sql
microservice_rs < ddl/GDXDSD-6711-test.gdxdsd_6711_user_groups.sql
Run the script with Python v3.8 and confirm that the objects are ingested from S3 to Redshift
asset_data_to_redshift.py
Create test tables
microservice_rs <ddl/GDXDSD-6711-test.asset_downloads.sql
Run the script with Python v3.8 and confirm that the objects are ingested from S3 to Redshift
/home/microservice/.local/bin/pipenv run python3.8 asset_data_to_redshift.py config.d/GDXDSD-6711_intranet_assets.json
build_derived_assets.py
Create test tables
microservice_rs < ddl/GDXDSD-6711-test.derived.asset_downloads_derived.sql
Run the script with Python v3.8 and confirm that the objects are ingested from S3 to Redshift
/home/microservice/.local/bin/pipenv run python3.8 build_derived_assets.py config.d/GDXDSD-6711_intranet_assets.json
Cleanup
s3_to_redshift.py:
microservice_rs
TRUNCATE TABLE test.microservice_iam_key_rotation;
cmslite_user_data_to_redshift.py
DROP TABLE IF EXISTS test.gdxdsd_6711_cms_group;
DROP TABLE IF EXISTS test.gdxdsd_6711_user_activity;
DROP TABLE IF EXISTS test.gdxdsd_6711_user_group;
DROP TABLE IF EXISTS test.gdxdsd_6711_user_status;
asset_data_to_redshift.py & build_derived_assets.py
DROP TABLE IF EXISTS test.gdxdsd_6711_asset_downloads;
DROP TABLE IF EXISTS test.gdxdsd_6711_asset_downloads_derived;