Splits the two microservices currently in sfts/ into the folders redshift_to_s3/ and s3_to_sfts/ while preserving the history of each file
Modifies the configs for each microservice to remove parts that were only needed because the two microservices shared the config files
Edits the README files so that each one is specific to its own microservice
To review the changes:
The sfts/ folder has been renamed to redshift_to_s3
The folder s3_to_sfts/ has been created
s3_to_sfts.py has been moved to the s3_to_sfts/ folder
The README, files related to the pipfiles, and config.d/ and its contents have been copied into s3_to_sfts
The SDPR configs have been removed from s3_to_sfts as they are never uploaded there
Options related to SFTS have been removed from the configs in redshift_to_s3, along with the references to them in its README
Options related to dml, sql, and dates have been removed from the configs in s3_to_sfts, along with the references to them in its README
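One way to check that file histories survived the move is `git log --follow`, which continues a single file's history across renames. The sketch below is only an illustration: the helper names are hypothetical, and the example paths are assumed from the folder names in this PR.

```python
import subprocess


def follow_log_command(path):
    """Build a `git log` command that follows one file's history across renames."""
    # --follow only works for a single path, so build one command per file.
    return ["git", "log", "--follow", "--oneline", "--", path]


def history(path, repo_dir="."):
    """Return the one-line commit summaries for `path`.

    Must be run inside a git checkout; raises CalledProcessError otherwise.
    """
    result = subprocess.run(
        follow_log_command(path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Example (assumed paths, run from the repository root):
#   for path in ("s3_to_sfts/s3_to_sfts.py", "redshift_to_s3/redshift_to_s3.py"):
#       print(path, len(history(path)), "commits")
```

If the history was kept, a moved file should list commits that predate this branch rather than only the commit that moved it.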
Testing these changes requires modifications to the config files. You can see these modifications in the zip file attached to this ticket
Testing redshift_to_s3
Review its README to see if it makes sense or if anything needs to be changed
Log in to the EC2 instance using the following commands
awsmfa prod <AWS OTP>
microservice_ssm
cd /home/microservice/branch/GDXDSD-5362-split-sfts-microservices-into-separate-folders/redshift_to_s3
Run the following commands and compare their output to what's expected. Note that the commands for pmrp_qdata_range and sdpr_historical each take minutes to run
pipenv run python redshift_to_s3.py -c config.d/pmrp_all.json
pipenv run python redshift_to_s3.py -c config.d/pmrp_date_range.json
pipenv run python redshift_to_s3.py -c config.d/pmrp_max_date.json
pipenv run python redshift_to_s3.py -c config.d/pmrp_qdata_daily.json
pipenv run python redshift_to_s3.py -c config.d/pmrp_qdata_dates.json
pipenv run python redshift_to_s3.py -c config.d/pmrp_qdata_range.json
pipenv run python redshift_to_s3.py -c config.d/sdpr_historical.json
pipenv run python redshift_to_s3.py -c config.d/sdpr_last_full_day.json
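The eight commands above differ only in the config name, so they could be generated rather than typed one by one. This is a minimal sketch; `build_commands` and `as_shell_lines` are hypothetical helpers, not part of the microservice.

```python
import shlex

# Config names taken from the commands listed above.
REDSHIFT_TO_S3_CONFIGS = [
    "pmrp_all",
    "pmrp_date_range",
    "pmrp_max_date",
    "pmrp_qdata_daily",
    "pmrp_qdata_dates",
    "pmrp_qdata_range",
    "sdpr_historical",
    "sdpr_last_full_day",
]


def build_commands(script, config_names):
    """Return one argument list per config, mirroring the manual commands."""
    return [
        ["pipenv", "run", "python", script, "-c", f"config.d/{name}.json"]
        for name in config_names
    ]


def as_shell_lines(commands):
    """Render argument lists as shell-safe command strings for review."""
    return [shlex.join(command) for command in commands]


# Example (run from the redshift_to_s3 folder):
#   import subprocess
#   for command in build_commands("redshift_to_s3.py", REDSHIFT_TO_S3_CONFIGS):
#       subprocess.run(command, check=True)
```

The same helper would work for s3_to_sfts.py by passing that script name and leaving out the two SDPR configs.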
Review the s3_to_sfts README to see if it makes sense or if anything needs to be changed
Review the histories of the files in s3_to_sfts to confirm that they were preserved when the files were moved or copied
Navigate to the s3_to_sfts folder using the following command
cd /home/microservice/branch/GDXDSD-5362-split-sfts-microservices-into-separate-folders/s3_to_sfts
Run the following commands and compare their output to what's expected. Note that the command for pmrp_qdata_range takes minutes to run and that there are no commands for SDPR
pipenv run python s3_to_sfts.py -c config.d/pmrp_all.json
pipenv run python s3_to_sfts.py -c config.d/pmrp_date_range.json
pipenv run python s3_to_sfts.py -c config.d/pmrp_max_date.json
pipenv run python s3_to_sfts.py -c config.d/pmrp_qdata_daily.json
pipenv run python s3_to_sfts.py -c config.d/pmrp_qdata_dates.json
pipenv run python s3_to_sfts.py -c config.d/pmrp_qdata_range.json
5. Check to see if the files appear in the s3 processed good bucket:
- pmrp_all: https://s3.console.aws.amazon.com/s3/buckets/sp-ca-bc-gov-131565110619-12-microservices?region=ca-central-1&prefix=processed/good/client/doug_test/GDXDSD-5362/pmrp_gdx/pmrp_all/&showversions=false
- pmrp_date_range: https://s3.console.aws.amazon.com/s3/buckets/sp-ca-bc-gov-131565110619-12-microservices?region=ca-central-1&prefix=processed/good/client/doug_test/GDXDSD-5362/pmrp_gdx/pmrp_date_range/&showversions=false
- pmrp_max_date: https://s3.console.aws.amazon.com/s3/buckets/sp-ca-bc-gov-131565110619-12-microservices?region=ca-central-1&prefix=processed/good/client/doug_test/GDXDSD-5362/pmrp_gdx/pmrp_max_date/&showversions=false
- pmrp_qdata_daily: https://s3.console.aws.amazon.com/s3/buckets/sp-ca-bc-gov-131565110619-12-microservices?region=ca-central-1&prefix=processed/good/client/doug_test/GDXDSD-5362/pmrp_qdata/daily/Jun_2022_change/&showversions=false
- pmrp_qdata_dates: https://s3.console.aws.amazon.com/s3/buckets/sp-ca-bc-gov-131565110619-12-microservices?region=ca-central-1&prefix=processed/good/client/doug_test/GDXDSD-5362/pmrp_qdata/dates/Jun_2022_change/&showversions=false
- pmrp_qdata_range: https://s3.console.aws.amazon.com/s3/buckets/sp-ca-bc-gov-131565110619-12-microservices?region=ca-central-1&prefix=processed/good/client/doug_test/GDXDSD-5362/pmrp_qdata/range/Jun_2022_change/&showversions=false
6. Check to see if the files appear in SFTS
- pmrp_all: https://filetransfer.gov.bc.ca/human.aspx?r=1052649598&arg06=933537552&arg12=filelist
- pmrp_date_range: https://filetransfer.gov.bc.ca/human.aspx?r=1052649598&arg06=933537552&arg12=filelist
- pmrp_max_date: https://filetransfer.gov.bc.ca/human.aspx?r=1052649598&arg06=933537552&arg12=filelist
- pmrp_qdata_daily: https://filetransfer.gov.bc.ca/human.aspx?r=1742300809&orgid=9585&rd=1
- pmrp_qdata_dates: https://filetransfer.gov.bc.ca/human.aspx?r=1742300809&orgid=9585&rd=1
- pmrp_qdata_range: https://filetransfer.gov.bc.ca/human.aspx?r=1742300809&orgid=9585&rd=1
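The S3 check in step 5 could also be done from a script instead of the console. The sketch below pulls the bucket and prefix out of one of the console URLs above and lists the keys under that prefix. The helper names are hypothetical, and `s3_client` is assumed to be a `boto3.client("s3")` with access to the bucket.

```python
from urllib.parse import parse_qs, urlsplit


def bucket_and_prefix(console_url):
    """Extract (bucket, prefix) from an S3 console URL like those in step 5."""
    parts = urlsplit(console_url)
    # The bucket name is the last path segment, after /s3/buckets/.
    bucket = parts.path.rstrip("/").split("/")[-1]
    # The key prefix is carried in the `prefix` query parameter.
    prefix = parse_qs(parts.query)["prefix"][0]
    return bucket, prefix


def list_keys(s3_client, bucket, prefix):
    """Return the object keys under `prefix`.

    A single list_objects_v2 call returns at most 1000 keys, which is
    assumed to be enough for these test prefixes.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent from the response when nothing matches the prefix.
    return [item["Key"] for item in response.get("Contents", [])]


# Example (assumes boto3 is installed and credentials are configured):
#   import boto3
#   bucket, prefix = bucket_and_prefix(url)
#   print(list_keys(boto3.client("s3"), bucket, prefix))
```

An empty list for a prefix would suggest that the corresponding s3_to_sfts run did not move its file into processed/good.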
Testing redshift_to_s3
Report: redshift_to_s3.py
Config: config.d/pmrp_all.json
DML: pmrp_date_range.sql
Requested Dates: 20180929 to 20230128
Microservice started at: 2023-02-07 12:59:53-0800 (PST), ended at: 2023-02-07 13:00:05-0800 (PST), elapsing: 0:00:12.011171.
Objects to process: 1 Objects loaded to S3: 1/1 Objects successful loaded to S3: 1
List of objects successfully loaded to S3
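When comparing a run against the sample output above, the summary counts can be pulled out of the log text with a small parser. This is a sketch that assumes the summary keeps the `label: value` shape shown here. It also accepts labels that start with "Items", as the s3_to_sfts output uses. `parse_summary` is a hypothetical helper.

```python
import re

# Matches fields such as "Objects to process: 1" or "Objects loaded to S3: 1/1".
SUMMARY_FIELD = re.compile(r"((?:Objects|Items)[^:]*): (\S+)")


def parse_summary(text):
    """Return a {label: value} dict for every summary field found in `text`."""
    return dict(SUMMARY_FIELD.findall(text))


# Example:
#   summary = parse_summary(log_text)
#   assert summary["Objects to process"] == "1"
```

Values are kept as strings because some fields, such as `1/1`, are not plain integers.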
Testing s3_to_sfts
Config: config.d/pmrp_all.json
Microservice started at: 2023-02-07 13:20:20-0800 (PST), ended at: 2023-02-07 13:20:26-0800 (PST), elapsing: 0:00:06.422713.
Items to process: 1 Objects successfully processed to s3: 1 Objects unsuccessfully processed to s3: 0 Objects successfully processed to sfts: 1
Objects loaded to S3 /good:
1: processed/good/client/doug_test/GDXDSD-5362/pmrp_gdx/pmrp_all/pmrp_20180929_20230128_20230207T205953_part000
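The object key above appears to encode the requested date range (20180929 to 20230128), a run timestamp, and a part number. The sketch below parses a key under that assumption. The naming pattern is inferred from this single example, not from the microservice's code, so treat it as a guess. The timestamp 20230207T205953 looks like the 12:59:53 PST start time expressed in UTC, which is also an inference.

```python
import re
from datetime import datetime

# Assumed pattern: <name>_<start YYYYMMDD>_<end YYYYMMDD>_<stamp YYYYMMDDTHHMMSS>_part<NNN>
KEY_NAME = re.compile(
    r"(?P<name>.+)_(?P<start>\d{8})_(?P<end>\d{8})"
    r"_(?P<stamp>\d{8}T\d{6})_part(?P<part>\d+)$"
)


def parse_object_key(key):
    """Split the final path segment of an S3 key into its assumed components."""
    file_name = key.rsplit("/", 1)[-1]
    match = KEY_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected object name: {file_name}")
    return {
        "name": match["name"],
        "start": datetime.strptime(match["start"], "%Y%m%d").date(),
        "end": datetime.strptime(match["end"], "%Y%m%d").date(),
        "stamp": datetime.strptime(match["stamp"], "%Y%m%dT%H%M%S"),
        "part": int(match["part"]),
    }
```

Comparing `start` and `end` with the "Requested Dates" line of the redshift_to_s3 output is a quick way to confirm that the file reaching SFTS is the one produced by the matching config.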