jwisdom-harrys closed 11 months ago
@jwisdom-harrys could we also try running some commands? That would be a safe thing to do since test coverage is low. We could try something like Arthur bootstrap (which reads and writes YAML) and an Arthur extract/load on an S3 source, since those don't need sqoop and can be run locally from the container.
could we also try running some commands?
Sure thing. It looks like Arthur has no trouble loading the YAML or writing it:
(aws:de-dev, prefix:jwisdom) $ arthur.py ping
2023-07-18 15:09:27 - INFO - Starting log for redshift_etl v1.65.0 with ETL ID 73617C6251A24DB0
2023-07-18 15:09:27 - INFO - Command line: "/opt/local/redshift_etl/venv/bin/arthur.py ping"
2023-07-18 15:09:27 - INFO - Release information: toplevel=/opt/src/arthur-redshift-etl, commit=3fa60fd065852463e109018449ddffe5e453169b (v1.64.0), date=2023-02-13 13:39:13 -0500
2023-07-18 15:09:27 - INFO - Loading settings from '/opt/src/arthur-redshift-etl/python/etl/config/default_settings.yaml'
2023-07-18 15:09:27 - INFO - Loading settings from '/opt/data-warehouse/config_data_development/aws.yaml'
2023-07-18 15:09:27 - INFO - Loading environment variables from '/opt/data-warehouse/config_data_development/credentials.sh'
2023-07-18 15:09:27 - INFO - Loading settings from '/opt/data-warehouse/config_data_development/harrys.yaml'
2023-07-18 15:09:27 - INFO - Loading settings from '/opt/data-warehouse/config_data_development/harrys_dev.yaml'
2023-07-18 15:09:28 - INFO - Connecting to: host=polaris.dev.harrys.systems port=5439 dbname=development user=etl password=***
and
(aws:de-dev, prefix:jwisdom) $ rm schemas/harryswww/public-wholesalers.yaml
(aws:de-dev, prefix:jwisdom) $ arthur.py bootstrap_sources harryswww
2023-07-18 15:21:34 - INFO - Starting log for redshift_etl v1.65.0 with ETL ID 66493C612CDD4DEE
2023-07-18 15:21:34 - INFO - Command line: "/opt/local/redshift_etl/venv/bin/arthur.py bootstrap_sources harryswww"
2023-07-18 15:21:34 - INFO - Release information: toplevel=/opt/src/arthur-redshift-etl, commit=3fa60fd065852463e109018449ddffe5e453169b (v1.64.0), date=2023-02-13 13:39:13 -0500
2023-07-18 15:21:34 - INFO - Loading settings from '/opt/src/arthur-redshift-etl/python/etl/config/default_settings.yaml'
2023-07-18 15:21:34 - INFO - Loading settings from '/opt/data-warehouse/config_data_development/aws.yaml'
2023-07-18 15:21:34 - INFO - Loading environment variables from '/opt/data-warehouse/config_data_development/credentials.sh'
2023-07-18 15:21:34 - INFO - Loading settings from '/opt/data-warehouse/config_data_development/harrys.yaml'
2023-07-18 15:21:34 - INFO - Loading settings from '/opt/data-warehouse/config_data_development/harrys_dev.yaml'
2023-07-18 15:21:34 - INFO - Looking for files locally in 'schemas'
2023-07-18 15:21:34 - INFO - Found 44 matching file(s) for 44 table(s)
2023-07-18 15:21:35 - INFO - Finished loading 44 table design file(s) using 8 threads (0.26s)
2023-07-18 15:21:35 - INFO - Connecting to database source 'harryswww' to look for tables
2023-07-18 15:21:35 - INFO - Connecting to: host=ec2-54-146-214-46.compute-1.amazonaws.com port=5432 dbname=d6cfrafoorjgg5 user=etl_ro password=***
2023-07-18 15:21:35 - INFO - Found 45 table(s) matching patterns; allowlist=['public.authentications', 'public.billing_profiles', 'public.cancellation_survey_options', 'public.cancellation_survey_responses', 'public.checkout_invoices', 'public.checkout_invoices_shave_plans', 'public.credits', 'public.discount_code_batches', 'public.discount_codes', 'public.discount_group_items', 'public.discount_groups', 'public.discount_orders', 'public.discount_product_entitlements', 'public.discounts', 'public.gift_notecards', 'public.incentives', 'public.membership_cancellation_reasons', 'public.membership_cancellations', 'public.membership_events', 'public.membership_programs', 'public.membership_retry_lifecycle_enrollments', 'public.membership_retry_lifecycle_events', 'public.membership_tax_addresses', 'public.memberships', 'public.one_time_shave_plan_additions', 'public.payment_provider_profiles', 'public.redeemed_credits', 'public.shave_plan_events', 'public.shave_plan_retry_lifecycle_enrollments', 'public.shave_plan_retry_lifecycle_events', 'public.shave_plans', 'public.shipping_addresses', 'public.shipping_tiers', 'public.subscriptions', 'public.survey_question_choices', 'public.survey_question_responses', 'public.survey_questions', 'public.surveys', 'public.tos_opt_outs', 'public.user_experiment_participations', 'public.user_experiment_variants', 'public.user_experiments', 'public.users', 'public.viewable_products', 'public.wholesalers'], denylist=['public.api_tokens', 'public.api_applications', 'public.data_migrations', 'public.nav_links', 'public.oauth_*', 'public.schema_migrations', 'public.custom_page_component_container_page_components', 'public.custom_page_component_containers', 'public.custom_page_page_components', 'public.custom_pages', 'public.cx_tasks', 'public.direct_one_time_additions_product_pages', 'public.discount_product_conditions', 'public.fosdick_jobs', 'public.geo_shipping_constraints', 'public.holiday_nav_links', 'public.holiday_shipping_cutoffs', 
'public.images', 'public.ios_releases', 'public.page_components', 'public.page_module_image_attributes', 'public.page_module_sections', 'public.page_module_text_attributes', 'public.page_modules', 'public.page_sub_components', 'public.product_pages', 'public.screen_components', 'public.settings', 'public.shipping_class_shipping_types', 'public.shipping_constraints', 'public.tax_rates', 'public.url_redirects', 'public.v_operational_product_properties', 'public.versions', 'public.waitlists', 'public.pg_*'], subset='['harryswww.*']'
2023-07-18 15:21:35 - INFO - Skipping 'public.authentications' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-authentications.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.billing_profiles' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-billing_profiles.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.cancellation_survey_options' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-cancellation_survey_options.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.cancellation_survey_responses' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-cancellation_survey_responses.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.checkout_invoices' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-checkout_invoices.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.checkout_invoices_shave_plans' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-checkout_invoices_shave_plans.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.credits' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-credits.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.discount_code_batches' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-discount_code_batches.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.discount_codes' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-discount_codes.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.discount_group_items' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-discount_group_items.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.discount_groups' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-discount_groups.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.discount_orders' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-discount_orders.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.discount_product_entitlements' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-discount_product_entitlements.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.discounts' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-discounts.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.gift_notecards' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-gift_notecards.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.incentives' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-incentives.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.membership_cancellation_reasons' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-membership_cancellation_reasons.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.membership_cancellations' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-membership_cancellations.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.membership_events' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-membership_events.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.membership_programs' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-membership_programs.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.membership_retry_lifecycle_enrollments' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-membership_retry_lifecycle_enrollments.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.membership_retry_lifecycle_events' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-membership_retry_lifecycle_events.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.membership_tax_addresses' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-membership_tax_addresses.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.memberships' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-memberships.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.one_time_shave_plan_additions' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-one_time_shave_plan_additions.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.payment_provider_profiles' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-payment_provider_profiles.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.redeemed_credits' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-redeemed_credits.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.shave_plan_events' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-shave_plan_events.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.shave_plan_retry_lifecycle_enrollments' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-shave_plan_retry_lifecycle_enrollments.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.shave_plan_retry_lifecycle_events' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-shave_plan_retry_lifecycle_events.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.shave_plans' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-shave_plans.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.shipping_addresses' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-shipping_addresses.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.shipping_tiers' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-shipping_tiers.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.subscriptions' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-subscriptions.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.survey_question_choices' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-survey_question_choices.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.survey_question_responses' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-survey_question_responses.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.survey_questions' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-survey_questions.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.surveys' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-surveys.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.tos_opt_outs' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-tos_opt_outs.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.user_experiment_participations' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-user_experiment_participations.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.user_experiment_variants' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-user_experiment_variants.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.user_experiments' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-user_experiments.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.users' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-users.yaml'
2023-07-18 15:21:35 - INFO - Skipping 'public.viewable_products' from source 'harryswww' because table design already exists: 'schemas/harryswww/public-viewable_products.yaml'
2023-07-18 15:21:36 - INFO - Index 'wholesalers_pkey' of 'public.wholesalers' adds constraint {"primary_key": ["id"]}
2023-07-18 15:21:36 - INFO - Index 'index_wholesalers_on_code_prefix' of 'public.wholesalers' adds constraint {"unique": ["code_prefix"]}
2023-07-18 15:21:36 - INFO - Index 'index_wholesalers_on_name' of 'public.wholesalers' adds constraint {"unique": ["name"]}
2023-07-18 15:21:36 - INFO - Writing new table design file for 'harryswww.wholesalers' to './schemas/harryswww/public-wholesalers.yaml'
2023-07-18 15:21:36 - INFO - Done with 45 table(s) from source 'harryswww'
2023-07-18 15:21:36 - WARNING - New table(s) in source 'harryswww' without local design: 'public.wholesalers'
2023-07-18 15:21:36 - INFO - Ran 'bootstrap_sources' for 1.61s and finished successfully!
Bump PyYAML to version 6.0.1, a hotfix release that fixes pip installs. See https://github.com/yaml/pyyaml/issues/601.
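For anyone who wants to confirm that an environment actually picked up the bump, here is a minimal sketch using only the standard library. The helper names (`parse_version`, `meets_minimum`, `installed_pyyaml_ok`) are made up for illustration and are not part of Arthur or PyYAML; the version parsing is deliberately simple and treats pre-release suffixes conservatively.

```python
from importlib import metadata

# Minimum PyYAML version named in this PR.
MINIMUM = (6, 0, 1)


def parse_version(text):
    """Turn '6.0.1' into (6, 0, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version_text, minimum=MINIMUM):
    """Return True if version_text is at least the minimum version."""
    return parse_version(version_text) >= minimum


def installed_pyyaml_ok():
    """Return True/False for the installed PyYAML, or None if it is missing."""
    try:
        return meets_minimum(metadata.version("PyYAML"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_pyyaml_ok()` inside the container should return `True` once the new pin is installed; `None` would mean PyYAML is not installed in that environment at all.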
Testing: Got the test from the hdw-validate job in the harrys repo.
Before patch:
After patch: