ccao-data / data-architecture

Codebase for CCAO data infrastructure construction and management
https://ccao-data.github.io/data-architecture/
5 stars 3 forks source link

Sales val migration q2 2024 ias world spec #503

Closed wagnerlmichael closed 1 week ago

wagnerlmichael commented 2 weeks ago

This PR updates vw_pin_sale and vw_ias_salesval_upload with the new column structure from the sales val migration. We also edit schema and shared_columns files accordingly.

Note: I will need to change the refs back to the prod table at some point after we do the migration.

I am getting a sql fluff error on the pre-commit check that I'm not entirely sure is working correctly, curious about your thoughts @jeancochrane.

Eyeballing these queries confirm that things are not totally out-of-whack:

SELECT sv_is_outlier, COUNT(*)
FROM "z_ci_sales_val_migration_q2_2024_ias_design_doc_default"."vw_pin_sale"
WHERE sv_run_id is not null
GROUP BY sv_is_outlier
ORDER BY COUNT(*) DESC;
SELECT sv_outlier_reason1, COUNT(*)
FROM "z_ci_sales_val_migration_q2_2024_ias_design_doc_default"."vw_pin_sale"
WHERE sv_run_id is not null
GROUP BY sv_outlier_reason1
ORDER BY COUNT(*) DESC;
SELECT sv_outlier_reason2, COUNT(*)
FROM "z_ci_sales_val_migration_q2_2024_ias_design_doc_default"."vw_pin_sale"
WHERE sv_run_id is not null
GROUP BY sv_outlier_reason2
ORDER BY COUNT(*) DESC;
jeancochrane commented 2 weeks ago

Taking a look now! Two things before I dig in:

I am getting a sql fluff error on the pre-commit check that I'm not entirely sure is working correctly, curious about your thoughts

This looks reasonable to me, I think the issue is just that sqlfluff doesn't think the quotes around new_prod_data are necessary because it doesn't contain special characters. If you run pre-commit run --all-files then sqlfluff should fix it for you, but let me know if that doesn't work and I can help troubleshoot.

Also: What do you think is up with these sales that are outliers with no reasons?

SELECT sv_is_outlier, sv_outlier_reason1, sv_outlier_reason2, sv_outlier_reason3, COUNT(*)
FROM "z_ci_sales_val_migration_q2_2024_ias_design_doc_default"."vw_pin_sale"
WHERE sv_run_id IS NOT NULL
AND sv_is_outlier = TRUE
AND sv_outlier_reason1 IS NULL
AND sv_outlier_reason2 IS NULL
AND sv_outlier_reason3 IS NULL
GROUP BY 1, 2, 3, 4
ORDER BY COUNT(*) DESC;