It looks like our logic in vw_pin_appeal for including ~2020 appeals is potentially wrong (or the underlying data has changed). This PR updates the logic to take the correct set of columns for 2020.
Current output
You can see from the table below that the change and reason code columns are entirely null for 2020. Each cell is the null proportion for that year and column.
See query
```sql
SELECT
year,
ROUND(SUM(CAST(change IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS change_np,
ROUND(SUM(CAST(reason_code1 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason1_np,
ROUND(SUM(CAST(reason_code2 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason2_np,
ROUND(SUM(CAST(reason_code3 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason3_np,
ROUND(SUM(CAST(agent_code IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_code_np,
ROUND(SUM(CAST(agent_name IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_name_np
FROM default.vw_pin_appeal
WHERE year >= '2017'
GROUP BY year
ORDER BY year DESC
```
year
change_np
reason1_np
reason2_np
reason3_np
agent_code_np
agent_name_np
2024
0.093
0.093
0.962
0.998
0.22
0.22
2023
0.071
0.071
0.948
0.991
0.299
0.299
2022
0.001
0.001
0.963
0.994
0.167
0.167
2021
0.0
0.0
0.987
1.0
0.131
0.131
2020
1.0
1.0
1.0
1.0
1.0
1.0
2019
0.1
0.1
0.929
1.0
1.0
1.0
2018
0.158
0.158
0.846
1.0
1.0
1.0
2017
0.282
0.281
0.901
1.0
1.0
1.0
After PR
The updated view populates the change column for 2020 and thus reduces the null proportion. However, the other columns (reason codes, agent names) don't seem to be populated in any of the potential columns, so the null proportion is still 1.
See query
```sql
SELECT
year,
ROUND(SUM(CAST(change IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS change_np,
ROUND(SUM(CAST(reason_code1 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason1_np,
ROUND(SUM(CAST(reason_code2 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason2_np,
ROUND(SUM(CAST(reason_code3 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason3_np,
ROUND(SUM(CAST(agent_code IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_code_np,
ROUND(SUM(CAST(agent_name IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_name_np
FROM z_ci_dfsnow_fix_appeal_view_default.vw_pin_appeal
WHERE year >= '2017'
GROUP BY year
ORDER BY year DESC
```
It looks like our logic in
vw_pin_appeal
for including ~2020 appeals is potentially wrong (or the underlying data has changed). This PR updates the logic to take the correct set of columns for 2020.Current output
You can see from the table below that the change and reason code columns are entirely null for 2020. Each cell is the null proportion for that year and column.
See query
```sql SELECT year, ROUND(SUM(CAST(change IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS change_np, ROUND(SUM(CAST(reason_code1 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason1_np, ROUND(SUM(CAST(reason_code2 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason2_np, ROUND(SUM(CAST(reason_code3 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason3_np, ROUND(SUM(CAST(agent_code IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_code_np, ROUND(SUM(CAST(agent_name IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_name_np FROM default.vw_pin_appeal WHERE year >= '2017' GROUP BY year ORDER BY year DESC ```After PR
The updated view populates the change column for 2020 and thus reduces the null proportion. However, the other columns (reason codes, agent names) don't seem to be populated in any of the potential columns, so the null proportion is still 1.
See query
```sql SELECT year, ROUND(SUM(CAST(change IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS change_np, ROUND(SUM(CAST(reason_code1 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason1_np, ROUND(SUM(CAST(reason_code2 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason2_np, ROUND(SUM(CAST(reason_code3 IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS reason3_np, ROUND(SUM(CAST(agent_code IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_code_np, ROUND(SUM(CAST(agent_name IS NULL AS int)) / CAST(COUNT(*) AS double), 3) AS agent_name_np FROM z_ci_dfsnow_fix_appeal_view_default.vw_pin_appeal WHERE year >= '2017' GROUP BY year ORDER BY year DESC ```@ccao-jardine FYI