Closed atdservicebot closed 7 months ago
The fatal counts should be as follows:
2022:
Motorist = 47 Pedestrian = 46 (There is an "extra" Pedestrian, but I can't find why) Motorcyclist = 21 Bicyclist = 1 e-Scooter Rider = 2 Other = 0
2023:
Motorist = 37 Pedestrian = 38 (The missing one is being flagged as "Other" somehow) Motorcyclist = 11 (This one has the units swapped 19749149) Bicyclist = 6 e-Scooter Rider = 0 Other = 0
Approach to resolving count
This is the column in the crash data: atd_mode_category_metadata
sum(pedestrian_death_count)
Please keep in mind that one of the fatalities has the person killed in the wrong unit. I don't know how that happened, but the data is off and causing a miscount.
For reference here are the queries from last year's discrepancy: https://github.com/cityofaustin/atd-data-tech/issues/11340
Evaluate this Issue to see if there's a simple fix, if not we can take last year's approach to hard code the right values and spin off another to find the fix. Refer to Issue above from last year to see if the queries point to the problem with the inaccurate counts.
maybe look at this issue too? https://github.com/cityofaustin/atd-data-tech/issues/2413
@xavierapostol could you share your queries with us to aid in untangling the mismatch? aka how you got the "correct" fatality numbers for 2023? thanks
Found an example of a crash where the atd_mode_category_metadata
field has desynced with the units.
https://visionzero.austin.gov/editor/#/crashes/19609765
atd_mode_category_metadata
shows a "Other/Unknown" mode with a fatality.
"mode_id":9,"mode_desc":"Other/Unknown","unit_id":4767930,"death_cnt":1,"sus_serious_injry_cnt":0,"nonincap_injry_cnt":0,"poss_injry_cnt":0,"non_injry_cnt":0,"unkn_injry_cnt":0,"tot_injry_cnt":0}
meanwhile the crash demographics dataset shows a mode of pedestrian:
Here is my analysis of some of the Vision Zero Editor "VZE" (I don't know how the VZV is extracting data) mismatches. I've also included the query that looks at every fatality. Fatalities VZE vs VZdb - 20150101 to 20240208.xlsx
------------------------------------------ (Each Row Represents a Person) WITH up AS ( SELECT atp.person_id AS "prsn_id", atp.crash_id, atp.unit_nbr, atp.prsn_nbr, atisl.injry_sev_desc, atu.unit_id, CASE WHEN atu.unit_desc_id IS NULL THEN 'MOTOR VEHICLE' ELSE atvudl.veh_unit_desc_desc END AS "unit_desc_desc", CASE WHEN atu.veh_body_styl_id IS NULL THEN 'UNKNOWN' ELSE atvbsl.veh_body_styl_desc END AS "veh_body_styl_desc", CASE WHEN atp.prsn_injry_sev_id = 4 THEN 1 ELSE 0 END AS "_k", CASE WHEN atp.prsn_injry_sev_id = 1 THEN 1 ELSE 0 END AS "_a", CASE WHEN atp.prsn_injry_sev_id = 2 THEN 1 ELSE 0 END AS "_b", CASE WHEN atp.prsn_injry_sev_id = 3 THEN 1 ELSE 0 END AS "_c", CASE WHEN atp.prsn_injry_sev_id = 5 THEN 1 ELSE 0 END AS "_n", CASE WHEN atp.prsn_injry_sev_id NOT IN ( 1, 2, 3, 4, 5 ) OR atp.prsn_injry_sev_id IS NULL THEN 1 ELSE 0 END AS "_u" FROM atd_txdot_person AS atp LEFT JOIN atd_txdot_units AS atu ON atu.crash_id = atp.crash_id AND atu.unit_nbr = atp.unit_nbr LEFT JOIN atd_txdotveh_unit_desc_lkp AS atvudl ON atvudl.veh_unit_desc_id = atu.unit_desc_id LEFT JOIN atd_txdotveh_body_styl_lkp AS atvbsl ON atvbsl.veh_body_styl_id = atu.veh_body_styl_id LEFT JOIN atd_txdotinjry_sev_lkp AS atisl ON atisl.injry_sev_id = atp.prsn_injry_sev_id UNION ALL SELECT atp2.primaryperson_id AS "prsn_id", atp2.crash_id, atp2.unit_nbr, atp2.prsn_nbr, atisl2.injry_sev_desc, atu2.unit_id, CASE WHEN atu2.unit_desc_id IS NULL THEN 'MOTOR VEHICLE' ELSE atvudl2.veh_unit_desc_desc END AS "unit_desc_desc", CASE WHEN atu2.veh_body_styl_id IS NULL THEN 'UNKNOWN' ELSE atvbsl2.veh_body_styl_desc END AS "veh_body_styl_desc", CASE WHEN atp2.prsn_injry_sev_id = 4 THEN 1 ELSE 0 END AS "_k", CASE WHEN atp2.prsn_injry_sev_id = 1 THEN 1 ELSE 0 END AS "_a", CASE WHEN atp2.prsn_injry_sev_id = 2 THEN 1 ELSE 0 END AS "_b", CASE WHEN atp2.prsn_injry_sev_id = 3 THEN 1 ELSE 0 END AS "_c", CASE WHEN atp2.prsn_injry_sev_id = 5 THEN 1 ELSE 0 END AS "_n", CASE WHEN atp2.prsn_injry_sev_id NOT IN ( 1, 2, 3, 4, 5 ) OR atp2.prsn_injry_sev_id IS NULL THEN 1 ELSE 0 END AS "_u" FROM atd_txdot_primaryperson AS atp2 LEFT JOIN atd_txdot_units AS atu2 ON atu2.crash_id = atp2.crash_id AND atu2.unit_nbr = atp2.unit_nbr LEFT JOIN atd_txdotveh_unit_desc_lkp AS atvudl2 ON atvudl2.veh_unit_desc_id = atu2.unit_desc_id LEFT JOIN atd_txdotveh_body_styl_lkp AS atvbsl2 ON atvbsl2.veh_body_styl_id = atu2.veh_body_styl_id LEFT JOIN atd_txdotinjry_sev_lkp AS atisl2 ON atisl2.injry_sev_id = atp2.prsn_injry_sev_id ), p_agg AS ( SELECT agg.crash_id, atisl2.injry_sev_desc AS "crash_sev" FROM ( SELECT up.crash_id, SUM ( up._k ) AS "k_cnt", SUM ( up._a ) AS "a_cnt", SUM ( up._b ) AS "b_cnt", SUM ( up._c ) AS "c_cnt", SUM ( up._n ) AS "n_cnt", SUM ( up._u ) AS "u_cnt" FROM up GROUP BY up.crash_id ) AS agg LEFT JOIN atd_txdot__injry_sev_lkp AS atisl2 ON atisl2.injry_sev_id = CASE WHEN agg.k_cnt > 0 THEN 4 WHEN agg.a_cnt > 0 THEN 1 WHEN agg.b_cnt > 0 THEN 2 WHEN agg.c_cnt > 0 THEN 3 WHEN agg.n_cnt > 0 THEN 5 ELSE 0 END ) SELECT atc.crash_date, atc.crash_time, p_agg.crash_sev, up.unit_nbr, up.prsn_nbr, up.injry_sev_desc, up.unit_desc_desc, up.veh_body_styl_desc, up.unit_id, atc.crash_id, up.prsn_id FROM atd_txdot_crashes AS atc LEFT JOIN up ON up.crash_id = atc.crash_id LEFT JOIN p_agg ON p_agg.crash_id = atc.crash_id WHERE atc.private_dr_fl = 'N' AND atc.in_austin_full_purpose = TRUE AND atc.crash_date BETWEEN '2015-01-01' AND '2024-02-08' AND up.injry_sev_desc = 'FATAL INJURY'
;
Also, this fatality is showing up as "Other" when it should be a "Pedestrian". https://visionzero.austin.gov/editor/#/crashes/19609765
Summarizing 2023 fatality differences:
Update on this, I stubbed out a query that I feel good about using to see where these mismatches are coming from. Some adjustments are needed to make the right comparisons, but, so far, the total fatalities match with Xavier's query: 117 in 2022 (1 less than VZV) and 91 in 2023 (same as VZV). Next steps are to drill down into the crash IDs that Charlie and Xavier noted and then see if there is an issue with a trigger, if the numbers can be fixed through VZE edits in the meantime, and if there is further work needed.
Query mode fatalities using the people tables
WITH people_crash_modes AS (
-- combine primary person and person fatalities
WITH pf AS (
SELECT
p.crash_id,
p.unit_nbr,
p.prsn_nbr,
p.prsn_injry_sev_id,
p.primaryperson_id AS person_id
FROM
atd_txdot_primaryperson AS p
WHERE
prsn_injry_sev_id = 4
UNION ALL
SELECT
pp.crash_id,
pp.unit_nbr,
pp.prsn_nbr,
pp.prsn_injry_sev_id,
pp.person_id
FROM
atd_txdot_person AS pp
WHERE
prsn_injry_sev_id = 4
)
SELECT
pf.person_id,
pf.crash_id,
pf.unit_nbr,
pf.prsn_nbr,
pf.prsn_injry_sev_id,
c.atd_fatality_count,
c.death_cnt AS crash_death_cnt,
c.atd_mode_category_metadata,
c.crash_date,
u.unit_id,
u.death_cnt as unit_death_cnt,
u.atd_mode_category AS unit_mode_category,
-- unpack the values from the crashes atd_mode_category_metadata column
(WITH mode_metadata_units (
elem
) AS (
SELECT
json_array_elements(atd_mode_category_metadata)
FROM
atd_txdot_crashes
WHERE
crash_id = c.crash_id),
mode_metadata_unit_ids AS (
SELECT
(elem -> 'unit_id')::text::int AS unit_id,
(elem -> 'mode_id')::text::int AS mode_id
FROM
mode_metadata_units
)
SELECT
mode_id
FROM
mode_metadata_unit_ids
WHERE
unit_id = u.unit_id) AS metadata_mode_id,
(WITH mode_metadata_units (
elem
) AS (
SELECT
json_array_elements(atd_mode_category_metadata)
FROM
atd_txdot_crashes
WHERE
crash_id = c.crash_id),
mode_metadata_unit_ids AS (
SELECT
(elem -> 'unit_id')::text::int AS unit_id,
(elem -> 'death_cnt')::text::int AS death_cnt
FROM
mode_metadata_units
)
SELECT
death_cnt
FROM
mode_metadata_unit_ids
WHERE
unit_id = u.unit_id) AS metadata_death_cnt
FROM
pf
-- join crashes and units to persons so we can get related data
LEFT JOIN atd_txdot_crashes AS c ON pf.crash_id = c.crash_id
LEFT JOIN atd_txdot_units AS u ON u.crash_id = pf.crash_id
AND u.unit_nbr = pf.unit_nbr
WHERE
-- apply date range and match filters in Xavier's query
TRUE
AND c.private_dr_fl = 'N'
AND c.in_austin_full_purpose = TRUE
AND c.crash_date BETWEEN '2022-01-01'
AND '2022-12-31'
)
SELECT
*
FROM
people_crash_modes
WHERE
TRUE
-- AND atd_mode_category != metadata_unit_id
-- AND crash_id = 19113648
-- pedestrian mode filter
-- AND atd_fatality_count > 0
-- AND metadata_death_cnt > 0
-- motor vehicle
-- AND metadata_mode_id IN (1,2,4)
-- pedestrian
-- AND metadata_mode_id IN (7)
-- e-scooter/micromobility
-- AND metadata_mode_id IN (10,11)
-- bicycle
-- AND metadata_mode_id IN (5)
-- motorcycle
-- AND metadata_mode_id IN (3)
-- other
-- AND metadata_mode_id IN (6,8,9)
Query mode fatalities in the atd_mode_category_metadata
crash column
WITH crash_modes AS (
SELECT
crash_id,
(metadata ->> 'unit_id')::int AS unit_id,
(metadata ->> 'death_cnt')::int AS death_cnt,
(metadata ->> 'mode_id')::int AS metadata_mode_id,
atd_fatality_count,
crash_date
FROM (
SELECT
crash_id,
json_array_elements(atd_mode_category_metadata) AS metadata,
atd_fatality_count,
crash_date
FROM
atd_txdot_crashes
WHERE
TRUE
AND private_dr_fl = 'N'
AND in_austin_full_purpose = TRUE) m
)
SELECT
crash_id, unit_id, death_cnt, metadata_mode_id, atd_fatality_count, crash_date
FROM
crash_modes
WHERE
TRUE
-- AND crash_id = 19113648
-- motor vehicle
-- AND metadata_mode_id IN (1,2,4)
-- pedestrian
AND metadata_mode_id IN (7)
-- e-scooter/micromobility
-- AND metadata_mode_id IN (10,11)
-- bicycle
-- AND metadata_mode_id IN (5)
-- motorcycle
-- AND metadata_mode_id IN (3)
-- other
-- AND metadata_mode_id IN (6,8,9)
AND death_cnt > 0
-- AND atd_fatality_count > 0
AND crash_date BETWEEN '2022-01-01'
AND '2022-12-31'
@xavierapostol For 2022, it looks like the extra pedestrian is coming from https://visionzero.austin.gov/editor/#/crashes/19113648. Could you check out that record when you can? I see two units with fatalities (one pedestrian) in the units accordion and only one person fatality in the people accordion (no pedestrian). I think an edit should fix that one. Edit: Just like i noted in my next reply. An edit to the crash data on this page will have to happen in order for the unit data to recalculate. Thanks! Still looking at 2023.
I found this record by comparing a version of the people-based query in my last comment (returned 46 pedestrian fatalities like your query) against the crash-unit based query that feeds VZV from Socrata (returns 47 records).
For 2023:
atd_mode_category_metadata
is not fired when a unit record is updated. So, the data in this column desyncs from unit edits made in VZE. I think this is a must fix.Xavier - after making these changes in my local test environment, both your query above (with added 2023 date range + unit_desc_desc = PEDESTRIAN filters) and my people query (with same filters) both produce 37 pedestrian fatalities for 2023. I think this covers the mismatches, but happy to work on this more if I've missed something.
@xavierapostol For 2022, it looks like the extra pedestrian is coming from https://visionzero.austin.gov/editor/#/crashes/19113648. Could you check out that record when you can? I see two units with fatalities (one pedestrian) in the units accordion and only one person fatality in the people accordion (no pedestrian). I think an edit should fix that one. Edit: Just like i noted in my next reply. An edit to the crash data on this page will have to happen in order for the unit data to recalculate. Thanks! Still looking at 2023.
I found this record by comparing a version of the people-based query in my last comment (returned 46 pedestrian fatalities like your query) against the crash-unit based query that feeds VZV from Socrata (returns 47 records).
This is a strange one and because I cannot edit the details of the person
type, this fatality is populating strangely. What is on the CR3 is 3 units with a total of 8 persons.
What it should be is:
@xavierapostol I see exactly what you are saying. Thanks for highlighting the connection between the data and the narrative. 🙏 I think the main issue we are facing is that unit fatalities are what ultimately show up on VZV in the crash mode visualization by way of a crash row's atd_mode_category_metadata
column. What I'm going to suggest is a holdover until we can update VZE to allow you to edit the people records type field.
So, I propose:
atd_mode_category_metadata
column for crash row with crash ID 19113648
in the production read replica, and you will see two objects in the json array with death_cnt = 1
(one Pedestrian and one Large passenger vehicle). Then, update https://visionzero.austin.gov/editor/#/crashes/19113648 to have one fatality on Unit 4 and no fatalities on Unit 2. Then, change and save a crash field like the Suspected Serious Injury Count to a new value and then change it back and save (to trigger the unit metadata recalculation). Look at the atd_mode_category_metadata
column for this crash again, you will see only one Pedestrian fatality.
atd_mode_category_metadata
so that it reacts to crash updates and unit updates. Or, we can look at other ways to address the de-sync issue. cc @johnclary Let me know if this sounds good to you, and I'm attaching a spreadsheet that I used to evaluate the modes produced by all of our queries.
Edit: forgot to mention that I updated my SQL queries in this comment above
I tried my best to match the persons to the unit, but the VZE does not allow me to change certain attributes. I hope this worked (fingers crossed)
Looks great to me, Xavier! I'm working on an update so that edits in the units accordion make their way into the crash unit metadata that powers the VZV By Mode viz. I'll keep you in the loop on that.
Since Mike fixed the underlying issue with crash-mode data desync, i was able to make minor edits (which i immediately undid) to these two crashes, and that triggered the crash-level mode metadata to rebuild:
For this 2022 crash, it looks like Xavier's edit per Mike's instruction fixed the metadata desync:
I re-ran the VZD > Data Portal export, but unfortunately we still have an extra ped fatality in 2022. i'm going to have to dig into this next week.
@xavierapostol can you re-confirm the expected pedestrian fatality total for 2022? i just reran your query and i'm now seeing the following numbers, which do now match VZV:
So I think we cleared all this up by fixing the mismatched units and re-syncing the mode data. I am ready to close this if you are, Xavier!
@johnclary, the total count for 2022 should be as follows:
The numbers you have equal 118.
This crash still has a fatal_crash_fl
= 'N'
This crash is a multi-fatality crash with 1 pedestrian
and 1 motor vehicle
passenger
My bad, our numbers are aligned at 117 total fatalities in 2022. I'm closing this issue.
Vision Zero Viewer
There are some fatalities counts by mode that do not match what is in the Vision Zero Editor.
-
https://visionzero.austin.gov/viewer/
Internet Browser: Edge
Request ID: DTS24-112320