Closed Nate-Wessel closed 3 months ago
@Nate-Wessel can you confirm that the non-integer codes (i.e. 1A
and 1B
) only exist for invtype
?
In the MOVE collision ETL pipeline, we remap those to integer values: https://github.com/CityofToronto/bdit_move_etl/blob/a4a673d7567998a5b1b2e3e6d9b952ddb3619870/scripts/airflow/tasks/crash_geocoding/A3_involved_fields_norm.sql#L16
1A
=> 97
1B
=> 98
We also replaced these values in the MOVE copy of the collision factors tables.
As such, you won't see any collisions with invtype
values 1A
and 1B
if using the new collisions schema.
Here's my suggested approach:
collision_factors.invtype
to replace the non-integer codes like we did in the MOVE database? @radumas do you have any concerns with the second approach (replacing the data)?What do you think?
I'm ok with updating collision_factors.invtype
to replace
1A => 97
1B => 98
and then these lookup tables should get updated from the MOVE side at some point. @tahaislam can take care of this.
can you confirm that the non-integer codes (i.e. 1A and 1B) only exist for invtype?
Some notes on looking through all the tables in this schema:
collision_factors.cyccond
is emptycollision_factors.imploc
permission denied for table imploc::int
able text values. Confirming that does make these joins, collectively, a lot easier. Still, I'm always wary of needing to convert a type like that. Permission granted to imploc
. It contains zero-padded numbers and a literal 'NULL'
I suggest UPDATE collision_factors.imploc SET imploc = NULL WHERE imploc = 'NULL'
and then we can close this issue?
FYI imploc is not being replicated from Move.
I suggest
UPDATE collision_factors.imploc SET imploc = NULL WHERE imploc = 'NULL'
and then we can close this issue? FYI imploc is not being replicated from Move.
Has anything else been done to address this issue? There were some suggestions above. Have any of them been implemented?
I see that the numeric fields in these tables are still text, often zero-padded. I still think that joining smallints to a text field like that is asking for trouble.
I think the suggestions made by @mkewins could be transfered to MOVE where these tables are being replicated from?
In the meantime,
FROM collisions.involved
LEFT JOIN collision_factors.invtype ON involved.invtype = invtype.invtype::int
(and similar for other collision_factors tables) works now that 1A, 1B are replaced.
I suggest
UPDATE collision_factors.imploc SET imploc = NULL WHERE imploc = 'NULL'
and then we can close this issue? FYI imploc is not being replicated from Move.
Closing as in MOVE's backlog
.
The collisions tables currently in vogue,
collisions.events
andcollisions.involved
, have many smallint type columns for the various coded fields. However the corresponding tables incollision_factors
are often(always?) text fields. Sometimes a join will work, or partially work, in a way that can easily fool you into thinking it worked as you expected.This query however doesn't return any of the most important categories which are coded with leading zeros, as shown here in
collision_factors.invtype
:Casting the type in the other direction (from text to int) breaks when it encounters the
1A
and1B
values.How should these tables be joined? I think that coming up with my own logic for this on an ad hoc basis (and for every field!) will be very error prone.