Closed AmandaDoyle closed 2 years ago
Seems like dropping the million bins got dropped in an earlier refactor https://github.com/NYCPlanning/db-developments/blob/6bba931b1ef661f9cb8500fcd86584752121b4e6/sql/_geo.sql#L72
@AmandaDoyle what are the geo_bins
in the _GEO_devdb
table? Not essential to solving this problem but we are curious
Our fix to filter out million bins in _geo.sql
has introduced a new error in _hny.sql
.
We get this error when trying to build via 02_build_devdb.sh
psql:sql/_hny.sql:431: ERROR: invalid input syntax for type integer: "B00558782"
Looking into the code we see that the error is thrown when the many_to_one
CTE is unioned to one_to_many
. It seems to be a data type issue. First we thought the issue was in the union of one_to_many
and many_to_one
. But the error is thrown any time many_to_one
is queried so it is actually in the creation of many_to_one
.
Our solution was based off the observation that the HNY_matches
table is split into one_to_one
, one_to_many
, and many_to_one
CTE. So we thought that sending job_number
to text when HNY_matches
was created was the cleanest way to resolve.
That didn't work so we tried casting job numbers in the creation of many_to_one
. However that didn't work either so maybe our original understanding of the problem was wrong. Or the difficulty lies with our lack of SQL proficiency. Either way we have not made much progress on this issue.
@SashaWeinstein can you write out what your solution was to filter out million bins in _geo.sql
? Similar to how I wrote out my solution in the initial comment.
Addressed in pull request 490
Add clause here and here in
_geo.sq
to that million bins cannot join to each other addingWHERE b.bin IS NOT LIKE
%000000` after each ON statement should do the trick