CityofToronto / bdit_data-sources

Data sources used by the Big Data Innovation Team
https://github.com/orgs/CityofToronto/teams/bigdatainnovationteam
GNU General Public License v3.0

Make Text to Centreline Case Insensitive #628

Open radumas opened 1 year ago

radumas commented 1 year ago

From https://github.com/CityofToronto/bdit_data-sources/pull/421

Currently, gis.text_to_centreline() fails if the input text is in upper case, e.g.:

SELECT *
FROM gis.text_to_centreline(111,'BLOOR ST W', 'BETWEEN SHAW ST AND OSSINGTON AVE', NULL);

The problem has to do with using SPLIT_PART on fixed lower-case text, both in gis.text_to_centreline() and in the function defined in function-clean_bylaws_text.sql, which gis.text_to_centreline() calls. The 'Between' keyword is also only checked for first-letter capitalization.
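A minimal sketch of the failure mode, using only built-in PostgreSQL functions. The ' and ' delimiter and the regular expressions are assumptions for illustration, not copied from gis.text_to_centreline() or clean_bylaws_text:

```sql
-- Illustration only: the delimiter and patterns below are assumed,
-- not taken from the actual function bodies.
WITH input AS (
    SELECT 'BETWEEN SHAW ST AND OSSINGTON AVE'::text AS btwn
)
SELECT
    -- split_part() matches its delimiter case-sensitively, so a lower-case
    -- delimiter is never found in upper-case input and field 2 is empty.
    split_part(btwn, ' and ', 2) AS second_street_case_sensitive,
    -- Lower-casing the input first lets the same delimiter match,
    -- at the cost of returning lower-case text.
    split_part(lower(btwn), ' and ', 2) AS second_street_lowered,
    -- A regex split with the 'i' flag matches the keyword in any case
    -- and keeps the original capitalization of the street name.
    (regexp_split_to_array(btwn, '\s+and\s+', 'i'))[2] AS second_street_regex,
    -- ~* is the case-insensitive regex match operator, so this is true
    -- for 'Between', 'BETWEEN' and 'between' alike.
    btwn ~* '^between\s' AS starts_with_between
FROM input;
```

Either lower-casing the input up front or switching to case-insensitive regex functions avoids the empty result; which one fits best depends on how the rest of the function uses the text.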

Unfortunately that PR introduced more complexity to an already rather complicated set of functions. Opening this issue for further improvements.

radumas commented 1 year ago

#618 has a fix for the "point" and "of" keywords.

gabrielwol commented 2 months ago

Seems like it should be a one-line fix: altering the function inputs in the DECLARE statement, like I did here: https://github.com/CityofToronto/bdit_data-sources/blob/d6596433c5a76e945ef50fbe307724c5238feddd/volumes/miovision/sql/function/function-identify-zero-counts.sql#L12-L13
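A rough sketch of what normalizing inputs in a DECLARE block could look like. The function pg_temp.split_between_demo, its parameters, and the ' and ' delimiter are hypothetical and only illustrate the idea; this is not the code at the link above or in gis.text_to_centreline():

```sql
-- Hypothetical function for illustration; not part of the repository.
CREATE OR REPLACE FUNCTION pg_temp.split_between_demo(highway text, btwn text)
RETURNS TABLE (street text, from_street text, to_street text)
LANGUAGE plpgsql
AS $$
DECLARE
    -- Normalize the inputs once so later string handling sees a single case.
    _highway text := lower(highway);
    _btwn text := lower(btwn);
BEGIN
    RETURN QUERY
    SELECT
        _highway,
        -- Strip the leading keyword, then split on the lower-case delimiter.
        split_part(regexp_replace(_btwn, '^between\s+', ''), ' and ', 1),
        split_part(regexp_replace(_btwn, '^between\s+', ''), ' and ', 2);
END;
$$;

-- Functions created in pg_temp must be called with the schema-qualified name.
SELECT *
FROM pg_temp.split_between_demo('BLOOR ST W', 'BETWEEN SHAW ST AND OSSINGTON AVE');
```

This only works if every downstream pattern also expects lower-case text; any regex that matches capitalized forms would need to be reviewed alongside such a change.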

gabrielwol commented 2 months ago

Note to future doer: briefly tried to address as above while fixing #1021 and discovered it would require a lot of changes to the regex, particularly in abbr_street and clean_bylaws_text.