cobalt-uoft / uoft-scrapers

Public web scraping scripts for the University of Toronto.
https://pypi.python.org/pypi/uoftscrapers
MIT License
48 stars, 14 forks

Sanitize street addresses #19

Closed by kashav 8 years ago

kashav commented 8 years ago

Some addresses had multiple spaces between the street number and the street name.

Here's an entry from the buildings documentation:

{
    "id":"001",
    ...
    "address":{
      "street":"15  King's College Circle",
      "city":"Toronto",
      "province":"ON",
      "country":"Canada",
      "postal":"M5S 3H7"
    },
    ...
}

Added a simple filter to get rid of the extra spaces.

{
    "id": "001",
    ...
    "address": {
        "street": "15 King's College Circle", 
        "city": "Toronto",
        "province": "ON", 
        "country": "Canada", 
        "postal": "M5S 3H7"
    },
    ...
}
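
For reference, a minimal sketch of what a filter like this could look like. The actual change isn't shown in this thread, so the function name `sanitize_street` and the sample `building` dict are hypothetical, and the whitespace-collapsing approach is an assumption about how the extra spaces were removed:

```python
def sanitize_street(street):
    """Collapse any run of whitespace into a single space and trim the ends.

    Hypothetical helper: str.split() with no arguments splits on runs of
    whitespace, so re-joining with one space drops the extras.
    """
    return " ".join(street.split())


# Sample entry modelled on the buildings documentation above (fields trimmed).
building = {
    "id": "001",
    "address": {
        "street": "15  King's College Circle",
        "city": "Toronto",
    },
}

building["address"]["street"] = sanitize_street(building["address"]["street"])
print(building["address"]["street"])  # 15 King's College Circle
```

Splitting and re-joining also strips leading and trailing whitespace, so it should cover stray padding as well as doubled spaces.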

qasim commented 8 years ago

This looks good! Thanks Kashav. I just gotta fix tests in cobalt-uoft/cobalt (since it will yell at the lack of extra spaces) and will merge this. 😊