UBC-MDS / rhousehunter

Simplify House-Hunting
https://ubc-mds.github.io/rhousehunter/
MIT License

Function 2: Cleaner #6

Closed JuntingHe closed 3 years ago

JuntingHe commented 3 years ago

Function to clean web-scraped data with Pandas and Regex

JuntingHe commented 3 years ago

Function 2 takes the scraped tibble from Function 1 and returns a cleaned tibble, ready for filtering, that contains the listing URL, price, number of bedrooms, area in sqft, and city.
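A rough sketch of what the cleaning step could look like, written in plain Python with regex rather than the package's actual code. The input layout is an assumption: each scraped row is taken to be a dict with hypothetical keys `listing_url`, `price` (e.g. `"$2,400"`), `house_type` (e.g. `"2br - 900ft2"`), and `city` (e.g. `"(burnaby)"`). The function names `clean_listing` and `_first_int` are illustrative only.

```python
import re


def _first_int(pattern, text):
    """Return the first captured group as an int, or None when nothing matches."""
    match = re.search(pattern, text or "")
    return int(match.group(1).replace(",", "")) if match else None


def clean_listing(raw):
    """Clean one scraped row (hypothetical dict layout, see the note above)."""
    return {
        # Trim stray whitespace around the URL.
        "listing_url": (raw.get("listing_url") or "").strip(),
        # "$2,400" -> 2400
        "price": _first_int(r"\$\s*(\d[\d,]*)", raw.get("price")),
        # "2br - 900ft2" -> 2
        "num_bedroom": _first_int(r"(\d+)\s*br", raw.get("house_type")),
        # "2br - 900ft2" -> 900
        "area_sqft": _first_int(r"(\d[\d,]*)\s*ft", raw.get("house_type")),
        # "(burnaby)" -> "Burnaby"; empty or missing -> None
        "city": re.sub(r"[()]", "", raw.get("city") or "").strip().title() or None,
    }
```

Fields that cannot be parsed come back as `None`, so the later filtering step can decide whether to drop or keep incomplete listings.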

JuntingHe commented 3 years ago

Ideas/notes related to the cleaner function.

JuntingHe commented 3 years ago

In designing the function, I tried to use city databases such as world.cities or canada.cities to verify that the city column actually contains city information, but some cities, such as "Burnaby", are missing from those datasets. Hence, I decided to use a city list instead of those databases.
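The city-list check could be sketched roughly as follows, in plain Python. The contents of `KNOWN_CITIES` are an assumption made for illustration (only "Burnaby" is named in this thread), and `validate_city` is a hypothetical helper name, not a function from the package.

```python
# Illustrative list only; the list the package really uses may differ.
KNOWN_CITIES = ("Vancouver", "Burnaby", "Richmond", "Surrey", "North Vancouver")


def validate_city(value, known_cities=KNOWN_CITIES):
    """Return the canonical city name if value is in the list, else None."""
    if not value:
        return None
    # Case-insensitive lookup table: "burnaby" -> "Burnaby".
    lookup = {city.casefold(): city for city in known_cities}
    # Collapse repeated whitespace before comparing.
    candidate = " ".join(value.split()).casefold()
    return lookup.get(candidate)
```

A hand-maintained list keeps cities such as "Burnaby" that the general-purpose datasets miss, at the cost of having to extend the list when listings from new areas show up.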