Closed tiernanmartin closed 6 years ago
Major problem: I have no way to geolocate parcels whose PINs are absent from the 2018 data. That means I cannot assign these parcels to a census tract, so they can't be included in the summary statistics for each tract.
The solution I decided on is to exclude records that do not have a matching PIN in the 2018 data.
Approximately 5,000 records meet this description, which is less than 1% of the total number of parcels. Additionally, it is likely that many of the excluded parcels are not single-family homes or condos, so they would be excluded anyway.
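A minimal sketch of that exclusion step, assuming dplyr is used and the tables share a `PIN` column. The table names, the other column names, and the toy values are hypothetical and are not taken from the project:

```r
library(dplyr)

# Hypothetical stand-ins for the real tables (names and values are made up).
parcels_all <- tibble(
  PIN = c("001", "002", "003", "004"),
  value_2005 = c(100, 200, 300, 400)
)
parcels_2018 <- tibble(
  PIN = c("001", "003"),
  tract = c("A", "B")
)

# Keep only records whose PIN also appears in the 2018 data.
parcels_kept <- semi_join(parcels_all, parcels_2018, by = "PIN")

# The complement: records with no 2018 match, useful for checking
# what share of parcels the exclusion removes.
parcels_dropped <- anti_join(parcels_all, parcels_2018, by = "PIN")
share_dropped <- nrow(parcels_dropped) / nrow(parcels_all)
```

`semi_join()` filters without adding columns from the 2018 table, and `anti_join()` gives the dropped set, so the "less than 1%" figure can be re-checked whenever the inputs change.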
Currently, the typology excludes parcels that don't have complete records for all three observation years: 2005, 2010, and 2018.
This makes for "cleaner" data transformation steps, but it excludes an important category of the housing market: newly constructed residences.
Change the data prep steps to include these parcels.
Note: this probably means adding `list() %>% reduce(full_join)` in a few places.
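A rough illustration of the `list() %>% reduce(full_join)` pattern, assuming one table per observation year keyed by `PIN`. The table names, column names, and values here are hypothetical:

```r
library(dplyr)
library(purrr)

# Hypothetical per-year tables; parcel "004" first appears in 2018,
# standing in for a newly constructed residence.
p_2005 <- tibble(PIN = c("001", "002"), value_2005 = c(100, 200))
p_2010 <- tibble(PIN = c("001", "002", "003"), value_2010 = c(110, 210, 310))
p_2018 <- tibble(PIN = c("001", "003", "004"), value_2018 = c(150, 350, 450))

# full_join keeps every PIN seen in any year and fills missing years with NA.
parcels_wide <- list(p_2005, p_2010, p_2018) %>%
  reduce(function(x, y) full_join(x, y, by = "PIN"))

# For contrast, inner_join keeps only parcels with complete records.
parcels_complete <- list(p_2005, p_2010, p_2018) %>%
  reduce(function(x, y) inner_join(x, y, by = "PIN"))
```

With the full join, parcels missing from earlier years stay in the result with `NA` in those years' columns, so downstream steps would need to handle `NA` explicitly rather than assume complete records.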