tcovert closed this issue 5 years ago
Notes to self:
Ideas for matching
Matching:
The one-to-one matches from this round are compiled in the csv file: matched_MGL.csv
The bid notices left unmatched after the first round of matching fall into three categories, each in its own file:
- multiple_abs_num_match.csv: multiple OTLS abstract number matches
- no_SURNAM_match.xlsx: no OTLS survey name match
- missing_abs_num.csv: bid notice had a missing abstract number associated with the survey name
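As a rough sketch of how a first round could sort bid notices into these buckets, the base R snippet below classifies toy records. It assumes the first round matches on survey name and abstract number; the column names (`survey`, `abstract`), the toy values, the `classify_bid` helper, and the extra `no_abstract_match` bucket are all hypothetical and not taken from the actual scripts.

```r
# Toy data; every column name and value here is hypothetical.
bids <- data.frame(
  bid_id   = 1:4,
  survey   = c("SMITH J", "JONES A", "DOE R", "BROWN T"),
  abstract = c("101", "202", NA, "404"),
  stringsAsFactors = FALSE
)
otls <- data.frame(
  survey   = c("SMITH J", "JONES A", "JONES A", "DOE R"),
  abstract = c("101", "202", "202", "303"),
  stringsAsFactors = FALSE
)

# Assign one bid notice to a match category (hypothetical helper).
classify_bid <- function(survey, abstract, otls) {
  if (!survey %in% otls$survey) return("no_survey_name_match")
  if (is.na(abstract)) return("missing_abstract_number")
  n_hits <- sum(otls$survey == survey & otls$abstract == abstract)
  if (n_hits == 1) return("one_to_one")
  if (n_hits > 1) return("multiple_abstract_matches")
  "no_abstract_match"  # bucket not described in the notes; added for completeness
}

bids$status <- mapply(classify_bid, bids$survey, bids$abstract,
                      MoreArgs = list(otls = otls), USE.NAMES = FALSE)

# One data frame per category, ready to be written to separate files.
by_status <- split(bids, bids$status)
```

Each element of `by_status` could then be written out with `write.csv()` for manual review.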
The second round of matching, covering the three unmatched files, was manual. For each bid notice I went back to the original and matched it against either the OTLS data or the State Agency Land data using more specific information: county (using the package mapview to view the county and the bid notices within it), part/tract, or area.
Final matches for the three unmatched files are:
- multabs_matched.csv
- no_surnam_matched_survey.csv and no_surnam_matched_SAL.csv (2 csv files depending on whether the bid notice was matched with OTLS or State Agency Land)
- miss_abs_matched_survey.csv and miss_abs_matched_SAL.csv
@yixinsun1216 to flesh this out
Overall goal: we want to be able to place failed auctions on TX State Agency Land (Bureau of Prisons, School for the Blind, Parks and Rec, etc.) on a map. There are two pieces to this: the bid notice data that Lydia and @yixinsun1216 have been digitizing and cleaning, and the TX State Agency Land shape file. The TX State Agency Land shape file is stored in raw_data/StateAgencyLands. To read in the shape file, you will be using the package sf. Here is a tutorial on sf for reference, and feel free to ask Sunny any questions you have about working with spatial data.

Unfortunately, there isn't an "easy" variable in both datasets that will let us directly link them. Instead, you are going to have to do a decent amount of manual checking and linking.
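For the shape file mentioned above, a minimal sketch of reading it with sf might look like the following. It assumes raw_data/StateAgencyLands is a directory containing the shape file and that the working directory is the repository root; the object name `state_agency_land` is made up for illustration.

```r
# Path taken from the notes; treating it as a directory that holds the
# .shp file is an assumption.
sal_path <- file.path("raw_data", "StateAgencyLands")

if (requireNamespace("sf", quietly = TRUE) && dir.exists(sal_path)) {
  # st_read() returns an sf data frame: attribute columns plus a geometry column.
  state_agency_land <- sf::st_read(sal_path, quiet = TRUE)
  print(names(state_agency_land))       # attribute columns available for matching
  print(sf::st_crs(state_agency_land))  # coordinate reference system
} else {
  message("sf not installed or ", sal_path, " not found; skipping read")
}
```

Printing the column names first is a quick way to see which attributes are candidates for linking to the bid notices.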
Steps to matching:
In the bid notices, you will need to "normalize" the "Survey" column, which is the closest thing you'll have to a matching variable with the State Agency Land shape file. There are a couple of ways to go about this; below are just a few ideas, but feel free to play around with different matching strategies.