Closed jamesshannon closed 4 years ago
@typpo wrote:
@jamesshannon The Placer system looks identical to Yolo County's, which means much of the code can be reused! https://github.com/typpo/ca-property-tax/tree/master/scrapers/yolo
@typpo Thanks. That's helpful.
Where'd the file Yolo_County_Tax_Parcels_Open_Data.csv come from? I see in the README that there's a way to convert the gdb to geojson (which is used in the parser), but what about the CSV used in the scraper?
I ask because I've investigated the Placer AddressPoints and Parcels files a bit more. AddressPoints has APN and centroid, but I've found that:
Parcels file
So it seems better to use the Parcels file, but the CSV version doesn't have any useful-looking geodata. Both Parcels and AddressPoints have an object_id, but they don't seem to match. So I've started looking at ways to get geodata from non-CSV versions of the Parcels file. It appears I can download the shapefile and use a Python package to get the shape and then shapely.geometry to find the centroid?
I should have clarified - I think the input CSV for Yolo is different from Placer. It's just the tax system that appears to be the same, meaning I think we should be able to copy parts of the web scrape and parse steps (but not the same input file format).
I think that the Parcels file is the way to go. Although the spreadsheet doesn't have latlng info, if you download it as a shapefile and then convert it using ogr2ogr, it will include latlng info.
After downloading and unzipping the shapefiles, this command:
ogr2ogr -f GeoJSON placer.geojson Parcels.shp
Yields placer.geojson. Here's an example record from the file:
{ "type": "Feature", "properties": { "OBJECTID": 5, "APN": "471-340-027-000", "TAX_DESC": "NORMAL OWNERSHIP", "USE_CD_N": "APARTMENTS, 4 UNITS OR MORE", "STR_SQFT": 1076, "ADR1": "5043 MILLSTONE WAY", "ADR2": "GRANITE BAY CA 95746", "CITY": "GRANITE BAY", "STATE": "CA", "ZIP": "95746", "STREETNUM": "720", "STREETNAME": "SUNRISE", "STREETTYPE": "AV", "LANDVALUE": 9695, "STRUCTURE": 123898, "Shape__Are": 1051.560546875, "Shape__Len": 155.441730291059 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -121.272353678201995, 38.735262075994498 ], [ -121.272330313062994, 38.735261915944498 ], [ -121.272330259867999, 38.735267119240397 ], [ -121.272258341183004, 38.7352666546535 ], [ -121.272258460228002, 38.735253323802198 ], [ -121.272278634988993, 38.735253450342597 ], [ -121.272278881337996, 38.735232842775901 ], [ -121.272271623899996, 38.735232797256103 ], [ -121.272271802228005, 38.735194767554198 ], [ -121.272395469203005, 38.735195130536603 ], [ -121.272402235800001, 38.735195145943401 ], [ -121.272402182394998, 38.7352328890397 ], [ -121.272425318672006, 38.735233157550503 ], [ -121.27242491394, 38.735267575904402 ], [ -121.272353624586003, 38.735267320729399 ], [ -121.272353678201995, 38.735262075994498 ] ] ] } },
The list of latlngs defines the boundary polygon of the property (each pair is in [lng, lat] order), and we take its centroid. Many of the scrapers/parsers load an ogr-generated geojson file. Here's an example of loading the geojson file and here's an example of finding the centroid.
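For reference, here's a rough, standard-library-only sketch of that load-then-centroid step. It is not the code from the linked examples: the function names `ring_centroid` and `parcel_centroids` are made up for illustration, it assumes the ogr2ogr output is a FeatureCollection with a `features` list, and it uses the area-weighted (shoelace) centroid of the outer ring, which may differ from what the existing parsers do.

```python
import json


def ring_centroid(ring):
    """Area-weighted centroid of a closed ring of [lng, lat] pairs (shoelace formula)."""
    # Shift to the first vertex so the tiny parcel offsets are not lost
    # next to coordinates like -121.27 when the cross products are summed.
    ox, oy = ring[0]
    pts = [(x - ox, y - oy) for x, y in ring]
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        # Degenerate ring (zero area): fall back to the mean of the vertices.
        xs, ys = zip(*pts)
        return ox + sum(xs) / len(xs), oy + sum(ys) / len(ys)
    return ox + cx / (3 * area2), oy + cy / (3 * area2)


def parcel_centroids(geojson_text):
    """Yield (apn, lat, lng) for every Polygon feature in a GeoJSON FeatureCollection."""
    collection = json.loads(geojson_text)
    for feature in collection["features"]:
        geometry = feature.get("geometry")
        if not geometry or geometry["type"] != "Polygon":
            continue  # this sketch skips MultiPolygon and empty geometries
        # coordinates[0] is the outer ring; GeoJSON stores points as [lng, lat].
        lng, lat = ring_centroid(geometry["coordinates"][0])
        yield feature["properties"]["APN"].strip(), lat, lng
```

Calling `parcel_centroids(open("placer.geojson").read())` would then give one (APN, lat, lng) tuple per parcel, with any padding stripped from the APN.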
If you'd like to take this on, I'm happy to answer any other questions and support you! I've uploaded the converted Placer Parcels geojson here so you don't have to go through the trouble of installing ogr yourself: https://drive.google.com/file/d/1t7DpysdWdtJAry1lE4gesjzkuZs4t9n0/view?usp=sharing
Placer CSV file: xxxxx
I'm ready to upload the Placer script, but I'm not sure how to isolate it from the sharedlib changes which I have merged into the branch for development. It'll probably work itself out after the sharedlib branch is merged.
Hold off on that file... I'm validating it and seeing some issues.
Ok. File is correct now: https://drive.google.com/file/d/1QU5k5Il6GbzVT4r1NaGPPldgkqJDU495/view?usp=sharing
I created a quick script to validate the files. It does two things to check for programming errors and GIGO errors:
Added! Sorry for the delay, the past week has been...distracting.
The validation script sounds very useful. I often mess things up the first time by flipping lng/lat.
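A flipped lng/lat pair is easy to catch mechanically. The sketch below is not the validation script mentioned above (its contents aren't shown in this thread); it's a hypothetical `check_point` helper that compares a point against a rough California bounding box. The bounds are approximate values chosen for illustration.

```python
# Approximate bounding box for California (assumed values, deliberately loose).
CA_LAT_RANGE = (32.5, 42.1)
CA_LNG_RANGE = (-124.5, -114.1)


def _in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def check_point(lat, lng):
    """Classify a point as 'ok', 'flipped' (lat and lng swapped), or 'out_of_range'."""
    if _in_range(lat, CA_LAT_RANGE) and _in_range(lng, CA_LNG_RANGE):
        return "ok"
    # If swapping the two values lands inside the box, they were likely flipped.
    if _in_range(lng, CA_LAT_RANGE) and _in_range(lat, CA_LNG_RANGE):
        return "flipped"
    return "out_of_range"
```

Running this over every row of an output file and counting the non-"ok" results is a cheap first check before uploading.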
@jamesshannon How would you like to be credited on the site? Name + link to twitter or personal website?
Initial investigation for Placer:
GIS data overview page
Parcel information is in AddressPoints.csv. You'd think you'd want Parcels.csv but AddressPoints has a lat/lng centroid point while Parcels has some fields (like Shape__Area) which don't appear directly useful.
Tax info can be found at the URL: https://common3.mptsweb.com/MBC/placer/tax/main/__APN__/2020/0000 where __APN__ is the APN without -'s. E.g. https://common3.mptsweb.com/MBC/placer/tax/main/466120044000/2020/0000. Tax amount is found in the Totals - 1st and 2nd Installments section.
Originally posted by @jamesshannon in https://github.com/typpo/ca-property-tax/issues/1#issuecomment-720307958
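The URL pattern described in the investigation can be sketched as a small helper. `tax_url` is a hypothetical name, and treating the year as a parameter is an assumption: only the 2020 URLs appear in this thread, and the trailing 0000 segment is copied as-is.

```python
# URL pattern quoted in the investigation, with the APN and year as placeholders.
TAX_URL_TEMPLATE = "https://common3.mptsweb.com/MBC/placer/tax/main/{apn}/{year}/0000"


def tax_url(apn, year=2020):
    """Build the tax page URL for an APN, dropping dashes and surrounding whitespace."""
    digits = apn.strip().replace("-", "")
    return TAX_URL_TEMPLATE.format(apn=digits, year=year)
```

For example, `tax_url("466-120-044-000")` gives the same address as the example URL in the investigation.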