Open typpo opened 3 years ago
I found some data for San Diego county.
Parcel data: https://hub.arcgis.com/datasets/SANDAG::parcels-4 The zip file contains a .dbf file (convert it to a CSV with dbfdump). The location is given in x and y coordinates, and it's unclear what conversion is needed to get lat/lon.
Tax data query: https://www.sdttc.com/content/ttc/en/tax-collection/prior-year-tax-records.html?fiscal_year=2019-07-01%7C2020-06-30&q={APN}&x=0&y=0
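For anyone scripting this, here is a minimal sketch of filling in the `{APN}` placeholder in the query above. It assumes the other parameters (`fiscal_year`, `x`, `y`) can be copied unchanged from that URL. The helper name `sd_tax_url` and the sample APN are made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.sdttc.com/content/ttc/en/tax-collection/prior-year-tax-records.html"


def sd_tax_url(apn, fiscal_year="2019-07-01|2020-06-30"):
    """Build the prior-year tax record query URL for one APN (hypothetical helper)."""
    # urlencode percent-encodes the "|" in the fiscal year range as %7C,
    # matching the URL pasted above.
    params = {"fiscal_year": fiscal_year, "q": apn, "x": 0, "y": 0}
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder APN, not a real parcel.
    print(sd_tax_url("1234567890"))
```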
Hey @edre, thanks for the pointer. I believe the coordinate system is most likely California State Plane Zone 6 (reference) but I haven't tested it on the San Diego data.
I should have the coordinate conversion done soon because Contra Costa and several other counties also need it. Will let you know once I figure that out (at least latlng conversion can wait til parsing, it is not a blocker for scraping).
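In case it helps, here is a rough, untested-against-real-data sketch of the conversion. It assumes the x/y values are NAD83 California State Plane Zone 6 in US survey feet (EPSG:2230), which is only a guess at this point. It uses the standard inverse Lambert conformal conic formulas with the published zone 6 parameters and no dependencies. A projection library such as pyproj would be the more usual choice. The function name `state_plane_to_latlng` is made up.

```python
import math

# Assumed: NAD83 California State Plane Zone 6 (EPSG:2230), Lambert conformal
# conic on the GRS80 ellipsoid, input coordinates in US survey feet.
A = 6378137.0                      # GRS80 semi-major axis, meters
F_INV = 298.257222101              # GRS80 inverse flattening
E2 = 2 / F_INV - (1 / F_INV) ** 2  # eccentricity squared
E = math.sqrt(E2)
LAT1 = math.radians(32 + 47 / 60)  # standard parallels
LAT2 = math.radians(33 + 53 / 60)
LAT0 = math.radians(32 + 10 / 60)  # latitude of origin
LON0 = math.radians(-116.25)       # central meridian
X0_M, Y0_M = 2000000.0, 500000.0   # false easting / northing, meters
US_FT = 1200 / 3937                # meters per US survey foot


def _m(lat):
    return math.cos(lat) / math.sqrt(1 - E2 * math.sin(lat) ** 2)


def _t(lat):
    s = E * math.sin(lat)
    return math.tan(math.pi / 4 - lat / 2) / ((1 - s) / (1 + s)) ** (E / 2)


# Cone constants derived from the two standard parallels.
N = (math.log(_m(LAT1)) - math.log(_m(LAT2))) / (math.log(_t(LAT1)) - math.log(_t(LAT2)))
BIG_F = _m(LAT1) / (N * _t(LAT1) ** N)
RHO0 = A * BIG_F * _t(LAT0) ** N


def state_plane_to_latlng(x_ft, y_ft):
    """Convert zone 6 easting/northing in US survey feet to (lat, lng) in degrees."""
    x = x_ft * US_FT - X0_M
    y = y_ft * US_FT - Y0_M
    rho = math.hypot(x, RHO0 - y)
    theta = math.atan2(x, RHO0 - y)
    t = (rho / (A * BIG_F)) ** (1 / N)
    lng = theta / N + LON0
    # Latitude has no closed form on the ellipsoid; iterate from the spherical guess.
    lat = math.pi / 2 - 2 * math.atan(t)
    for _ in range(10):
        s = E * math.sin(lat)
        lat = math.pi / 2 - 2 * math.atan(t * ((1 - s) / (1 + s)) ** (E / 2))
    return math.degrees(lat), math.degrees(lng)
```

The result is on the NAD83 datum, which should be close enough to WGS84 for placing parcels on a map. If the converted points land far from the county, the zone, datum, or units assumption is probably wrong.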
An update on new integrations
Added:
In progress:
Update on county integrations
Added:
In progress:
Napa County (for anyone interested in coding):
Good news @jakebayless - Napa was added by @miloconway in #11!
Other progress: SLO was added by @kevbuchanan, San Diego by @swingley
In progress: North County San Diego, Orange County (almost done)
Excellent. Ok. Let's dive into some more rural and ag counties... For anyone interested in coding this:
Butte County was also recently impacted by big fires, so it is perhaps useful and timely to have its tax info exposed. Butte tax records use the same web app as Sonoma County (reference that for code). The APN and year appear directly in the URL:
https://common2.mptsweb.com/MBC/butte/tax/main/053022019000/2020/0000
...and MAN! it took some digging, but here is the AGOL Butte County Parcels layer: https://services.arcgis.com/3t3QfTXFRFX44zo8/arcgis/rest/services/Butte_County_Parcels/FeatureServer
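A small sketch of how those two Butte endpoints might be used together. It assumes the parcels live in layer `0` of the FeatureServer and that pages of 1000 records are allowed. Neither has been checked against the service, and the function names are made up. The query parameters themselves (`where`, `outFields`, `outSR`, `resultOffset`, `resultRecordCount`, `f`) are standard ArcGIS REST ones.

```python
from urllib.parse import urlencode

SERVICE_URL = ("https://services.arcgis.com/3t3QfTXFRFX44zo8/arcgis/rest/services/"
               "Butte_County_Parcels/FeatureServer")


def parcel_query_url(offset=0, page_size=1000, layer=0):
    """URL for one page of parcels; layer id and page size are assumptions."""
    params = {
        "where": "1=1",            # no filter, return every parcel
        "outFields": "*",
        "returnGeometry": "true",
        "outSR": 4326,             # ask the server for lat/lng output
        "resultOffset": offset,
        "resultRecordCount": page_size,
        "f": "json",
    }
    return f"{SERVICE_URL}/{layer}/query?{urlencode(params)}"


def butte_tax_url(apn, year=2020):
    """Tax page URL following the pattern of the example above."""
    # The trailing "0000" segment is copied verbatim from the example URL.
    return f"https://common2.mptsweb.com/MBC/butte/tax/main/{apn}/{year}/0000"


if __name__ == "__main__":
    print(parcel_query_url(offset=0))
    print(butte_tax_url("053022019000"))
```

Paging would continue by bumping `offset` by `page_size` until a page comes back with no features.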
I'm currently looking at Fresno (I'm picking by population) and can take that one next unless someone else would like to.
Is there a master list of the counties added/needed so I/we can be methodical about researching the relevant endpoints? Maybe a sheet in drive or something we can share edits?
Initial investigation for Placer:
Parcel information is in AddressPoints.csv. You'd think you'd want Parcels.csv, but AddressPoints has a lat/lng centroid point while Parcels has some fields (like Shape__Area) which don't appear directly useful.
Tax info can be found at the URL: https://common3.mptsweb.com/MBC/placer/tax/main/__APN__/2020/0000
where __APN__ is the APN without the dashes, e.g. https://common3.mptsweb.com/MBC/placer/tax/main/466120044000/2020/0000
The tax amount is found in the "Totals - 1st and 2nd Installments" section.
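A minimal sketch of the APN-to-URL step for Placer. The dashed format `NNN-NNN-NNN-NNN` is taken from the `valid_apn_pattern` in the scraper snippet later in this thread. The dashed form of the example APN is inferred from that pattern, and the function name is made up.

```python
import re

# Dashed APN format, as in the scraper's valid_apn_pattern.
APN_PATTERN = re.compile(r"^\d{3}-\d{3}-\d{3}-\d{3}$")
URL_TEMPLATE = "https://common3.mptsweb.com/MBC/placer/tax/main/{apn_clean}/2020/0000"


def placer_tax_url(apn):
    """Return the tax page URL for a dashed Placer APN."""
    if not APN_PATTERN.match(apn):
        raise ValueError(f"unexpected APN format: {apn!r}")
    # The site expects the APN with the dashes removed.
    return URL_TEMPLATE.format(apn_clean=apn.replace("-", ""))


if __name__ == "__main__":
    print(placer_tax_url("466-120-044-000"))
```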
@jamesshannon The Placer system looks identical to Yolo County's, which means much of the code can be reused! https://github.com/typpo/ca-property-tax/tree/master/scrapers/yolo
I'm splitting discussion of Placer County off to #17
Data recently added:
In progress:
I've created a spreadsheet that summarizes the status of all California counties: link. The good news is that with 19 out of 58 counties, we are now covering 75% of the state's population.
I'd recommend that the sheet have a column to describe the Scraper data. According to https://www.mptsweb.net/ they support 35 counties. There are probably some other shared systems. If you hadn't pointed me to Yolo I wouldn't have known that I could reuse their code. If the spreadsheet had mentioned that Yolo scraped from mptsweb.com then that would have probably helped?
I'm running the parser on Placer right now and should have a CSV in a few minutes. I notice that, despite them being under 10 MB, you don't have the CSVs checked into git. How should I deliver it to you?
Also, I submitted a draft PR for my shared library. I have one or two more changes to submit but it's basically done. It handles CSVs and shapefiles with only a few pieces of configuration. My placer scraper script is basically this:
import os

# parcels, scrapers, and parsers are modules from the shared library in the draft PR.
DATA_DIR = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'data')

PARCELS_GEN = parcels.ParcelsShapefile('PL', 'APN', 'ADR1',
                                       parcels.centroidfn_from_shape(),
                                       os.path.join(DATA_DIR, 'Parcels.shp'))
PARCELS_GEN.valid_apn_pattern = r'^\d{3}-\d{3}-\d{3}-\d{3}$'

def scrape():
    scraper = scrapers.Scraper(PARCELS_GEN, DATA_DIR,
                               'https://common3.mptsweb.com/MBC/placer/tax/main/{apn_clean}/2020/0000')
    scraper.request_unsuccessful_string = '<title>ERROR</title>'
    scraper.scrape()

def parse():
    parser = parsers.ParserMegabyte(PARCELS_GEN, DATA_DIR)
    parser.parse()
@jamesshannon That's awesome. Thank you for your work on the generic scraper/parser, I've skimmed it but will take a more in-depth look as soon as I can. I've added a Notes column to the sheet.
The mptsweb site includes this handy graphic, which makes me a bit less worried about adding all those tiny rural counties:
We've been sending the processed CSVs by Drive/Dropbox. The data is small for individual counties, but in aggregate it's a couple hundred MB gzipped, which is too large for git/GitHub.
FYI:
Kern County Parcels File: https://geodat-kernco.opendata.arcgis.com/datasets/abe562bb259144a0a95e6b9899fd00b8_0
APN-based tax search: http://recorder.co.kern.ca.us/propertydetails.php?srctext=001020015&srctype=apn
The parcels file description says that:
Tax Roll Data is available in separate database tables, which can be joined to the feature class using the APN9 field as the SQL join key.
But I can't find this file.
Also, according to this page (http://assessor.co.kern.ca.us/gis_data.php), the 2020 parcels file is the most recent and they sell their GIS data. The shapefile I linked to is from 2019, so maybe it's an older public domain file?
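If the tax roll tables ever turn up, the join described in the file description would look roughly like this. Everything except the `APN9` key is hypothetical here: the table names, the other columns, and the sample rows are invented just to show the shape of the join.

```python
import sqlite3


def build_demo_db():
    """In-memory database with made-up parcels and tax roll rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parcels (APN9 TEXT PRIMARY KEY, lat REAL, lng REAL);
        CREATE TABLE tax_roll (APN9 TEXT, tax_year INTEGER, amount REAL);
    """)
    # Invented rows for illustration only.
    conn.executemany("INSERT INTO parcels VALUES (?, ?, ?)", [
        ("000000001", 35.1, -119.1),
        ("000000002", 35.2, -119.2),
    ])
    conn.executemany("INSERT INTO tax_roll VALUES (?, ?, ?)", [
        ("000000001", 2020, 1234.56),
        ("000000002", 2019, 789.0),
    ])
    return conn


def parcels_with_tax(conn, year):
    """Join parcels to the tax roll on APN9 for a single tax year."""
    return conn.execute(
        """
        SELECT p.APN9, p.lat, p.lng, t.amount
        FROM parcels AS p
        JOIN tax_roll AS t ON t.APN9 = p.APN9
        WHERE t.tax_year = ?
        ORDER BY p.APN9
        """,
        (year,),
    ).fetchall()


if __name__ == "__main__":
    print(parcels_with_tax(build_demo_db(), 2020))
```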
Nice! I was looking at this and thought the tax value was only available on a captcha-backed site. This looks much more viable. -Andy
Just added Placer & Kern
I'm trying to fill in some basic sleuth gaps in my spare moments for others to work with. Here are the URLs for Mendocino parcels as well as tax records:
Parcel layer: https://gis.mendocinocounty.org/server/rest/services/Parcels_sde_pub/MapServer/6
Tax records (APN in the URL): https://www.co.mendocino.ca.us/tax/cgi-bin/pTaxFR2.pl?apn=02710109&street=&situsAddr2=
See How to add a county in the README.
Up-to-date spreadsheet with status of each county: https://docs.google.com/spreadsheets/d/1Po5WNrADfJhO87xdHXWqRZDPydDAOH7vbppzsICLVXg/edit#gid=0