Open caged opened 9 years ago
Yes, I feel your pain. The info coming out of the City's current permitting DB leaves much to be desired. Good news is we're moving to a new system. Bad news is that it will be 2018 until it's fully up-and-running.
In the interim, a couple of options. If you only care about permits that create new residential units, we have all of those mapped going back to 1996 (current through the end of July). Happy to pass along. I can give you a point shapefile.
If you want all permits, we have an x/y for all those that reconcile to an existing taxlot via the taxlot's State ID. This is the information you see in PortlandMaps. Unfortunately, any that don't reconcile -- because the State ID no longer exists, it was typed incorrectly (I know), etc. -- are lost. Could be as many as 20% of all permits. We are currently building out a state ID history table will be used to better match permits to taxlots. This should have a much higher success rate than the current method. Hopefully we'll use this method to map all permits, develop a workflow for keeping it current, and then put the data out there for people to use sometime in the next 3-6 months. Going to involve some manual cleanup to get to the initial dataset. Working on demolitions first, as there's a lot of interest and not much info available.
Bottom line, address is a dead-end with the current system. The taxlot State ID is the best way to tie permits to a known location, in this case the X/Y centroid of the lot.
Hope that helps!
@ksmpdx Thanks for the quick and detailed response. You're pretty much my favorite person right now! :)
In the interim, a couple of options. If you only care about permits that create new residential units, we have all of those mapped going back to 1996 (current through the end of July). Happy to pass along. I can give you a point shapefile.
This would be perfect given my main interest is residential construction.
An update on my original post...after stripping " PORTLAND" from the permit addresses, I was able to match a little over 539k permits to official addresses. Some permits appear to be city-wide or large areas, so they don't contain a full address. I suspect that's where a bulk of the missing permits fall. Happy to provide the results if you think it'll be useful in any way.
Thanks again. If I haven't mentioned it before, the ability to have this type of dialog with you and others on the other side of the equation is greatly appreciated.
@Caged Sorry for the slow follow-up after the initial reply... just had a chance to remote in and track down the data. Posted here:
https://dl.dropboxusercontent.com/u/2836837/residential_permits_pdx_150907.zip
It's been updated through 8/31/15. Metadata does a reasonable job of explaining what's what, but let me know if you have questions. Farther back you go, less accurate the data is (especially true pre-2000).
Kudos on the address match! I'm impressed. We may want to compare work when we finish the taxlot reconcile. Should not be this hard. :-)
Agree, the dialog is great and beneficial to all. Thanks, and keep it coming.
@ksmpdx Thanks for the data. It looks to be in pretty good shape sans missing issuedate
for 2007 data. For whatever reason, that appears to be the only year missing (and one in 1998).
construction=# select year, count(*) from residential_permits where issuedate is null group by 1;
year | count
------+-------
1998 | 1
2007 | 188
Weird. Maybe something went wrong when I exported. Here's the summary I'm pulling from the data, by year/# of permits/# of units built:
YEAR,Count_Permits,Count_Units 1995,1341,2069 1996,1414,2739 1997,1563,3903 1998,1575,4202 1999,1142,2093 2000,1311,3459 2001,1337,2321 2002,1479,3365 2003,1591,5084 2004,1476,3194 2005,1633,3994 2006,1509,3503 2007,1514,4395 2008,853,3312 2009,509,1040 2010,534,1174 2011,598,1499 2012,839,2682 2013,1006,3951 2014,1093,5446 2015,853,3738
Let me know if you continue to see a discrepancy.... I can re-post. Also, forgot to mention "Is_ADU". We manually flag all Accessory Dwelling Unit (ADU) permits using some standardized search criteria, since our permitting DB does not consistently track. Could be useful if ADU development is something you are interested in.
Check out the average number of new units per permit. Interesting -- if not surprising -- trend.
@ksmpdx I get the same results as you if I group on the year
columns, however 2007 data is missing the issuedate
field.
year | permits | units |
---|---|---|
1995 | 1341 | 2069 |
1996 | 1414 | 2739 |
1997 | 1563 | 3903 |
1998 | 1575 | 4202 |
1999 | 1136 | 2093 |
2000 | 1310 | 3459 |
2001 | 1336 | 2321 |
2002 | 1479 | 3365 |
2003 | 1591 | 5084 |
2004 | 1474 | 3194 |
2005 | 1633 | 3994 |
2006 | 1507 | 3503 |
2007 | 1514 | 4395 |
2008 | 853 | 3312 |
2009 | 509 | 1040 |
2010 | 534 | 1174 |
2011 | 597 | 1499 |
2012 | 838 | 2682 |
2013 | 1004 | 3951 |
2014 | 1075 | 5446 |
2015 | 853 | 3738 |
Check out the average number of new units per permit. Interesting -- if not surprising -- trend.
It's interesting to look at the 90th percentile for units.
year | percentile_90 |
---|---|
2015 | 78 |
2014 | 112 |
2013 | 76 |
2012 | 64 |
2011 | 50 |
2010 | 47 |
2009 | 89 |
2008 | 70 |
2007 | 31 |
2006 | 21 |
2005 | 10 |
2004 | 12 |
2003 | 16 |
2002 | 15 |
2001 | 12 |
2000 | 39 |
1999 | 6 |
1998 | 17 |
1997 | 22 |
1996 | 12 |
1995 | 12 |
Units > 1
year | median_units |
---|---|
2015 | 3.00 |
2014 | 6.00 |
2013 | 7.00 |
2012 | 10.00 |
2011 | 4.00 |
2010 | 2.00 |
2009 | 2.00 |
2008 | 2.00 |
2007 | 2.00 |
2006 | 2.00 |
2005 | 3.00 |
2004 | 4.00 |
2003 | 3.00 |
2002 | 4.00 |
2001 | 4.00 |
2000 | 4.00 |
1999 | 2.00 |
1998 | 4.00 |
1997 | 3.00 |
1996 | 4.00 |
1995 | 2.00 |
Lot's of interesting stuff in here.
@caged Great to meet you last week. Let me know if you have any more questions about the permit data or you need anything else.
I'll also hijack this issue to put out this shameless plug: http://agency.governmentjobs.com/portlandor/default.cfm?action=viewJob&jobID=1234230
If you, @maxogden or @ngoldman know anyone who might be interested, please send it their way. Thanks!
@ksmpdx I'll pass the info around. The posting is for a job working with Rick Nixon's team, right? Is that BTS?
Correct! This is Rick's team, in BTS. Thanks for passing around.
@ksmpdx Is there a way to get the
address_id
, and potentially thex
,y
from the city's address data added to the building permits data? I'd like to see the geo location of the permits, but the data currently doesn't include this information. I've been trying to sync these up for a better part of a day by comparing the full addresses to each other, however I've run into multiple issues.The first challenge was a set of small issues that I bumped into before actually trying to compare the addresses.
snake_case_names
." Case Number"
is different than"Case Number"
.The next challenge in trying to sync these up manually is that there are 615,752 permits in the permit dataset and 416,424 address points. For each permit, we need to potentially scan every address to find one that matches, which is
615752 * 416424
. That's a potential 256,413,910,848 iterations in a worst-case scenario.There are ways to reduce the number of comparisons that have to be made, such as caching the address for the permit
permit_case_number
, since there are multiple entries for a single permit case number as the particular property has to go through multiple inspections.Now the problem moves on to comparing the actual addresses with each other. The major problem here is that it's a very finicky process and using a tool like csvjoin requires an exact match. I'm currently in this part of the process and I've hit a couple of snags pretty quickly.
There are double spaces in the full address strings. This can be difficult to spot and doesn't result in an exact match. It's easy enough to collapse spaces for the entire file. After doing this I ended up getting matches for roughly 40k permits, or about 9% of addresses and 6% of permits.
After getting the first batch of matches, I noticed that the reason more addresses aren't matching is because most of the permit data addresses end with " PORTLAND," however the address data file doesn't follow this pattern.
And, that's where I left things. I kinda felt like I went pretty far down the rabbit hole already, and maybe I should stop to see if I could get someone at the city to add this information if it's possible.
Thanks!