sketch-city / project-ideas

Running list of all project ideas - pick one and run with it!
http://sketch-city.github.io/project-ideas/

Where are you most likely to get a parking ticket? #43

Open fileunderjeff opened 8 years ago

fileunderjeff commented 8 years ago

Looking at parking citation data, create a map of where tickets occur. You could use this map to optimize the City of Houston Parking Enforcement operation, or simply to avoid certain areas at specific times.

jpoles1 commented 8 years ago

This is a very interesting idea! I think I'm going to look into it!

jpoles1 commented 8 years ago

I've started looking through the data, and decided to focus my initial analysis on which auto manufacturers tend to get the most tickets. I began by just looking at the raw ticket numbers, but I have also used market share to estimate the rate of ticketing.
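For readers following along, the market-share adjustment can be sketched roughly as below. This is a minimal illustration, not the code from the analysis: the function name, the input shapes, and the numbers are all made up, and it assumes market share is given as a fraction per make.

```python
from collections import Counter

def ticket_rate_by_make(ticket_makes, market_share):
    """Return (share of tickets) / (market share) for each make.

    A ratio above 1.0 means the make is ticketed more often than its
    market share alone would predict.
    """
    counts = Counter(ticket_makes)
    total = sum(counts.values())
    rates = {}
    for make, share in market_share.items():
        if share <= 0 or total == 0:
            continue  # cannot normalize without a positive share
        ticket_share = counts.get(make, 0) / total
        rates[make] = ticket_share / share
    return rates

# Made-up numbers, for illustration only.
tickets = ["TOYOTA"] * 30 + ["FORD"] * 50 + ["BMW"] * 20
shares = {"TOYOTA": 0.30, "FORD": 0.60, "BMW": 0.10}
rates = ticket_rate_by_make(tickets, shares)
```

With these invented inputs, a make holding 10% of the market but 20% of the tickets comes out at a ratio of about 2, while one whose ticket share matches its market share comes out at about 1.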

Unfortunately, data.houstontx.gov does not have the proper links to the most recent ticketing data, so the tickets being examined here are only those given prior to 6/30/2012. I'm not sure who to reach out to in order to get the proper data.

[Image: make_comparison]

fileunderjeff commented 8 years ago

What a cool look at the data! How interesting.

Are the datasets here incorrect? http://data.houstontx.gov/dataset/city-of-houston-parking-citations

If they are correct, we should have tickets through May 2015. I will see if we can refresh the portal with more recent ticket activity prior to the hackathon!

fileunderjeff commented 8 years ago

Also, I wonder if you control for scofflaws (i.e. people with 5+ tickets a year), how that adjusted ticket frequency might change?

jpoles1 commented 8 years ago

Thanks Jeff! If you try to download the two datasets listed there, you'll find that they link you to the same file (the 2012 data).

Also if anyone else is interested in working on this project with me, drop a line in the comments! Unfortunately, I will not be in town for the hackathon :(

fileunderjeff commented 8 years ago

@jpoles1 I sent an inquiry to the open data people at the city to refresh this data. Hopefully we'll get that done quickly!

frank0051 commented 8 years ago

@jpoles1 : we'll work on getting something refreshed for you next week. I apologize for the inconvenience. We're also going to try to toss in lat and long (no promises though) in time for the Hackathon in case there is any interest in mapping.

On scofflaws, since we don't release any identifying information on offenders that may be a little difficult to do, but you could try the Entity_ID field. That said, the system isn't the smartest in the world so it isn't unusual for a person or business to have multiple entity IDs.
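A rough sketch of what controlling for scofflaws with Entity_ID might look like is below. Only the Entity_ID field name comes from this thread; the "year" key, the function name, and the sample rows are assumptions for illustration. Because one person can have several entity IDs, this would likely undercount scofflaws.

```python
from collections import Counter

SCOFFLAW_THRESHOLD = 5  # "5+ tickets a year", as suggested earlier in the thread

def split_scofflaws(citations, threshold=SCOFFLAW_THRESHOLD):
    """Split citations into (regular, scofflaw) lists.

    Each citation is a dict with "Entity_ID" and "year" keys. An entity
    counts as a scofflaw for a given year if it has `threshold` or more
    citations in that year.
    """
    per_entity_year = Counter((c["Entity_ID"], c["year"]) for c in citations)
    regular, scofflaw = [], []
    for c in citations:
        key = (c["Entity_ID"], c["year"])
        if per_entity_year[key] >= threshold:
            scofflaw.append(c)
        else:
            regular.append(c)
    return regular, scofflaw

# Hypothetical rows: entity "A" has six tickets in 2014, entity "B" has two.
sample = [{"Entity_ID": "A", "year": 2014} for _ in range(6)]
sample += [{"Entity_ID": "B", "year": 2014} for _ in range(2)]
regular, scofflaw = split_scofflaws(sample)
```

The per-make analysis could then be rerun on the `regular` list alone and compared against the full dataset.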

fileunderjeff commented 8 years ago

@frank0051 lat/long is cool, but i think the data already has block and street (e.g. 1500 block, scott street). That might be enough for mapping purposes. Current data is more of a pressing need.

fileunderjeff commented 8 years ago

@frank0051 @jpoles1 generally, you can track scofflaws by license plate #. It looks like you guys scrub that info before releasing. Any reason why? You can publicly query citations by license plate # through T2's system.

frank0051 commented 8 years ago

@jpoles1 : I fixed the URL going to the wrong place for the data through 3/31/2015 (as of 5/1/2015) so you can at least get data through last year now.

frank0051 commented 8 years ago

@fileunderjeff : Our policy on Open Data allows for sensitive information to be removed prior to release. The Parking Management Division classifies license plates as identifiable information and was uncomfortable with releasing it. We checked with our Legal Department and they deem driver's license numbers, license plate numbers, and VINs as sensitive as well and suggested that the plate numbers shouldn't be proactively released due to Texas Government Code 552.130. That said, somehow insurance companies, law firms, traffic accident schools, and car companies trying to sell extended warranties all manage to get the information after a citation is written so I suspect it's available under the Texas Public Information Act but I don't know for sure.

The cities of Austin, Dallas and Fort Worth along with Bexar County don't release parking citation information on open data (or really any citation information). Fort Worth does release car accident info without license plates. San Antonio and El Paso don't seem to have an open data program from what I can tell. If you would like to explore further, please provide us feedback at https://cityforms.houstontx.gov/component/rsform/form/81-open-data-portal-ideas-and-feedback

jpoles1 commented 8 years ago

@fileunderjeff @frank0051 Thanks so much for your help with this project!

Lat/Long data would actually be very helpful. I was considering this problem last night, and with my current knowledge of GIS analysis I think I would need that information for mapping. If coords are not provided in the dataset, I think the only other way to get them is by geocoding the block and street of each citation, which would be slow (and potentially not free given the size of this dataset).
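One way to keep the lookup cost down, sketched here under assumptions, is to look up each distinct block and street only once and reuse the result for every citation at that location. The "block" and "street" keys and the `geocode` callable are hypothetical stand-ins; no particular geocoding service is implied.

```python
def geocode_unique_blocks(citations, geocode):
    """Look up coordinates once per distinct (block, street) pair.

    `geocode` is any callable that takes an address string and returns
    (lat, lon) or None. It stands in for whatever service is used.
    """
    cache = {}
    for c in citations:
        # Normalize the street so "scott street" and "SCOTT STREET" match.
        key = (c["block"], c["street"].strip().upper())
        if key not in cache:
            cache[key] = geocode(f"{key[0]} {key[1]}, Houston, TX")
    return cache

# Illustration with a fake geocoder that just records what it was asked.
requested = []

def fake_geocode(address):
    requested.append(address)
    return (29.7, -95.3)  # placeholder coordinates, not a real lookup

rows = [
    {"block": 1500, "street": "Scott Street"},
    {"block": 1500, "street": " scott street "},
    {"block": 900, "street": "Main St"},
]
coords = geocode_unique_blocks(rows, fake_geocode)
```

Here three citations trigger only two lookups, so the saving grows with how often the same blocks repeat in the data.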

jpoles1 commented 8 years ago

@frank0051 Analysis of car accident data sounds like it could also yield some useful findings. Is such data available here in Houston (I could not find it with a quick search of the data portal)?

fileunderjeff commented 8 years ago

Thanks @frank0051. Here is what I wrote:

"I would like to know more about parking scofflaw data, however, the citation data set on the open data portal is incomplete and inconsistent. The best solution would be to release license plate information for each citation. If that is not possible, I would suggest getting a citation count by license plate in T2 and adding it to your citation report."

I am very familiar with the parking system in question, and the report addition should be very simple and very helpful.

jpoles1 commented 8 years ago

In addition, I just glanced at it, but this parking data might also be interesting to examine for this project.

fileunderjeff commented 8 years ago

@frank0051 thank you for the explanation. I really appreciate it. For posterity, here are a few interesting links on the subject and why I believe the City and State positions are wrong:

https://www.techdirt.com/articles/20130722/11435123886/license-plate-data-isnt-personally-identifiable-its-too-private-public-to-access.shtml

http://jalopnik.com/why-do-we-always-blur-license-plates-on-the-internet-1691298199

http://motherboard.vice.com/blog/why-wont-cops-share-the-license-plate-data-they-collect

and so on.

jpoles1 commented 8 years ago

@frank0051 Thanks for getting the updated dataset available so quickly, I'm going to take a look now!

jpoles1 commented 8 years ago

Here's a look at the same analysis I performed earlier on the new data (post-2012):

[Image: parking tickets by make]

jpoles1 commented 8 years ago

I've put my code in this repo if anyone would like to take a look. In addition, if you're interested in working with me on this project, drop a line below, and we'll find a way to get you the SQLite database I have been using (it's too large to upload to github).

frank0051 commented 8 years ago

@jpoles1, @fileunderjeff updated the data and broke it into smaller files. Added a lat and long where we had it in the system. Pretty sparse at this point. http://data.ohouston.org/dataset/city-of-houston-parking-citations

jpoles1 commented 8 years ago

@frank0051 Why is it that only certain records have lat/long data associated with them?

frank0051 commented 8 years ago

It's in the metadata. The handhelds have only been transmitting since mid 2014 and with some frequency the handhelds fail to record the coordinates. Feel free to use that feedback form mentioned earlier to suggest we geocode all entries for more reliable results.
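For anyone mapping with the sparse coordinates, a minimal sketch of counting tickets per grid cell while skipping rows without coordinates follows. The "lat" and "lon" keys, the cell size, and the sample points are assumptions, not the dataset's actual column names or values.

```python
import math
from collections import Counter

def bin_by_grid(citations, cell_deg=0.01):
    """Count citations per grid cell; return (counts, skipped).

    Rows with a missing latitude or longitude are counted in `skipped`
    rather than silently dropped, so the coverage gap stays visible.
    """
    counts = Counter()
    skipped = 0
    for c in citations:
        lat, lon = c.get("lat"), c.get("lon")
        if lat is None or lon is None:
            skipped += 1
            continue
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts, skipped

# Invented points roughly in the Houston area, plus one row with no fix.
points = [
    {"lat": 29.7604, "lon": -95.3698},
    {"lat": 29.7651, "lon": -95.3612},
    {"lat": 29.8051, "lon": -95.4023},
    {"lat": None, "lon": None},
]
cells, missing = bin_by_grid(points)
```

Reporting `missing` alongside any map would make it clear how much of the dataset the map actually represents.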

jpoles1 commented 8 years ago

I've managed to incorporate the lat/long data into the report with some maps! They're not perfect, but they add some interesting insights to the report.

This project was also picked up by a Houston real-estate blog!


[Image: Ticket map]

frank0051 commented 8 years ago

@jpoles1 , @fileunderjeff : we're a go for adding a column into the dataset that tells you how many citations have been issued for the plate number associated with the ticket. We cannot share out the plate number though in our current state.

Do you have suggestions on how you would like this count column to work? Would you like it to be a count of:

fileunderjeff commented 8 years ago

@frank0051 awesome news! All three would be amazing, and we could derive useful insights from each one. But the first two would be the most interesting to me, with "all citations issued to that plate" being the top priority.

fileunderjeff commented 8 years ago

@frank0051 is it possible to request citation report data as a separate dataset? to be able to reconcile citations issued with citations paid and confirm/seek to improve the city's collection rate?

frank0051 commented 8 years ago

@fileunderjeff : not sure I'm following the logic re: citation report data as a separate data set? Our current citations dataset includes all of the citations and the status as to whether they're paid. We don't include a last payment date as we don't extract all of the financial history into our reporting environment as it would be too large compared to the time it would take and the value it would add.

jpoles1 commented 8 years ago

@frank0051 If we cannot get license plate data, is there a possibility that we can instead get a non-personally-identifying, unique ID # (based on the plate) for each row?
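To make the idea concrete, here is one possible sketch of such an ID using a keyed hash (HMAC) from the Python standard library. This is only an illustration and not how the City generates anything; the function name, the normalization rules, and the key are assumptions. A plain unkeyed hash would be a weak choice here, since the space of possible plates is small enough to enumerate, which is why the sketch relies on a secret key held by the publisher.

```python
import hashlib
import hmac

def pseudonymize_plate(plate, secret_key):
    """Map a plate to a stable ID that cannot be recomputed without the key.

    The same plate always yields the same ID under one key, so citations
    can still be grouped per vehicle.
    """
    # Normalize so formatting differences do not split one plate in two.
    normalized = plate.replace(" ", "").replace("-", "").upper()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability

# Hypothetical key and plates, for illustration only.
key = b"example-key-kept-private-by-the-publisher"
id_a = pseudonymize_plate("abc 1234", key)
id_b = pseudonymize_plate("ABC-1234", key)
id_c = pseudonymize_plate("XYZ 9876", key)
```

In this sketch the first two inputs produce the same ID and the third a different one, which is the property needed for counting citations per plate.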

I agree with @fileunderjeff that "all citations issued to that plate" would be the most useful piece of information for identifying the "scofflaws"

frank0051 commented 8 years ago

@jpoles1 : http://performance.houstontx.gov/opendata/parking-scofflaw-data

That's where things stand on providing anything that would allow you to tie back to a plate per what we provided back to Jeff. I would be happy to assist you through the TPIA process to see if Legal is willing to offer additional clarification or reconsider. Unfortunately, as the City does not have an Enterprise Data Officer currently appointed, we currently do not have a policy-based conduit under which to facilitate a formal review on the validity of the security concern given existing resources without a formalized request. Given this, to better ensure your request is addressed, I need to refer you to the TPIA process.

frank0051 commented 8 years ago

@jpoles1 , @fileunderjeff : I designed the logic to count a) the outstanding tickets at the point the extract is run and b) all citations issued in our system to that license plate at the time the extract is run. When I started thinking through how it could be used I got rather frustrated at its limited potential for analysis purposes. Even though we do not have an Enterprise Data Officer at this time, I went ahead and brought the issue back to Legal for a possible work-around. As such, we will be releasing the data with an anonymized plate number. We will try to get it out there this afternoon. We will also be making a minor change as to how Officer is shared.
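With an anonymized plate number in the data, the two counts described above could be derived by anyone downstream. A minimal sketch, assuming hypothetical "plate_id" and "paid" keys rather than the dataset's real column names:

```python
from collections import Counter

def add_plate_counts(citations):
    """Attach per-plate totals to every citation row.

    Adds "total_citations" (all citations for the plate) and
    "outstanding_citations" (those not yet paid) as of the extract.
    """
    total = Counter(c["plate_id"] for c in citations)
    outstanding = Counter(c["plate_id"] for c in citations if not c["paid"])
    return [
        {
            **c,
            "total_citations": total[c["plate_id"]],
            "outstanding_citations": outstanding[c["plate_id"]],
        }
        for c in citations
    ]

# Invented rows: plate "p1" has three citations, one of them paid.
extract = [
    {"plate_id": "p1", "paid": True},
    {"plate_id": "p1", "paid": False},
    {"plate_id": "p1", "paid": False},
    {"plate_id": "p2", "paid": True},
]
with_counts = add_plate_counts(extract)
```

Both counts are snapshots: rerunning this on a later extract would give different numbers for the same plate as tickets are issued or paid.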

fileunderjeff commented 8 years ago

@frank0051 that is amazing. thank you!

frank0051 commented 8 years ago

@jpoles1 , @fileunderjeff : The parking citation dataset has been updated to include an anonymized plate number and citation dollar amounts. At an undetermined point in the future, the City will work to automate the release of this dataset to a monthly release schedule. Please consult the metadata file regularly as the way the anonymized license plate functions may change.

fileunderjeff commented 7 years ago

@jpoles1 any interest in refreshing your study with current data?

jpoles1 commented 7 years ago

Sure, I could get behind that. Would also be great to come up with a list of any possible concrete goals or applications for this data/project.

fileunderjeff commented 7 years ago

@jpoles1 I have emailed some folks inside the city who might have some good ideas where we can go with this. I've invited them to chime in on this thread, so hopefully we'll get a reboot going soon!

jpoles1 commented 7 years ago

Excellent! Will be interested to hear some perspectives on what direction we should take this analysis.