18F / data-federation-project

A project focused on tools and best practices to support federated data collection efforts

Interview: Tim Wisniewski (PHL CDO) #20

anthonygarvan closed 6 years ago

anthonygarvan commented 6 years ago

Introductory comments

What kinds of data does the city collect, and from whom?

Sometimes there's a big contrast between the day-to-day reality [of data standards] and the realities on the ground: systems range from mainframes to modern web apps. One example: it would be really great if you could type in a property owner's name and see their addresses and contact info. Then you look at the property assessment system, and it's a mainframe where the owner field has 16 characters. Sometimes the name is truncated, sometimes it spills into multiple fields. Sometimes just getting data out of a mainframe into an RDB can bring you miles forward. "Many departments have been sharing data in an ad-hoc way for quite some time." Open data brings in more transparency and technology to.
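To make the 16-character owner field concrete, here is a minimal sketch of how a name that spilled across several fixed-width fields might be stitched back together once it is out of the mainframe. The function name, the list-of-fields input, and the "last field is completely full" truncation heuristic are all assumptions for illustration; nothing here describes the city's actual property assessment system.

```python
FIELD_WIDTH = 16  # owner field width mentioned in the notes


def rebuild_owner(fields, width=FIELD_WIDTH):
    """Rejoin an owner name spread across fixed-width fields.

    Returns (name, maybe_truncated). Hypothetical helper: the layout of the
    real mainframe record is not known.
    """
    # Pad every field back to its fixed width so that a name which overflowed
    # mid-word ("MONTGO" + "MERY") is rejoined without a stray space.
    padded = [(f or "")[:width].ljust(width) for f in fields]
    # Collapse the padding into single spaces.
    name = " ".join("".join(padded).split())
    # Heuristic: if the last non-empty field is completely full, the name may
    # have been cut off rather than finished.
    used = [p for p in padded if p.strip()]
    maybe_truncated = bool(used) and len(used[-1].rstrip()) == width
    return name, maybe_truncated


if __name__ == "__main__":
    print(rebuild_owner(["ELIZABETH MONTGO", "MERY-SMITH"]))
    print(rebuild_owner(["ELIZABETH MONTGO"]))
```

The truncation flag is only a hint: a name that happens to be exactly 16 characters long would be flagged too, so a real cleanup would still need manual review.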

What was the impetus or driving force for this effort to collect data: policy, user needs, etc.? (perhaps after the first question)

In collecting data, what were the biggest challenges, and what went smoothly?

For keeping metadata.philly.gov up to date: trained department staff on how to use it, got feedback on what was confusing, added tips and notes. They were using ArcCatalog before; the new catalog is public and works in the browser. Still have to nudge and remind, but there's a process in place to check that the metadata is up to date before data is published. "Departments come to us pretty frequently asking us to publish data." "A lot of support for it." "Sometimes it's publishing to the public, sometimes it's just sharing with other departments." Sometimes department staff fill out the metadata, sometimes the CDO's team fills it out for them. Why publish to the public? A lot of departments do amazing work. Publishing data is usually accompanied by visualizations and a blog post or press release. Departments often get requests from the media or city council, and publishing reduces that burden. "The easier you can make it to comply, the more people will comply." Generally people "get it." "It's more that everybody's got a million things going on." "The easier that you can make it... like making the UI streamlined for common workflows." "Employing user experience design, where the users are data publishers."
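The pre-publication metadata check mentioned above could look roughly like the sketch below. The required field names, the `last_reviewed` key, and the one-year freshness window are hypothetical; the notes do not say what the actual process checks.

```python
from datetime import date, timedelta

# Hypothetical rules; the real checklist is not described in the notes.
REQUIRED_FIELDS = ("title", "description", "maintainer")
MAX_AGE = timedelta(days=365)


def metadata_problems(record, today):
    """Return a list of reasons a dataset's metadata is not ready to publish."""
    problems = [
        f"missing {name}"
        for name in REQUIRED_FIELDS
        if not str(record.get(name) or "").strip()
    ]
    reviewed = record.get("last_reviewed")
    if reviewed is None:
        problems.append("never reviewed")
    elif today - reviewed > MAX_AGE:
        problems.append("review is stale")
    return problems


if __name__ == "__main__":
    record = {
        "title": "Property assessments",
        "description": "",
        "maintainer": "Office of Property Assessment",
        "last_reviewed": date(2015, 3, 1),
    }
    # An empty list would mean the dataset can go out.
    print(metadata_problems(record, today=date(2017, 6, 1)))
```

Returning reasons rather than a bare pass/fail fits the "make it easy to comply" point: the publisher sees exactly what to fix.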

What tools and technologies do you use for this effort?

OpenDataPhilly: data is stored in CartoDB. ETL from city databases to Carto: Python scripts, scheduled in an in-house tool called taskflow (scheduler / background task runner). Carto provides download links. CKAN hosts the links. Application builder Knack for metadata entry: a SaaS product (like MS Access in the cloud).
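As a rough illustration of the kind of Python ETL script described above, the sketch below pulls rows from a database, tidies them, and hands a CSV to an upload step. It is not the city's code: the function names are made up, SQLite stands in for a city database, and `upload` is a placeholder callable standing in for whatever pushes data to Carto. How such a job would be registered with taskflow is not shown, since that tool's interface is not described in the notes.

```python
import csv
import io
import sqlite3


def extract(conn, query):
    """Run a query and return (column names, rows)."""
    cur = conn.execute(query)
    columns = [c[0] for c in cur.description]
    return columns, cur.fetchall()


def transform(columns, rows):
    """Trim stray whitespace from text values; other values pass through."""
    cleaned = [[v.strip() if isinstance(v, str) else v for v in row] for row in rows]
    return columns, cleaned


def to_csv(columns, rows):
    """Serialize the result as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def run_job(conn, query, upload):
    """One ETL run: extract, transform, then pass the CSV to `upload`."""
    columns, rows = transform(*extract(conn, query))
    upload(to_csv(columns, rows))  # placeholder for the real load step
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE permits (id INTEGER, address TEXT)")
    conn.execute("INSERT INTO permits VALUES (1, ' 123 Main St ')")
    run_job(conn, "SELECT id, address FROM permits ORDER BY id", print)
```

Keeping the load step as a passed-in callable means the same job can be exercised locally (with `print`, as here) and pointed at a real destination by a scheduler.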

Why did you choose this architecture or process? Were others tried, etc.? (after the "data aggregation/distribution" question)

"A result of trying it other ways and learning what the pain points were." "Constantly trying to consolidate, and simultaneously improve that infrastructure." Used Socrata before Carto, published to GitHub before that. Vizwit made to provide some visualization capability th. "Largely because we have been able to embrace open source software and publish open source software, it has given us a lot of flexibility."

What are the political and organizational dynamics of collecting this data?

Who were the relevant stakeholders for this project, how were they identified and convened?

Is there anyone else I should speak with to better understand X?