18F / data-federation-project

A project focused on tools and best practices to support federated data collection efforts

Tyler Kleykamp Interview #13

Closed anthonygarvan closed 6 years ago

anthonygarvan commented 6 years ago

Introductory comments

What experience do you have that might be relevant to this effort? (e.g., work with open data standards, participation in gov data collection across organizational boundaries, etc.)

"probably a better way to do this, don't know what that better way is" "pretty interested in this flow of data from the state to the federal government."

previous: Recovery Act, when federal agencies pumped more funding into state programs.

Data collected from municipalities:

challenges / what went well: real estate report, aggregated from municipalities. Getting data is easy. 10 categories: residential, commercial, industrial, vacant, etc. One town may classify an apartment building as residential, while another classifies it as apartments. Still have paper-based reporting systems; small towns don't have a full-time person working there.
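The category mismatch described above (one town's "residential" is another town's "apartments") is the kind of thing a small crosswalk table can surface. The following is only a minimal sketch: the shared category names are taken from the notes, while the local labels, the `LOCAL_TO_STANDARD` table, and the `normalize_category` helper are hypothetical and not part of any system discussed in the interview.

```python
# Shared categories named in the notes (the full list of 10 is not given).
STANDARD_CATEGORIES = {"residential", "commercial", "industrial", "vacant", "apartments"}

# Hypothetical crosswalk from town-specific labels to the shared categories.
LOCAL_TO_STANDARD = {
    "apartment building": "apartments",
    "multi-family": "apartments",
    "single family": "residential",
    "retail": "commercial",
}


def normalize_category(label):
    """Return the shared category for a town's label, or None if it is unmapped."""
    key = label.strip().lower()
    if key in STANDARD_CATEGORIES:
        return key
    return LOCAL_TO_STANDARD.get(key)
```

Returning `None` for unmapped labels, rather than guessing, leaves the decision with the people who know the data; those labels can be collected and reviewed during the stakeholder process.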

Uniform Crime Reports example: some departments are still using triplicate carbon paper.

What was the impetus or driving force for this effort: policy, user needs, etc.? (perhaps after the first question)

90% of the time, state or federal law. Laws mandate reporting of data to states from municipalities. Some cities are struggling, and there is some effort to monitor the fiscal health of cities. Requested data from towns, got some resistance. Implemented a law that would allow us to get timely information. Big problem: if reporting happens only at the end of the fiscal year, the data is not timely enough to be useful.

In building X, what were the biggest challenges, and what went smoothly?

smoothly: financial incentives or support work. Example: provided grants to get municipalities off QuickBooks, got great compliance rates. Another example: grants to develop parcel data; if you take the money, you must use our data and give us the data back.

for reporting data up to feds: "as far as I can tell it's a fairly smooth process". Usually federal software is provided and funded. For example: state drinking water. Use fed-provided software as the database. But the data is "trapped" in federal systems. Hypothesis: if we had a better way to make the data more readily available as open data, or access-controlled tables on the web.

What tools and technologies do you use for this effort?

"it's all over the place" - you can email, there's a "portal" to upload. More sophisticated systems where you upload CSV / excel. Standardization - either spelled out in statute or (in more successful models) requires agency to come up with standard, usually focused on ontology, usually done through considerable stakeholder engagement. Iterative process. Draft / feedback / etc.

QA / QC after aggregation. Compare to year before, is something way different? Lots of knowledgeable people, things jump out at them. Some work before it's suitable for analysis. Vast majority is for mandated annual report. More recently goes into dashboard etc online.
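The "compare to year before, is something way different?" check could be partly automated as a first pass before the knowledgeable reviewers look at the data. This is a minimal sketch under stated assumptions: the function name `flag_large_changes`, the dictionary-per-year input shape, and the 50% default threshold are all hypothetical choices for illustration, not something described in the interview.

```python
def flag_large_changes(previous, current, threshold=0.5):
    """Return {key: relative_change} for values that moved more than `threshold`.

    `previous` and `current` map a reporting unit (e.g. a town name) to a number.
    Units with no usable baseline (missing or zero last year) are skipped.
    """
    flagged = {}
    for key, new_value in current.items():
        old_value = previous.get(key)
        if old_value is None or old_value == 0:
            continue  # nothing to compare against
        change = (new_value - old_value) / abs(old_value)
        if abs(change) > threshold:
            flagged[key] = change
    return flagged


last_year = {"Town A": 100, "Town B": 200}
this_year = {"Town A": 105, "Town B": 450, "Town C": 10}
print(flag_large_changes(last_year, this_year))
```

A flag here only means "worth a second look"; a large change may be real, so the output is meant to feed human review rather than reject a submission.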

Why did you choose this architecture or process? Were others tried, etc.? (after the "data aggregation/distribution" question)

What are the political and organizational dynamics of collecting this data?

broadly, a lot of the people who are ultimately responsible for submitting / collecting data are not technologists and are not thinking about what other uses for the data might be out there. Getting people to think outside the bounds of the exact report they're creating is challenging. Often there's a mandate, but no carrots or sticks. A lot of reluctant participation. Often done begrudgingly.

Who were the relevant stakeholders for this project, how were they identified and convened?

Is there anyone else I should speak with to better understand X?

What efforts are you aware of that fit this category?

Do you have contacts in those efforts who we could reach out to?

What do you think are the primary challenges of these types of efforts?

Would an open source toolkit help?

"No shortage of tech based tools where you can load a spreadsheet in this column aligns with this column in this database" (e.g., basic ETL). Perhaps process for uploading & running basic validations. If some of the data were just more available, that might help.

tkleykamp commented 6 years ago

Other than misspelling my last name (a frequent problem of mine - it's Kleykamp) looks like a pretty accurate summary of our conversation.

anthonygarvan commented 6 years ago

Major facepalm, so sorry!