What is the data federation effort all about? What am I looking to get out of it?
This is a collaborative research project with GSA's Office of Products & Platforms and 18F. The goal is to build a toolkit / playbook for undertaking intra-governmental data collection / aggregation projects, such as data.gov, code.gov, and NIEM, where data is collected from entities over which you do not have direct authority. We call these federated data efforts. Our goal is to find out what works, what doesn't, and which tools are appropriate for which circumstances, in order to accelerate similar efforts in the future.
Notes & Report will be public
Any questions before we get started?
What experience do you have that might be relevant to this effort? (e.g., work with open data standards, participation in gov data collection across organizational boundaries, etc.)
"A lot": worked in NYC government for a number of years and participated in Open311. Also one of the early partners in the LIVES standard from Yelp. Also BLDS (Building and Land Development Specification). Also the library of standards at datastandards.directory. Has been involved in direct development of standards, and also involved with city government in adoption.
What efforts are you aware of that fit this category?
60 in datastandards.directory. Open Referral: very federated.
Do you have contacts in those efforts who we could reach out to?
Greg Bloom for Open Referral. Please remind him.
What do you think are the primary challenges of these types of efforts?
"I think there's a few... one is that there is no central authority for those types of data." Tons of variety in state and local government, both in terms of data structure and organizational structure.
Restaurant inspections: the federal government makes recommendations (through the FDA), but gives no guidance on how much weight to apply to each item. Cities / counties weight things however they want and add items of their own. How each inspection is done varies widely by jurisdiction.
"Organizational ego comes into this as well." "People will do what it takes to get the data into the format, so we're not going to align to a common set of standards." A little bit of ego, some territorialism, some varying business practices by jurisdiction.
What incentivizes people or overcomes challenges?
When there's a clear use case with a very clear path to adoption. Example: Google Transit, a clear use case with broad potential for adoption. There are fewer of those clear use cases in other domains. If there is a clear value-add business case for both the provider and the consumer, it's going to happen.
Another one is benchmarking. There is a big appetite for it among small cities. Federated standards could be successful there.
What about tools & technologies?
Mailing lists, discussion boards, Slack, and decent websites are good places for convening people, along with biweekly / monthly calls. On the technology side: "I don't think there's anything that great". One concept is data packages: getting data and metadata together in zip files. As soon as you get into taxonomies, it varies quite widely. NIEM.
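Interviewer's illustration (not from the interviewee): the "data package" idea mentioned above, data and metadata travelling together in one zip file, can be sketched roughly as follows. This is a minimal sketch under stated assumptions: the function names, the file names `datapackage.json` and `data.csv`, and the descriptor layout are hypothetical choices loosely modeled on the Frictionless Data Package convention, not a conformant implementation of any spec.

```python
import io
import json
import zipfile


def build_data_package(name, csv_text, fields):
    """Bundle one CSV file and a JSON metadata descriptor into a zip archive.

    Returns the archive as bytes. The descriptor layout is an assumption
    made for illustration, not a published standard.
    """
    descriptor = {
        "name": name,
        "resources": [
            {"path": "data.csv", "schema": {"fields": fields}},
        ],
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Metadata and data are written side by side in the same archive.
        archive.writestr("datapackage.json", json.dumps(descriptor, indent=2))
        archive.writestr("data.csv", csv_text)
    return buffer.getvalue()


def read_descriptor(package_bytes):
    """Read the metadata descriptor back out of a packaged archive."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return json.loads(archive.read("datapackage.json"))


if __name__ == "__main__":
    package = build_data_package(
        "inspections",
        "id,score\n1,92\n",
        [{"name": "id", "type": "integer"}, {"name": "score", "type": "integer"}],
    )
    print(read_descriptor(package)["resources"][0]["path"])
```

A consumer can read the descriptor first to learn the column names and types before touching the data file. That covers the packaging step only; it does nothing to reconcile differing taxonomies across jurisdictions.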
What are some organizational challenges faced by these efforts?
"Both political and egotistical." "Fear of failure or risk of exposure can also be a barrier." E.g., people might have delinquent policies / lax standards for inspections.
Data ingestion is a challenge: riskier than publishing. Concerns about security, abuse, accuracy. Hesitant to ingest data from non-government sources. Both gov and non-gov data pose problems / liability for ingestion. Long-term data exchange relationships develop legacy issues over time, so ingesting new data is a problem there as well.
"There are a lot of barriers, both technical and human, to doing this type of work, but standards in particular are the key to getting data sharing to scale." "I do think there are these needs for inter-organizational managing bodies to coordinate this stuff." "Need an independent convener, otherwise it can stall out or lose political momentum."
What's the best home for efforts like this?
e.g., the LIVES spec by Yelp is dominated by the private sector and doesn't work for a lot of cities.
e.g., the transit spec: successful ownership by the private sector. For inter-organizational conversations, a third party is very helpful as a mediator.