project-open-data / project-open-data.github.io

Open Data Policy — Managing Information as an Asset
https://project-open-data.cio.gov/

Clarify the schema requirements or provide an alternate version for non-federal sources #247

Closed philipashlock closed 9 years ago

philipashlock commented 10 years ago

The data.json schema is already based on the DCAT standard, but clearly its particular implementation of it is intended to serve as a standard for not only the federal government but others as well. For this to happen, there needs to be a delineation in the schema for which fields and requirements are specific to federal agencies.

For resources like data.gov this is important to help them federate data sources from state and local government based on a compatible data.json schema used by those entities.

JeanneHolm commented 10 years ago

+1

georgethomas commented 10 years ago

from http://www.w3.org/TR/vocab-dcat/;

"A DCAT profile is a specification for data catalogs that adds additional constraints to DCAT. A data catalog that conforms to the profile also conforms to DCAT. Additional constraints in a profile may include:

I believe some profile examples of other governments using DCAT can be found here;

http://www.w3.org/2011/gld/wiki/DCAT_Implementations

mhogeweg commented 10 years ago

Would a state/local implementation be a profile of DCAT proper, or a profile 'on top of' the Data.gov profile? Would state/local DCAT have to conform to the same validation rules as federal DCAT? That would be hard if there are fields such as bureauCode or programCode that have a federal focus. Perhaps the definition of those fields can be extended to include all agencies/programs that share data through Data.gov?
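
To make the layering question concrete, here is a minimal sketch of one possible arrangement, in which the federal profile is the non-federal profile plus extra required fields. Only bureauCode and programCode come from this thread; the base field list, the profile names, and the `missing_fields` helper are hypothetical and are not the official schema.

```python
# Illustrative field lists only; NOT the official Project Open Data schema.
BASE_REQUIRED = {"title", "description", "identifier"}  # assumed common core
FEDERAL_ONLY_REQUIRED = {"bureauCode", "programCode"}   # fields named in this thread

# The federal profile layers extra constraints on top of the shared base,
# so anything valid under "federal" is also valid under "non-federal".
PROFILES = {
    "non-federal": BASE_REQUIRED,
    "federal": BASE_REQUIRED | FEDERAL_ONLY_REQUIRED,
}


def missing_fields(dataset: dict, profile: str) -> list[str]:
    """Return the required fields that are absent or empty under a profile."""
    required = PROFILES[profile]
    return sorted(field for field in required if not dataset.get(field))


# Hypothetical dataset entry from a city portal.
city_dataset = {
    "title": "Street Tree Inventory",
    "description": "Locations and species of city-maintained trees.",
    "identifier": "city-trees-001",
}

print(missing_fields(city_dataset, "non-federal"))
print(missing_fields(city_dataset, "federal"))
```

Under this arrangement the city entry passes the non-federal profile and fails the federal one only on the two federally focused fields, which is the situation the question above is describing.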

gbinal commented 10 years ago

Just to check, it seems like an initial version of this would be as simple as noting that the following fields do not apply to non-federal agencies:

I don't necessarily know what form that would take, but I figure that's the meat of it. I imagine it's more complicated than that, though.

nsinai commented 10 years ago

+1

@gbinal those look right

Is there a minimum? E.g., a similar structure with required, required if applicable, etc.
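
One way to picture such a tiered minimum is sketched below. It assumes that a missing "required" field is an error, while a missing "required if applicable" field is only flagged for human review, since software cannot tell whether the field applies. The field-to-level assignments and the `check_levels` helper are hypothetical.

```python
# Hypothetical assignment of fields to requirement levels, for illustration only.
LEVELS = {
    "title": "required",
    "description": "required",
    "license": "required-if-applicable",
    "theme": "optional",
}


def check_levels(dataset: dict) -> tuple[list[str], list[str]]:
    """Split missing fields into hard errors and items needing review."""
    errors, review = [], []
    for field, level in sorted(LEVELS.items()):
        if dataset.get(field):
            continue  # field is present and non-empty
        if level == "required":
            errors.append(field)
        elif level == "required-if-applicable":
            review.append(field)
        # Missing optional fields are ignored.
    return errors, review


errors, review = check_levels({"title": "Restaurant Inspections"})
print("errors:", errors)
print("needs review:", review)
```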

ianjkalin commented 10 years ago

As a benchmark for non-Federal participation, check out these automatically generated data.json files from city, county and state open data portals:

https://data.raleighnc.gov/data.json
https://data.montgomerycountymd.gov/data.json
https://data.ny.gov/data.json

You'll notice that each government is choosing to use its own metadata template. But there is a good deal of common-sense overlap with the Federal metadata standard.
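
A rough way to measure that overlap is to compare the field names a catalog actually uses against a reference list. The sketch below works on an already-parsed data.json and does no network access. The sample catalog, the reference field list, and the `field_overlap` helper are hypothetical, and the code assumes a catalog is either a bare list of dataset entries or an object holding them under a "dataset" key.

```python
def field_overlap(catalog, reference_fields: set[str]) -> dict:
    """Compare the fields used in a parsed data.json with a reference set."""
    # Assumption: accept either a bare list of datasets or a wrapper object.
    datasets = catalog if isinstance(catalog, list) else catalog.get("dataset", [])

    used = set()
    for entry in datasets:
        used.update(entry.keys())

    return {
        "shared": sorted(used & reference_fields),
        "local_only": sorted(used - reference_fields),
        "unused_reference": sorted(reference_fields - used),
    }


# Hypothetical sample data, not taken from any of the portals listed above.
sample_catalog = [
    {"title": "Building Permits", "description": "Issued permits.", "category": "Housing"},
    {"title": "Bus Routes", "description": "Transit routes.", "keyword": ["transit"]},
]
reference = {"title", "description", "keyword", "bureauCode", "programCode"}

print(field_overlap(sample_catalog, reference))
```

In this sample, the "unused_reference" entries are exactly the federally focused fields, and "local_only" shows where a portal has chosen its own template.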

jpvelez commented 10 years ago

Forgive me if this has already been discussed elsewhere, but what are the potential uses of this open data metadata standard? The best way to prioritize what fields should be in or out is to think about who needs to use them.

Here are a few use cases I'm familiar with:

EDIT: looks like the conversation and work are incredibly far advanced, and the fields are largely settled, which is awesome. This can just be a roundup of work that's proposed or being done with this kind of metadata.

philipashlock commented 10 years ago

@jpvelez Thanks for sharing those! Your comment is essentially a first draft of content that could comprise a new section of the Project Open Data website around case studies or opportunities for the use of the schema. @gbinal Perhaps we can carve out a heading on the frontpage where it would make sense to put things like this?

rebeccawilliams commented 10 years ago

Of note: we discussed this briefly at Thursday's Common-Core Metadata Schema Review (see #325) where an alternative schema for non-federal sources was generally +1'ed, with a particular emphasis on removing federally focused fields for non-federal sources.

Left unresolved was: should additional fields be required for non-federal sources (e.g. license)?

gbinal commented 10 years ago

@philipashlock - agreed. I think linking to a state/local section from the homepage makes sense.

gbinal commented 10 years ago

Also, I recommend taking the question of any additional required fields for non-federal sources and treating that as a separate issue from this.

rebeccawilliams commented 10 years ago

@jpvelez @philipashlock I'm adding the new Metadata: Existing Practices and Survey from the DataSF Resources page to the list of resources here, and I'm happy to discuss additions in a new issue.

mhogeweg commented 10 years ago

A discussion about metadata practices is incomplete without the work done by FGDC, the work at states in the GIS Inventory, and the international metadata initiatives like INSPIRE (Europe), GEMINI (UK), ANZLIC (Australia/New Zealand), ...

gbinal commented 10 years ago

Thanks for the pull request, @philipashlock. Do you think that it would be sufficient to address this issue?

philipashlock commented 10 years ago

@gbinal I think it's most of the way there. I think it'd be useful to also have a separate section that provides more background for non-federal sources on different requirements, and I'll update that (now as part of the v1.1 branch - thanks for accepting the PR). I'll also make an update as part of v1.1 to move programCode and bureauCode up to the required section now that we've called out the federal USG specific fields.

gbinal commented 9 years ago

Thanks everyone for driving the conversation around this issue and helping to assemble the v1.1 metadata update.

There appears to be strong consensus around this issue, which has been accepted in the v1.1 update and merged into Project Open Data. Project Open Data is a living project though. Please continue any conversations around how the schema can be improved with new issues and pull requests!

It's important for government staff as well as the public to continue to collaborate to make the Open Data Policy ever better. Though the v1.1 update is a substantial update, future iterations do not have to be, so whatever your ideas - big or small - please continue to work with this community to improve how government manages and opens its data.