UTS-eResearch / datacrate

Bagit-based data packaging specification for dissemination of research data with useful human and machine readable metadata: "Make Data Crate Again!"

Support for turning other data packages into DataCrates #32

Closed xrotwang closed 5 years ago

xrotwang commented 5 years ago

DataCrate looks like a useful addition to CLDF (https://github.com/cldf/cldf/blob/master/README.md), a data format for cross-linguistic data based on CSVW. When specifying CLDF, we intentionally left out the "zipped or not" and archiving aspects, considering that options like BagIt already exist. Do you intend to provide support for turning other package formats, like CSVW or datapackage, into DataCrates semi-automatically? If such support is implemented in Python, I could also help with the effort.

ptsefton commented 5 years ago

Hi, sorry about the delay in getting back to you. We're happy to help seed the adoption of DataCrate; datapackage conversion should be super easy, and the CLDF stuff looks interesting. If you'd like to start on one, I'm happy to help out. Basically, all that's needed is something that puts a CATALOG.json file in the root of a dataset, as per the spec / examples, then uses Calcyte to generate HTML, bag, and zip it.
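As a rough illustration of the first step, here is a minimal Python sketch that drops a CATALOG.json into the root of a dataset directory. The function name `write_catalog` is hypothetical, and the JSON-LD shape (a schema.org `Dataset` whose `hasPart` lists the files, each typed `MediaObject`) is an assumption, not a copy of the DataCrate spec; check the spec / examples for the exact context and required properties before relying on it.

```python
import json
from pathlib import Path


def write_catalog(root, name, description):
    """Write a minimal JSON-LD CATALOG.json into the dataset root.

    The vocabulary used here is assumed for illustration; the DataCrate
    spec defines the real context and required properties.
    """
    root = Path(root)
    out = root / "CATALOG.json"
    # Every regular file under the root, except the catalog itself.
    files = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p != out
    )
    # One Dataset entity for the root, pointing at each file by @id.
    graph = [
        {
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "hasPart": [{"@id": f} for f in files],
        }
    ]
    # One entity per file, so the graph is flat rather than nested.
    graph += [{"@id": f, "@type": "MediaObject"} for f in files]
    catalog = {"@context": "http://schema.org/", "@graph": graph}
    out.write_text(json.dumps(catalog, indent=2), encoding="utf-8")
    return out
```

Calling `write_catalog("my_dataset", "My dataset", "Short description")` would leave `my_dataset/CATALOG.json` in place; generating HTML, bagging, and zipping would still be left to Calcyte.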

I'll add datapackage to my list of things we can consider supporting. The problem is with its extensibility: because it doesn't use linked data principles, there's no easy way to support arbitrary metadata that people might add, but I don't know how much of an issue that would be in practice.

xrotwang commented 5 years ago

The package format CLDF uses actually isn't datapackage, but W3C's Model for Tabular Data and Metadata on the Web, which we use through (our own) csvw package. So this format does use JSON-LD for metadata.
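To make that concrete, here is a small sketch, using only the standard library rather than the csvw package, that pulls the pieces a converter would most likely need out of a CSVW metadata document: a title and the table URLs. The function name `csvw_summary` and the example metadata are made up for illustration, and it assumes `dc:title` is a plain string (CSVW also allows language maps there).

```python
import json


def csvw_summary(metadata_text):
    """Return (title, table URLs) from CSVW metadata given as a JSON string."""
    meta = json.loads(metadata_text)
    # A table group lists its tables under "tables"; a single-table
    # description carries "url" at the top level instead.
    tables = meta.get("tables", [meta] if "url" in meta else [])
    urls = [t["url"] for t in tables if "url" in t]
    return meta.get("dc:title"), urls


# Hypothetical metadata, shaped like a small CSVW table group.
example = """{
  "@context": "http://www.w3.org/ns/csvw",
  "dc:title": "Example wordlist",
  "tables": [{"url": "forms.csv"}, {"url": "languages.csv"}]
}"""
print(csvw_summary(example))
```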

What makes me interested in DataCrate is that it may provide a standardized way to put a web catalog on top of the dataset collections we currently aggregate using Zenodo Communities.

Anyway, I'll have a play with Calcyte (although I'm saddened to see that the Python implementation seems to have been discontinued :) ) and see where I get.

ptsefton commented 5 years ago

We could look at resurrecting the Python version of Calcyte, at least for going from CATALOG.json to HTML.

ptsefton commented 5 years ago

Closing this - please feel free to get back to us if needed.

xrotwang commented 5 years ago

ok. will keep an eye on DataCrates - and the other half of the globe.