ipfs-inactive / archives

[ARCHIVED] Repo to coordinate archival efforts with IPFS
https://awesome.ipfs.io/datasets

Archive package file #45

Open jbenet opened 8 years ago

jbenet commented 8 years ago

Following from https://github.com/ipfs/archives/issues/25

@eminence said:

For each archive, we need a standard way to record some metadata with the archive. At the moment, the most important thing to include is licensing information, but we may find other information that we would like to require.

This issue is to track the discussion on this topic. Below is a draft proposal, with two examples. All aspects of this proposal are open for discussion.

  • Metadata should be stored in a file called _Metadata.json. The name is designed so that it'll appear near the top of directory listings.
  • The json object is a dictionary with the following keys:
    • title -- Provides a name for the archive
    • description -- A more verbose description, if needed
    • source -- A list of URLs where this data came from
    • license -- An array of dictionaries listing the relevant licenses. Each has the following keys:
      • summary -- a brief summary of the license
      • source -- Where to find the license/legal terms in full
    • last_synched -- an ISO 8601 timestamp indicating the last time this archive was updated
  • I think to start "license" and "title" should be required, others can be optional

For two concrete examples, see the metadata for #23 and the metadata for #18
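As a rough illustration of the draft above, here is a minimal Python sketch that builds a `_Metadata.json` object and checks it against the proposed rules. The helper name `validate_metadata`, the example values, and the example.org URLs are hypothetical; treating `summary` and `source` as mandatory inside each license entry is my reading of the draft, not something it states outright.

```python
import json
from datetime import datetime, timezone

# Keys the draft proposes as required; every other key is optional.
REQUIRED_KEYS = ("title", "license")


def validate_metadata(meta):
    """Return a list of problems; an empty list means the draft rules are met."""
    problems = [f"missing required key: {k}" for k in REQUIRED_KEYS if k not in meta]
    # Assumption: each license entry carries both "summary" and "source".
    for i, lic in enumerate(meta.get("license", [])):
        for k in ("summary", "source"):
            if k not in lic:
                problems.append(f"license[{i}] missing key: {k}")
    if "last_synched" in meta:
        try:
            datetime.fromisoformat(meta["last_synched"])
        except (ValueError, TypeError):
            problems.append("last_synched is not an ISO 8601 timestamp")
    return problems


# Hypothetical archive metadata, not taken from #23 or #18.
example = {
    "title": "Example archive",
    "description": "A longer description of the example archive.",
    "source": ["https://example.org/data/"],
    "license": [
        {"summary": "Public domain", "source": "https://example.org/license"}
    ],
    "last_synched": datetime(2016, 1, 1, tzinfo=timezone.utc).isoformat(),
}

if __name__ == "__main__":
    print(validate_metadata(example))
    print(json.dumps(example, indent=2))
```

A check like this could run before an archive is published, so that a missing license is caught early.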

Other thoughts:

Should the metadata include maintainer information? Should the metadata include the script/tool that was used to sync/update the archive? That might be useful if the current maintainer goes away. CC #5 for related discussion.

jbenet commented 8 years ago

@davidar said

:+1:

However, instead of inventing our own format, ideally we could use an existing standard. For example:

jbenet commented 8 years ago

there are other comments (you can go look at them), but the most salient point from there is:

use OKFN's data-packages format: http://dataprotocols.org/data-packages/
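To make the switch concrete, here is a hedged Python sketch that maps the earlier draft keys onto a `datapackage.json`-style dictionary. The top-level keys `name`, `title`, `description`, `sources` and `licenses` come from the Data Package format. The sub-keys used here (`path`, `title`) have varied between revisions of the spec, so treat them as assumptions and check them against the version you target. The function name `to_datapackage` and the sample values are hypothetical, and carrying `last_synched` over as an extra key is also my assumption.

```python
import json


def to_datapackage(meta, name):
    """Map the draft _Metadata.json keys onto a Data Package-style dict."""
    pkg = {"name": name, "title": meta["title"]}
    if "description" in meta:
        pkg["description"] = meta["description"]
    # Assumption: each source URL becomes an object with a "path" key.
    pkg["sources"] = [{"path": url} for url in meta.get("source", [])]
    # Assumption: license summary -> "title", license source -> "path".
    pkg["licenses"] = [
        {"title": lic["summary"], "path": lic["source"]}
        for lic in meta.get("license", [])
    ]
    # No obvious standard equivalent; kept under its original name.
    if "last_synched" in meta:
        pkg["last_synched"] = meta["last_synched"]
    return pkg


if __name__ == "__main__":
    draft = {
        "title": "Example archive",
        "source": ["https://example.org/data/"],
        "license": [
            {"summary": "Public domain", "source": "https://example.org/license"}
        ],
    }
    print(json.dumps(to_datapackage(draft, "example-archive"), indent=2))
```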

eminence commented 8 years ago

And here is my attempt to apply this format to the RFC archive (see datapackage.json)

https://ipfs.io/ipfs/QmePrRhWnpNrZxkKFsQhd9NsVb3HQ1AvuvbWQhB3hBccYo

jbenet commented 8 years ago

@eminence awesome! looks good