ropensci / codemetar

an R package for generating and working with codemeta
https://docs.ropensci.org/codemetar

Implement DataONE upload utility #4

Open cboettig opened 7 years ago

cboettig commented 7 years ago

Create a simple function to release a package to DataONE. Could include the following features:

mbjones commented 7 years ago

This is great. I like the snapshot approach used by Zenodo. The codemeta.json document would represent the metadata and so would need to be inserted outside of the archive file (but probably also be included inside the archive file). Let's discuss exactly how this would work.
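One way to picture "metadata both outside and inside the archive" is the rough sketch below. It is only an illustration under stated assumptions: `snapshot_with_metadata` is a hypothetical helper, not part of codemetar or any DataONE client, and it assumes the package directory already contains a `codemeta.json` at its top level. No upload is performed.

```python
import shutil
import tarfile
from pathlib import Path


def snapshot_with_metadata(package_dir, out_dir):
    """Archive package_dir and place a copy of its codemeta.json beside the archive.

    Hypothetical helper for illustration only; it does not talk to DataONE.
    """
    package_dir = Path(package_dir).resolve()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    metadata_src = package_dir / "codemeta.json"
    if not metadata_src.is_file():
        raise FileNotFoundError(f"no codemeta.json in {package_dir}")

    # The archive keeps codemeta.json inside the snapshot...
    archive_path = out_dir / f"{package_dir.name}.tar.gz"
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(package_dir, arcname=package_dir.name)

    # ...and a separate copy sits outside it, to serve as the metadata document.
    metadata_copy = out_dir / "codemeta.json"
    shutil.copyfile(metadata_src, metadata_copy)
    return archive_path, metadata_copy
```

The two returned paths would then be the two objects handed to whatever upload step is eventually agreed on.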

I think the Git tag approach should be supported.

cboettig commented 7 years ago

Currently Zenodo will import from the GitHub repo and guess the metadata for the archive, primarily from the information the GitHub API exposes for the repo, leaving you to correct the metadata in the Zenodo web interface after the fact (e.g. for me this often involves adjusting the author list to include authors who don't use git but contribute intellectually, and maybe removing authors who just sent some random pull request).

Alternatively, you can add a zenodo.json file, and Zenodo will parse that to get the metadata. (This uses the same JSON format that Zenodo displays in its web interface, though it is currently an unadvertised feature since Lars is waiting to swap in codemeta.json functionality instead; but see all current GitHub examples using zenodo.json.) In most of those examples it appears users are basically using this as a mechanism to supplement the metadata record Zenodo ultimately creates, without having to add it manually after each import. That is particularly important given what I see as Zenodo's killer feature in this space: automatically importing a snapshot and assigning a DOI whenever it detects a new tag/release on GitHub. No need for the user to also remember to update Zenodo when they make a release. (This also applies to Zenodo's DOI badges in READMEs, which automatically update to point to the latest DOI.)
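As a rough illustration of supplementing the guessed metadata, the sketch below curates an author list and serializes it in a zenodo.json-like shape. The helper `curated_creators`, the package title, and all names are made up for the example; the field names (`title`, `upload_type`, `creators`, `affiliation`) follow my understanding of Zenodo's deposit metadata and should be checked against its documentation before relying on them.

```python
import json


def curated_creators(guessed, add=(), drop=()):
    """Return the guessed creators minus `drop`, plus any new entries in `add`."""
    dropped = set(drop)
    kept = [c for c in guessed if c["name"] not in dropped]
    known = {c["name"] for c in kept}
    kept.extend(c for c in add if c["name"] not in known)
    return kept


# Hypothetical author list, as might be guessed from commit history.
guessed = [{"name": "Doe, Jane"}, {"name": "Drive-by, Contributor"}]

metadata = {
    "title": "examplepkg: a hypothetical R package",
    "upload_type": "software",
    "creators": curated_creators(
        guessed,
        # An intellectual contributor who never committed to the repo.
        add=[{"name": "Roe, Richard", "affiliation": "Example University"}],
        # Someone who only sent a one-off pull request.
        drop=["Drive-by, Contributor"],
    ),
}

zenodo_json = json.dumps(metadata, indent=2)
print(zenodo_json)
```

Committing a file like this to the repo would, in principle, save redoing the same corrections in the web interface after every release.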

I generally like the idea of codemeta.json being part of the archive; I think that fits with Arfon's vision of the cool things you could do if all GitHub repos had such a metadata file in the same format. I think the only trick is to get the identifier into the metadata file before uploading it to the permanent archive (we've discussed the same issue with respect to packageId in EML, though I think the upshot there was to just have multiple identifiers?).
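A minimal sketch of the "multiple identifiers" idea, assuming the archive identifier can be reserved before upload: the hypothetical `add_identifier` helper below appends a new identifier to a codemeta-style record while keeping whatever identifier was already there. The DOI shown is a placeholder, not a real record.

```python
import json


def add_identifier(codemeta, new_id):
    """Return a copy of the record whose "identifier" also includes new_id."""
    updated = dict(codemeta)
    existing = updated.get("identifier", [])
    if not isinstance(existing, list):
        existing = [existing]
    if new_id not in existing:
        existing = existing + [new_id]
    # Keep a single identifier as a plain string, several as a list.
    updated["identifier"] = existing[0] if len(existing) == 1 else existing
    return updated


# Hypothetical metadata record for illustration.
meta = {
    "@type": "SoftwareSourceCode",
    "name": "examplepkg",
    "identifier": "examplepkg",
}

# Placeholder DOI standing in for an identifier reserved ahead of the upload.
stamped = add_identifier(meta, "https://doi.org/10.5072/example-doi")
print(json.dumps(stamped, indent=2))
```

The stamped record would be written back to codemeta.json before the snapshot is built, so the archived copy already carries its own identifier.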