ProjectDrawdown / solutions

The mission of Project Drawdown is to help the world reach “Drawdown”— the point in the future when levels of greenhouse gases in the atmosphere stop climbing and start to steadily decline, thereby stopping catastrophic climate change — as quickly, safely, and equitably as possible.
https://www.drawdown.org/
Other
218 stars 92 forks source link

Add unit test for zenodo.json schema #102

Closed DentonGentry closed 4 years ago

DentonGentry commented 4 years ago

The software in this repository will likely be used to provide data used in academic publications. It is intended that any papers published acknowledge contributors to this codebase. https://github.com/ProjectDrawdown/solutions/blob/master/.zenodo.json allows developers to specify their name, any institutional affiliation, and other fields in order to automatically generate acknowledgements.

This is a JSON file, following a schema available at https://zenodo.org/schemas/records/record-v1.0.0.json

A unit test should be added to verify that zenodo.json conforms to the schema, and fail continuous integration tests if a pull request would break it.

https://python-jsonschema.readthedocs.io/en/stable/ may be useful for this. ui/tests/test_vega.py is an example already in the tree of using jsonschema to validate a JSON file.

Sunishchal commented 4 years ago

Hi Denton,

Does the scope of this issue also include fixing the zenodo.json object to pass this unit test? I've run jsonschema.validate for the zenodo JSON and identified the following issues:

Is there a particular directory in which this unit test should exist? I would put it in the same place as the .zenodo.json file, but that's the topmost directory for this repo, so I'm guessing we don't want any unit tests there.

My unit test currently reads the JSON directly from the zenodo.org URL provided above. I see that the vega schema is saved in the repo. Do you prefer me to download the zenodo schema to be read locally?

DentonGentry commented 4 years ago

Sure, please update zenodo.json to pass the schema test.

'upload_type' can be removed. I think that was present in the example I followed in http://blog.chrisgorgolewski.org/2017/11/sharing-academic-credit-in-open-source.html, but isn't necessary.

It looks like the Invenio record identifier is an incrementing integer so I guess we can start at 1.

There is a tests/ directory at the top level which holds various integration and system tests, and would be a reasonable place to put this test I think.

Please check in the zenodo schema file, we'd prefer that intermittent network errors not make the tests fail. If zenodo revs the schema and we want to use new fields, we'd check in the updated schema. I'd suggest creating a tests/data directory and put the schema there, as we follow that practice elsewhere like model/tests/data.

Sunishchal commented 4 years ago

Thanks for the input Denton. I found the list of requried items in the JSON: 'required': ['recid', 'resource_type', 'publication_date', 'title', 'creators', 'description', 'access_right']

So far, we have recid and creators. For resource_type, I guess we would pass "software" (similar to our upload_type from before)? The remaining items I will defer to your guidance on. Really curious how the example .zenodo.json from nipype is successfully interfacing with zenodo...

Good call on downloading the schema locally. When I went to the zenodo URL last night to do so, their DB was down for several hours...

DentonGentry commented 4 years ago

How about this? As the access_right is 'open' we also need to add a license field.

Sunishchal commented 4 years ago

Okay, I've got the JSON validation to pass after making the following modifications: "resource_type": { "type": "software" }, "license": { "type": "AGPL-3.0" }

It was complaining that the format you provided above was string, not object, so I wrapped them in their own JSON objects. The only thing I'm unsure about is they keys within these objects. I defauled to using "type" because there was no more detailed spec in the zenodo schema. This seems to not throw any errors, so hopefully we can go with it. Let me know if you foresee issues with this approach.

I'll go ahead and submit the PR now.