ResearchObject / ro-crate-zenodo

RO-Crate uploader for Zenodo
Apache License 2.0

CQ5: Preserve license information #5

Open stain opened 5 months ago

stain commented 5 months ago

RO-Crate requires license to be set on the top-level Dataset (and optionally on other files). See https://www.researchobject.org/ro-crate/1.1/contextual-entities.html#licensing-access-control-and-copyright
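As a rough sketch of where that license sits in the metadata, the snippet below looks up the root Dataset through the `ro-crate-metadata.json` descriptor and returns its license entity. The function name `root_license` is made up for illustration and is not part of this repository; the sketch assumes the crate has already been loaded into a plain dict and only handles the common shapes of the `license` value.

```python
from typing import Optional


def root_license(crate: dict) -> Optional[dict]:
    """Return the license entity of the root Dataset, or None if absent.

    Hypothetical helper: assumes `crate` is the parsed content of
    ro-crate-metadata.json with a flat "@graph" list.
    """
    entities = {e["@id"]: e for e in crate.get("@graph", []) if "@id" in e}

    # The metadata descriptor points at the root Dataset via "about".
    descriptor = entities.get("ro-crate-metadata.json")
    if descriptor is None:
        return None
    root = entities.get(descriptor.get("about", {}).get("@id"))
    if root is None:
        return None

    license_ref = root.get("license")
    if isinstance(license_ref, list):
        # Several licenses listed: this sketch only looks at the first.
        license_ref = license_ref[0] if license_ref else None
    if isinstance(license_ref, dict):
        # Prefer the full contextual entity; fall back to the bare reference.
        return entities.get(license_ref.get("@id"), license_ref)
    if isinstance(license_ref, str):
        # A plain string is treated as a URI if it looks like one, else as a name.
        key = "@id" if license_ref.startswith(("http://", "https://")) else "name"
        return {key: license_ref}
    return None
```

The returned dict keeps both `@id` and `name` when the crate provides a contextual entity, which are the two fields any later matching against Zenodo would need.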

When selecting a corresponding license in Zenodo, matching by @id and/or its name may be needed, as Zenodo has a fixed list accessible from the API: https://developers.zenodo.org/#licenses
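A minimal sketch of that matching, first by @id and then by name, could look like the following. It makes no network calls: `zenodo_licenses` is assumed to be a list of records already fetched from the Zenodo API, and the field names `id`, `title` and `url` are assumptions about the record shape that should be checked against the actual response. `match_zenodo_license` is a hypothetical name.

```python
from typing import Iterable, Optional


def _strip(uri: str) -> str:
    """Lower-case a URI and drop the scheme and any trailing slash."""
    uri = uri.strip().lower()
    for scheme in ("https://", "http://"):
        if uri.startswith(scheme):
            uri = uri[len(scheme):]
    return uri.rstrip("/")


def match_zenodo_license(crate_license: dict,
                         zenodo_licenses: Iterable[dict]) -> Optional[str]:
    """Return the Zenodo license id matching an RO-Crate license entity.

    Assumes each Zenodo record has "id", "title" and "url" keys
    (an assumption, not confirmed here).
    """
    records = list(zenodo_licenses)
    crate_id = _strip(crate_license.get("@id", ""))
    crate_name = crate_license.get("name", "").strip().lower()

    # First pass: compare the RO-Crate @id with the record URL.
    for rec in records:
        if crate_id and _strip(rec.get("url", "")) == crate_id:
            return rec["id"]

    # Second pass: fall back to a case-insensitive name comparison.
    for rec in records:
        if crate_name and rec.get("title", "").strip().lower() == crate_name:
            return rec["id"]
    return None
```

Trying the URI before the name is a design choice: names vary more freely between sources, so they are only used as a fallback.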

A full list of known open source licenses is available at https://spdx.org/licenses and in machine-readable format at https://github.com/spdx/license-list-data. However, these assume URIs like http://spdx.org/licenses/Apache-2.0, whereas RO-Crate at best has https://spdx.org/licenses/Apache-2.0.html (note the subtle difference), as that is what is presented in a browser.

More commonly, licenses in RO-Crate may have the upstream URI, like https://www.apache.org/licenses/LICENSE-2.0 or one of the Creative Commons licenses.
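To illustrate the URI variants, here is a small sketch that reduces either SPDX form, or a known upstream URI, to an SPDX identifier. The `UPSTREAM_TO_SPDX` table is hand-written for this example and covers only three licenses; the names `spdx_id_from_uri` and `UPSTREAM_TO_SPDX` are hypothetical, and a real mapping would need many more entries.

```python
from typing import Optional
from urllib.parse import urlparse

# Illustrative, hand-picked upstream URIs (host + path, lower-cased).
# Not a complete mapping.
UPSTREAM_TO_SPDX = {
    "www.apache.org/licenses/license-2.0": "Apache-2.0",
    "creativecommons.org/licenses/by/4.0": "CC-BY-4.0",
    "creativecommons.org/publicdomain/zero/1.0": "CC0-1.0",
}


def spdx_id_from_uri(uri: str) -> Optional[str]:
    """Map a license URI to an SPDX identifier, or None if unknown."""
    parsed = urlparse(uri.strip())
    host = parsed.netloc.lower()
    path = parsed.path.rstrip("/")

    # SPDX URIs: accept http or https, with or without .html / .json.
    if host == "spdx.org" and path.startswith("/licenses/"):
        name = path[len("/licenses/"):]
        for suffix in (".html", ".json"):
            if name.endswith(suffix):
                name = name[: -len(suffix)]
        return name or None

    # Upstream URIs: ignore scheme, case and trailing slash.
    return UPSTREAM_TO_SPDX.get(f"{host}{path}".lower())
```

With this, `http://spdx.org/licenses/Apache-2.0`, `https://spdx.org/licenses/Apache-2.0.html` and `https://www.apache.org/licenses/LICENSE-2.0` all resolve to the same identifier, while an unlisted URI returns `None` and would still need a heuristic or manual handling.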

stain commented 2 months ago

This may be tricky because existing licenses can have multiple URIs. The SPDX list is bigger than what is in Zenodo (~300). Perhaps manual testing on the few most common licenses would do, but some heuristics are still needed here as there is no full mapping.

Zenodo has a list of URIs, but it is not SPDX, so the matching is not automatic. May want to explore how Zenodo has done codemeta.json support in its GitHub integration.

stain commented 1 month ago

Partly done, but you have to be very specific. The main focus is now on ro-crate-inveniordm.