epiverse-trace / epiparameter

R package with library of epidemiological parameters for infectious diseases and functions and classes for working with parameters
https://epiverse-trace.github.io/epiparameter

Data license? #300

Closed: chartgerink closed this issue 1 week ago

chartgerink commented 1 month ago

Currently epiparameter provides original data, but that data does not carry a license. The MIT license covers the code, but applying it to data is generally not recommended, as it is not intended for that purpose (just as CC licenses are not intended for code).

I don't want to debate whether data can be copyrighted to begin with (I can if desired 😊); I do think that not having a license creates uncertainty about how freely the data can be reused.

I would recommend using a separate license for the data, specifically a Public Domain Dedication. This is maximally permissive and also the standard for (meta)data that is intended to be reused and recompiled. CC BY becomes problematic when compiling databases, as the record of where every data point came from quickly outgrows the actual data (especially in larger databases).

I also mentioned this in the Collaboratory forum for the GREP database: https://collab-forum.who.int/t/epiparameter-hackathon-discussion-thread/186/2?u=chartgerink

joshwlambert commented 1 month ago

@chartgerink and I met to discuss the best approach for adding a data license to the {epiparameter} R package. Here I will summarise our discussion and the key points.

joshwlambert commented 1 week ago

From looking into this topic, it seems that R packages can hold either a single license or multiple licenses.

The Writing R Extensions manual states:

Multiple licences can be specified separated by ‘|’ (surrounded by spaces) in which case the user can choose any of the alternatives.

These are alternatives that the user chooses between for the whole package, so this setup does not work for packages in which one license applies to the code and another to the data.
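To illustrate the point, here is a minimal sketch (in Python, purely for illustration) of how a `License` field reads when split on the ` | ` separator. The helper name `license_alternatives` and the example field values are hypothetical and not taken from {epiparameter} or from R's own tooling. The sketch only shows that each entry is a whole-package alternative, with nowhere to say "this part is code, that part is data".

```python
def license_alternatives(license_field: str) -> list[str]:
    """Split a DESCRIPTION-style License field on ' | ' (a pipe surrounded by spaces).

    Hypothetical helper for illustration only; it is not how R itself
    parses or validates the field.
    """
    return [alt.strip() for alt in license_field.split(" | ")]


# Hypothetical example values for the License field.
single = license_alternatives("MIT + file LICENSE")
multiple = license_alternatives("GPL-2 | GPL-3")

# Each entry is one alternative the user may pick for the entire package;
# none of them is tied to a component such as "code" or "data".
print(single)
print(multiple)
```

Under this reading, listing two licenses separated by ` | ` would offer users a choice between them rather than assign one to the code and one to the data.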

To circumvent this restriction, other packages have separated their code from their data and released them as an R package and a separate data package. See https://github.com/ropensci/unconf17/issues/61 for a discussion on package and data licensing.

We will keep the license of the {epiparameter} package as MIT.

For now we want to keep {epiparameter} as a single package with data and code, but we also want to address the outstanding data licensing issue that @chartgerink raised here. We propose following the convention of {igraphdata}, which circumvents this restriction by specifying several licenses in the LICENSE file.

I will add text to the LICENSE file to specify that all data in the {epiparameter} package is licensed under CC0.

chartgerink commented 1 week ago

Confirming that @joshwlambert and I agreed on this way forward. This note is primarily for reference, as I'm out of office for a few weeks, so that people don't need to check in with me before updating this. 😊