bio2rdf / bio2rdf-rest-talend

A RESTful interface to the Bio2RDF network of data.
MIT License

Stop uploading dataset descriptions to CKAN/DataHub as CC-BY #14

Closed ansell closed 9 years ago

ansell commented 9 years ago

The vast majority of the datasets in Bio2RDF are not licensed under CC-BY, but the DataHub upload script https://github.com/bio2rdf/bio2rdf-rest/commit/0f390157e8793d2322d550fd87996d1b4a242b3e has hardcoded that they are all CC-BY licensed, and that they are all "open" in the OpenDefinition.org sense, which they are most certainly not if you look into their licensing conditions one by one.

Please do something intelligent to map the actual licenses to the DataHub scheme, defaulting to "not-open" if the license is unknown, or if the license is known but does not fulfill all of the OpenDefinition.org points.
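A minimal sketch of the kind of mapping requested above, assuming the upload script can see each dataset's source license as a string. The function name, the table contents, and the license ids (`cc-zero`, `cc-by`, `other-closed`, and so on) are illustrative assumptions and should be checked against the license list of the target DataHub/CKAN instance.

```python
from typing import Dict, Optional, Tuple

# Assumed id for the "not-open" fallback; verify against the target instance.
NOT_OPEN = "other-closed"

# Hypothetical table: normalized source license -> (DataHub license id, is_open).
# "is_open" means conformant with the OpenDefinition.org criteria.
LICENSE_MAP: Dict[str, Tuple[str, bool]] = {
    "cc0": ("cc-zero", True),
    "cc-by": ("cc-by", True),
    "cc-by-sa": ("cc-by-sa", True),
    "cc-by-nc": ("cc-nc", False),    # non-commercial terms are not "open"
    "cc-by-nd": (NOT_OPEN, False),   # no-derivatives terms are not "open"
}


def datahub_license(source_license: Optional[str]) -> Tuple[str, bool]:
    """Return (license_id, is_open), defaulting to not-open when unsure."""
    if not source_license:
        # Unknown license: never claim openness.
        return NOT_OPEN, False
    key = source_license.strip().lower()
    # Anything not explicitly listed is treated as not-open.
    return LICENSE_MAP.get(key, (NOT_OPEN, False))
```

The point of the design is that nothing is reported as open unless it is explicitly listed, so a new or unrecognised source license cannot silently be uploaded as CC-BY.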

micheldumontier commented 9 years ago

In general, Bio2RDF's datasets are meant to be released as CC-BY, but are also subject to the restrictions of the source data. Is there a way to articulate that?

ansell commented 9 years ago

There are two issues which are closely related that need to be addressed.

Does Bio2RDF operate in places that allow copyright on datasets? If so, why would Bio2RDF bother to declare the datasets as CC-BY instead of CC0, as CC-BY will not have any effect on a dataset republished in the US anyway?

If Bio2RDF operates in places that allow copyright on datasets then the original copyright cannot be disregarded and CC-BY substituted instead, so in both cases something needs to change.

micheldumontier commented 9 years ago

  1. I don't think anybody has shown in court that CC-BY is not a valid license on data.
  2. Our data is made available for worldwide use; therefore it is subject to the laws of where the users live.
  3. In some cases, the original data has a public license (e.g. NCBI data), and so CC-BY is stronger and can hold. In other cases, the original data is more restrictive than CC-BY, but does not have a share-alike clause, so it's possible, provided we meet those restrictions. In other cases, we can imagine that the data is fully restrictive, and the Bio2RDF version may not be redistributed (by Bio2RDF, never mind downstream users).

ansell commented 9 years ago

Anyway, feel free to push CC-BY onto everything, but try to at least push the original distributors into the attribution clause. No idea how relevant it will be for everyone given the wide variation on the ability to copyright databases, and CC0 would be much simpler (http://pantonprinciples.org/), but if you have chosen CC-BY then try to make it work for everyone.

Not sure what you mean by "does not have a share-alike clause, so it's possible". If they say CC-BY-ND, then you can't ethically legitimise republishing it under CC-BY just because databases can't be copyrighted in the US.

micheldumontier commented 9 years ago

The only reason we use CC-BY is to encourage people to cite or at least acknowledge Bio2RDF.