tdwg / esp

Earth Sciences and Paleobiology Interest Group

What is an appropriate best practice for licensing digital resources about paleontological specimens? #6

Open dennereed opened 7 years ago

dennereed commented 7 years ago

dcterms:license recommends an official license. What is appropriate for academic and/or commercial paleontology?

markuhen commented 7 years ago

Denné,

Paleobiology Database and Neotoma use the CC BY 4.0 license.

Thanks,

Mark

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/tdwg/paleo/issues/6, or mute the thread https://github.com/notifications/unsubscribe-auth/ASpsTP8OkJYAu4BUj9-3jd8GM5H0yzp5ks5rjgLpgaJpZM4MWP34.

-- Mark D. Uhen Associate Professor & Associate Chair George Mason University AOES Geology MSN 6E2 Fairfax, VA 22030 Phone: 703-993-5264 Fax: 703-993-3535

dennereed commented 7 years ago

Thanks Mark! What I think would be really helpful here in the docs is a list of commonly used licenses with links to brief, human-readable descriptions that would help users choose what is most appropriate for them, such as this one for CC BY 4.0: https://creativecommons.org/licenses/by/4.0/. Or better, a summary table of commonly used and recommended licenses. Let's see if anyone else can comment on or recommend licenses for academic data and then compile the recommendations. I'm going to add that task to the project list.

stanblum commented 7 years ago

Be aware that there is a growing recognition in science generally, not just biodiversity science, that CC BY can be impractical in studies based on a large number of data providers (e.g., > 100). In addition, there is a position (in the United States) that simple facts are not creative works and therefore not copyrightable. (One can argue whether basic specimen information is factual or creative ;-)

It has also been suggested that copyright and licensing might be the wrong mechanisms to support the social and scientific norms of good science. In other words, good science practice should include enough information to support reproducibility, including a path back to an archived copy of the source data. Original specimen or observation data should include a "path" to the original basis of record. Obviously, these norms are still evolving as science adapts to the digital world.

For these reasons, CC0 (CC zero or a commitment to the public domain) is viewed as more practical and more consistent with the goals of open science.

dennereed commented 7 years ago

Thanks Stan! Is the concern with CC BY for meta-analysis that it is impractical to acknowledge all the data providers? Simple summary of CC0: https://creativecommons.org/publicdomain/zero/1.0/.

stanblum commented 7 years ago

I'm not entirely sure what you mean by meta analysis (data versus meta-data is dependent on perspective, so to speak). But take the example of predicting a species distribution from occurrence data based on specimens. For a species that has representatives in hundreds of collections, you could pull data from a very large number of providers by downloading data from GBIF. If you do something based on multiple species, then the problem grows larger.

Also, here is another link about CC0, which I found on the GBIF web site: https://creativecommons.org/share-your-work/public-domain/cc0/

(I thought GBIF had gone so far as to begin recommending that providers use CC0, but I couldn't find a statement to that effect.)

pmergen commented 7 years ago

Hi all,

If you go to the GBIF BID website and its description of eligibility, there is a bit of information. They leave the choice between CC BY and CC0. Consistent and automated citation is indeed difficult, not only because there are many authors for many data, but also because of multi-author, cascading authorship for the same specimen or data.

It is a sensitive subject, as scientists and sponsors want to be cited and acknowledged.

Look at my presentation from 2014 in Stockholm. The idea is to define controlled vocabularies to allow automated citations and terms of use. Convincing everyone to move to CC0 is the practical approach, but many refuse because they want to be cited, some only want non-profit reuse, and some want bilateral agreements with the private sector to make money ...

A very efficient way, when providing data to GBIF, is to couple it with a peer-reviewed data paper. The data can then be CC0, but the authors of the data paper are cited, as is best practice in scientific publications. Check, for example, the Biodiversity Data Journal from Pensoft, but there are also others out there.

With my best wishes

Pat

stanblum commented 7 years ago

If "digital resources" includes images and 3-D scans, I don't think there are standard recommendations for licensing. I know a lot of institutions think they can recoup costs by licensing images or developing products from content that might be directly "consumable" by the public.

pmergen commented 7 years ago

Dear all

Yes, some adopt the model of CC0 for metadata but other, more restrictive licences for the images or movies, or for the data themselves.

For data versus metadata, there are different interpretations of which is which, as Stan explained.

If you go to the EU website, they have some data models and an IPR helpdesk. Some items fall automatically into the public domain after a certain time, but legislation differs, and it also depends on what the item is (data, work of art, etc.).

At the AfricaMuseum we have a geology department. Licensing is less critical for paleontology, but very much so for rocks, minerals, and maps indicating mines.

If the private sector is involved, you also have to deal with patents, exclusivity, etc.

Stan is right: you may first have to define, in context, what you mean by digital resources.

If you look at the existing ABCD standard with the EFG extension schema from TDWG, you will find a concept for IPR, terms of use, citations, and copyrights which is quite good. This leaves the data provider the choice and flexibility to indicate the licensing model and related information for a whole collection, but also at unit level.

The role of TDWG here might be to offer our users a standard to express their licensing and terms of use in a flexible way, with controlled vocabularies compatible with the different legislations and best practices, while encouraging open sharing.

It is the role of GBIF, one of the major users of TDWG standards but not the only one, to define within its activities which licensing model it recommends.

What do you think?

Pat

DimEvil commented 7 years ago

If you want the data to be used, then CC0 or CC-BY.

This one is still valid: http://www.canadensys.net/2012/why-we-should-publish-our-data-under-cc0

CC-BY 4.0 solves the stacking problem a little. For example, you may use the citation provided by GBIF, which gives you a link to all the different datasets used in a 'compiled' dataset.

There is no problem with licensing the data under CC0 and licensing the images and/or metadata under CC-BY (or any other license).

Anyway, if you want to publish the resources to GBIF, you must choose one of these CC licenses (CC0, CC-BY, or CC-BY-NC), where CC-BY-NC would probably exclude all commercial use of the data.
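
As a rough illustration of the point above, here is a minimal Python sketch that checks whether a record's dcterms:license value is one of the three CC options mentioned. The function names, the record layout, and the choice of the Creative Commons deed URLs as keys are assumptions made for this example; this is not GBIF or Darwin Core tooling, and the exact URL form a publisher expects may differ.

```python
# Hypothetical lookup table: the three CC licenses named in this thread,
# keyed by their Creative Commons deed URLs (an assumption for illustration).
ACCEPTED_LICENSES = {
    "https://creativecommons.org/publicdomain/zero/1.0/": "CC0 1.0",
    "https://creativecommons.org/licenses/by/4.0/": "CC BY 4.0",
    "https://creativecommons.org/licenses/by-nc/4.0/": "CC BY-NC 4.0",
}


def normalize_license_url(url: str) -> str:
    """Reduce common variants (http, /legalcode, missing slash) to one form."""
    url = url.strip().lower()
    if url.startswith("http://"):
        url = "https://" + url[len("http://"):]
    if url.endswith("/legalcode"):
        url = url[: -len("legalcode")]
    if not url.endswith("/"):
        url += "/"
    return url


def license_label(record: dict):
    """Return a short label for the record's license, or None if not listed."""
    value = record.get("dcterms:license", "")
    return ACCEPTED_LICENSES.get(normalize_license_url(value))


if __name__ == "__main__":
    # Data under CC0 and an image under CC BY, as described above.
    occurrence = {"dcterms:license": "https://creativecommons.org/publicdomain/zero/1.0/"}
    image = {"dcterms:license": "http://creativecommons.org/licenses/by/4.0/legalcode"}
    print(license_label(occurrence), license_label(image))
```

A check like this only tells you whether the stated license is one of the three listed; it says nothing about whether that license is the right choice for a given resource.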

Chrs, Dimi

dennereed commented 7 years ago

I created an FAQ wiki page to cover licensing. I think it's important to capture the discussion in this thread in the documentation, but this is also more of a general Darwin Core issue than a paleo issue. Does anyone know if this issue has been addressed in the QA threads?