CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/
29.14k stars 18.44k forks

Suggest license file #185

Open willhaslett opened 4 years ago

willhaslett commented 4 years ago

Thanks for this work. Have you considered adding an OS license file? Seems that there's money to be made with your data stream, and I would want to have some say in how.

willhaslett commented 4 years ago

My apologies for the above. My point was that open source software without a license is not actually open source (true), but I hadn't looked closely enough to see that there is no software in your repo, just data. Many thanks for this work and for sharing the data.

Bost commented 4 years ago

"source software without a license is not actually open source"

Good point!

"there is no software in your repo, just data."

A proper license wouldn't hurt. You know, just in case... Off the top of my head I'd guess https://en.wikipedia.org/wiki/Creative_Commons_license but(!) I'm not a lawyer.

willhaslett commented 4 years ago

Yes. IANAL either, but GitHub recommends a CC license for data-only repos: https://choosealicense.com/non-software/

KevOrr commented 4 years ago

I wonder what the sentiments are of those who downvoted the original comment. It'd be nice to actually hear the dissenting opinions, because all one can glean from this is that maybe 7+ people just don't like the idea of attaching a license to this repo.

dreyco676 commented 4 years ago

I'd agree with adding a license. Right now the data usage verbiage at the bottom of the readme is vague and a proper license would clear that up.

I.e., it seems like you can't resell, repackage, or use the data in a commercial product, but it's not clear whether it could be used internally in a corporate setting to understand how Supply Chain vendors may be impacted.

hlapp commented 4 years ago

This is a fantastic data collection that could be very useful for many pursuits, both scientific and commercial, both academic and in industry. This is a time of national, in fact global, crisis. Data that is not FAIR and Open has had a long history of impeding the progress of science. Does this really need to be continued here? Why not simply release the data into the public domain wherever possible?

The terms of reuse state:

"This GitHub repo and its contents herein, including all data, mapping, and analysis, copyright 2020 Johns Hopkins University, all rights reserved, is provided to the public strictly for educational and academic research purposes."

Assuming the data aren’t made up in some creative way, i.e., are facts of nature, in a jurisdiction where sweat of the brow does not count for IP eligibility, how can these be legally copyrighted?

A license asserts and retains copyright, hence copyright eligibility must exist in the first place, or otherwise a license just muddies rather than clears the legal waters.

I'd love to contribute to this effort, and there is a virtual biohackathon underway that I am sure would too. But I can't do so if the result is then taken into copyright by Johns Hopkins rather than available to the world at large to use.

cipriancraciun commented 4 years ago

The same topic is being discussed in the NY Times dataset repository:

willhaslett commented 4 years ago

As the OP and as the maintainer of a downstream tool that is under the MIT license, https://github.com/willhaslett/covid-19-growth, it would be great if this repo were free-as-in-freedom. That would remove any ambiguity regarding the copyright status of downstream tools.

penyuan commented 3 years ago

The current README states "This data set is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0)" license.

Can the maintainers of this repository please add an actual LICENSE file to the repository with CC BY 4.0? Because there is no LICENSE file, CC BY 4.0 is not showing up in the metadata for this repository. GitHub makes it easy to add the license. Having the LICENSE file formalizes the license (which eases downstream applications) and aids machine-readability.