cognoma / cancer-data

TCGA data acquisition and processing for Project Cognoma

Licensing of the Xena TCGA Pan-Cancer Data #3

Closed dhimmel closed 7 years ago

dhimmel commented 7 years ago

Currently, the licensing of the TCGA Pan-Cancer Data available via UCSC's Xena Browser is unclear. I've messaged their listserv and will update this issue with any progress.

dhimmel commented 7 years ago

According to Mary Goldman from the UCSC Xena Browser team:

There is no license for our data, only our software.

They seem to intend for other people to reuse their datasets with attribution, although they haven't actually provided a license allowing people to do so. It seems that part of the issue may be uncertainty over the licensing of the TCGA data they incorporate.

Therefore I emailed tcga@mail.nih.gov about the copyright status and licensing of the TCGA Open Access data tier. We'll revisit this issue after hearing back from TCGA.

dhimmel commented 7 years ago

TCGA Open Access Data Tier is Public Domain

Amy Blum, a Communications Manager at the NCI, responded to my inquiry regarding the TCGA's copyright status. According to her, the Open Access Data Tier is in the public domain. This is great news as it means anyone can use this data however they want, without having to scrutinize its licensing.

dhimmel commented 7 years ago

I'm updating this discussion with a message @maryjgoldman posted to the Xena Browser Mailing List:

The person who is in charge of our licenses is out this week but will be back on Monday. I'm guessing this is no problem and am looking at this license: http://opendatacommons.org/licenses/by/summary/. I will make sure this is on his desk on Monday morning for his review.

@maryjgoldman just checking in. Any progress on the open licensing front?

maryjgoldman commented 7 years ago

@jingchunzhu I believe we agreed to put our processed TCGA data in the public domain as well. This does not apply to our other datasets from other data providers.

jingchunzhu commented 7 years ago

I think the cognoma team contacted TCGA to ask if the public tier data is considered public domain data, and TCGA said yes.

Since the Xena team does not place any restrictions on the use of our processed version of the data, the processed TCGA data automatically becomes public domain.

For other datasets, we don't have the personnel to contact each source explicitly, so we can't simply say that Xena-processed data is automatically public domain.

Jing

dhimmel commented 7 years ago

@maryjgoldman and @jingchunzhu thanks so much for clearing this up. I'll treat the Xena Browser TCGA data as public domain.

You may want to consider adding a field to your JSON metadata for TCGA-derived datasets, like:

"license": "https://creativecommons.org/publicdomain/zero/1.0/"

This would provide exceptional clarity at a dataset-by-dataset level!
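
As a rough illustration of how such a field could be written and then checked by downstream users, here is a minimal Python sketch. Only the "license" key and the CC0 URL come from the suggestion above; the record's other fields ("name", "cohort") and the helper names `add_license` and `is_public_domain` are hypothetical and do not reflect Xena's actual metadata schema.

```python
import json

# CC0 URL from the suggestion above
CC0_URL = "https://creativecommons.org/publicdomain/zero/1.0/"


def add_license(metadata: dict, license_url: str = CC0_URL) -> dict:
    """Return a copy of a dataset's metadata with a "license" field set."""
    updated = dict(metadata)  # shallow copy so the input record is untouched
    updated["license"] = license_url
    return updated


def is_public_domain(metadata: dict) -> bool:
    """Check whether a metadata record declares the CC0 public-domain dedication."""
    return metadata.get("license") == CC0_URL


# Hypothetical metadata record; the field names are assumptions for illustration.
example = {"name": "example-tcga-dataset", "cohort": "TCGA Pan-Cancer"}
tagged = add_license(example)

# Serialize the tagged record the way it might appear in a JSON metadata file.
print(json.dumps(tagged, indent=2))
```

With a field like this in place, a consumer could filter datasets by calling `is_public_domain` on each record, rather than relying on a blanket statement about all Xena data.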

Cheers, Daniel!