reusabledata / reusabledata

The home repository for the (Re)usable Data Project.
http://reusabledata.org
BSD 3-Clause "New" or "Revised" License

Add GigaDB as a resource #152

Open only1chunts opened 5 years ago

only1chunts commented 5 years ago

Please can GigaDB (gigadb.org) be evaluated? http://gigadb.org/site/about http://gigadb.org/site/term

kltm commented 5 years ago

On first pass, I think that some of the same issues mentioned here may come up: https://github.com/reusabledata/reusabledata/issues/91

There may also be at least two distinct data sets going on here:

Interesting section: "While we will retain our commitment to Open Science, we reserve the right to update these Terms of Use at any time. When alterations are inevitable, we will attempt to give reasonable notice of any changes by placing a notice on our website, but you may wish to check each time you use the website. The date of the most recent revision will appear on this, Terms of Use page."

In disclaimer; can these be isolated? "Some of the data provided from external sources may be subject to third-party constraints. Users are solely responsible for establishing the nature of and complying with any such intellectual property restrictions."

only1chunts commented 5 years ago

We can reword the section on BGI data, as that is also released CC0. The section on reserving the right to change our T&Cs is just to allow us to make changes to them; the commitment to open data will not change. All external stuff is external and therefore cannot fall under our license policy. Hope that helps clarify things?

kltm commented 5 years ago

@only1chunts Thank you for the early feedback. Some of the statements I made above were only bookmarks for myself--my apologies for dragging you into an early discussion. Changeable T&Cs are perfectly normal and would not affect evaluation in many circumstances.

I think the main issues that we are having are: 1) whether something like GigaDB counts as a "coherent" data set that a researcher might use for a particular piece of research, rather than a repository for such research (similar issues with datadryad and gbif), and 2) whether it is in our scope to evaluate such resources. As it stands, while the licensing may be outstanding, there is little in the way of "coherent" access to data classes. For example, something like the Monarch Initiative is an aggregation of many upstream resources, plus some new creative content, creating a "coherent" resource with related downloads and a single API. Something like GigaDB seems to be a slightly different creature, more akin to a very nice library than an encyclopedia or a book. Because of these differences, when applying the rubric to a library-type resource, the results are sometimes a little off. One option would be to say that we don't do that; another would be to either tweak the rubric or create a new path for library-type resources. These are still under discussion and we appreciate the conversation.

only1chunts commented 5 years ago

It's not entirely clear to me what the goal of reusabledata.org is, so I'm looking forward to learning more about it, and how we can interact, at the Biocuration2019 meeting in Cambridge. Thanks.

kltm commented 5 years ago

Broadly, to create a metric for "reusability" and apply it to biological/biomedical resources. In the exploration of this space, once we had our initial rule set, we found that some things fit poorly, like "platforms" and "archives", as our initial rule set was tuned a bit more to certain scenarios. It's not to say that these other resources are any less useful or "reusable", but that either our rules or scope need to be tweaked to ignore or support them. Anyways, I'm looking forward to answering any and all questions in Cambridge.