USGCRP / gcis-ontology

Ontology for the Global Change Information System

Definition of a "health dataset" #139

Closed justgo129 closed 8 years ago

justgo129 commented 9 years ago

Revisit definition of gcis:Dataset to encompass health data in addition to those from the NCA3.

Spun off from #84.

xgmachina commented 8 years ago

@justgo129 Do you mean updating the class gcis:Dataset itself? I think instances of a 'health dataset' could instead be annotated with one or a few keywords relevant to health issues.

justgo129 commented 8 years ago

Yes, @xgmachina. It would be nice to have a definition that works for all sorts of scientific datasets, whether they be health, earth observations, or a mix of both. @rewolfe?

rewolfe commented 8 years ago

This appears to be the current definition:

gcis:Dataset a owl:Class ;
    rdfs:label "Dataset" ;
    rdfs:comment "Any organized collection of data or information that has a common theme. Examples include lists, tables, and databases, etc." ;
    rdfs:subClassOf dctype:Dataset , prov:Entity .

This seems broad enough to include health data, or am I missing something?
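
For illustration only, here is a rough sketch of how a health dataset instance might sit under the existing class, using the keyword annotation suggested earlier in the thread. The gcis: prefix IRI, the example.org instance, its title, and the choice of dcat:keyword as the annotation property are all assumptions made for this sketch, not something taken from the ontology or from GCIS data.

```turtle
# Hypothetical example: the gcis: prefix IRI and the ex: instance are assumptions.
@prefix gcis:    <http://data.globalchange.gov/gcis.owl#> .
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/dataset/> .

# A health dataset typed with the existing, unmodified class.
ex:heat-related-illness-visits a gcis:Dataset ;
    dcterms:title "Heat-related illness emergency department visits (hypothetical)" ;
    # Health relevance is carried by keywords rather than by a new subclass.
    dcat:keyword "health" , "heat stress" .
```

If something like this is acceptable, the class definition itself would not need to change; the health aspect would live entirely in the instance-level annotations.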

zednis commented 8 years ago

:+1: I think the current definition is sufficiently broad

justgo129 commented 8 years ago

Fine with me. @rewolfe, should we remove "of data" from the definition, given that we are defining "dataset"?

zednis commented 8 years ago

I would say no to removing "of data" from the definition.

1) It would be odd to define a dataset and then explicitly go out of our way to not mention data in the definition.
2) Precedent exists. This is the definition of dcat:Dataset: "A collection of data, published or curated by a single agent, and available for access or download in one or more formats."

justgo129 commented 8 years ago

Works for me. Closed #139 with no action taken.

rewolfe commented 8 years ago

[Well, I have been trying to respond to this in between other comments, so here's my response for the record.]

The Wikipedia definition of dataset (https://en.wikipedia.org/wiki/Data_set) includes the word data, so I'm okay with including it in the definition. One could argue that "or information" is redundant, since data (https://en.wikipedia.org/wiki/Data) is "information or knowledge [] represented or coded in some form suitable for better usage or processing". But I'm okay with including "of data or information", since we are not just using data in the "number" sense but also in the "information" sense. For instance, tables can be rows/columns of numbers, but may also be rows/columns of text (e.g. mitigation strategies for specific regions).

+1

On Thu, Sep 10, 2015 at 9:49 AM, justgo129 notifications@github.com wrote:

Closed #139 https://github.com/USGCRP/gcis-ontology/issues/139.


Robert Wolfe, NASA GSFC @ USGCRP, o: 202-419-3470, m: 301-257-6966