tetherless-world / dco-ontology

Deep Carbon Observatory Ontology
Creative Commons Zero v1.0 Universal

Updating data type concepts and properties #56

Closed mrpatrickwest closed 8 years ago

mrpatrickwest commented 8 years ago

Changed Parameter to DataTypeParameter along with associated properties. Made the data type properties part of the overview property group. Ranked the data type properties for display in the overview tab.

mrpatrickwest commented 8 years ago

@xgmachina Are these the changes for data type?

mrpatrickwest commented 8 years ago

@xgmachina @zednis @olyerickson @am-e

Pull request for the datatype ontology changes. Once we agree on these we can make the changes in VIVO.

mrpatrickwest commented 8 years ago

I am pretty sure that I made all the changes Marshall and I had talked about in Troy in September.

xgmachina commented 8 years ago

Thanks @mrpatrickwest. I took a read and all the proposed changes are made. We need to make a few more updates to dco:DataType and dco:DataTypeParameter: remove the 'a dco:Object' and 'a prov:Entity' assertions from the definitions of the two classes, because the assertions rdfs:subClassOf dco:Object and rdfs:subClassOf prov:Entity are already there.

mrpatrickwest commented 8 years ago

Once we're okay with this change we can make the change on udco first, and then production

xgmachina commented 8 years ago

Yeah, I will make a separate document covering all the classes and properties relevant to data type, and send it to Ahmed for reference.

xgmachina commented 8 years ago

I think I am now okay with all the changes being merged.

zednis commented 8 years ago

@mrpatrickwest since we will be loading dco.ttl and dco-vivo.ttl through the filegraph soon (I should be able to work on the tickets this week), do you want to hold off on making any changes to UDCO or PROD VIVO via the admin interface and instead push these changes through the filegraph load?

mrpatrickwest commented 8 years ago

@zednis Any current instance data needs to be updated. So if we make the changes in VIVO first all the instance data will be taken care of for us.

zednis commented 8 years ago

Ah, so what is the plan then once the ontologies are loaded through filegraph and there is a change in the loaded ontology?

edit: Will we always have to make changes via the admin interface to fix the instance data?

xgmachina commented 8 years ago

The instance data that needs to be updated: 1) instances of 'dco:Parameter' need to become instances of 'dco:DataTypeParameter'; 2) uses of 'dco:hasParameter' need to become uses of 'dco:hasDataTypeParameter'.
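
A minimal sketch of those two renames, assuming the instance data is available as plain (subject, predicate, object) tuples of prefixed names. The `ex:` resources and the `migrate` helper are hypothetical and only illustrate the idea; this is not how VIVO or the filegraph load performs the update.

```python
# Triples are plain (subject, predicate, object) tuples of prefixed names.
# The ex: resources below are made-up illustration data.
RDF_TYPE = "rdf:type"

CLASS_RENAMES = {"dco:Parameter": "dco:DataTypeParameter"}
PROPERTY_RENAMES = {"dco:hasParameter": "dco:hasDataTypeParameter"}


def migrate(triples):
    """Return a new list of triples with the old class and property names replaced."""
    migrated = []
    for s, p, o in triples:
        # Rename the property wherever it is used as a predicate.
        p = PROPERTY_RENAMES.get(p, p)
        # Rename the class only where it appears as the object of rdf:type.
        if p == RDF_TYPE:
            o = CLASS_RENAMES.get(o, o)
        migrated.append((s, p, o))
    return migrated


example = [
    ("ex:temperature", "rdf:type", "dco:Parameter"),
    ("ex:thermoType", "dco:hasParameter", "ex:temperature"),
    ("ex:thermoType", "rdfs:label", "Thermodynamics of chemicals and minerals"),
]
migrated = migrate(example)
```

Triples that use neither the old class nor the old property pass through unchanged, so the same pass could be run over the whole instance graph.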

mrpatrickwest commented 8 years ago

@zednis that is a very good question.

xgmachina commented 8 years ago

@mrpatrickwest and @zednis, another topic for discussion: we asserted the domain of 'dco:hasDataType' to be 'vivo:Dataset'. I think I did that simply because I wanted this property to show up in the VIVO profile of a dataset. But according to the extensive discussion of the data type work, we want to use data types to annotate not only datasets but also other entities, such as images and videos. Should I remove the domain assertion for 'dco:hasDataType', and also update 'dco:dataTypeForDataset'?

zednis commented 8 years ago

With VIVO's usage of domain to determine whether to show a property or not, we may want to consider when the best time is to change the domain. I would not be opposed to changing the domain to something broader such as vivo:InformationResource, but it may (and likely will) impact our UI. We may need custom forms at a more developed state before changing the domain.
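
To make the UI concern concrete, here is a rough sketch of domain-based display under a simplified model: a property is shown on a profile only when the resource's class is the property's domain or a subclass of it. The class map, the `ex:Image` class, and the `is_shown` helper are assumptions for illustration and do not reproduce VIVO's actual display logic.

```python
# Assumed, simplified subclass map: each class points to its direct superclasses.
# ex:Image is a hypothetical class used only for this illustration.
SUPERCLASSES = {
    "vivo:Dataset": ["vivo:InformationResource"],
    "ex:Image": ["vivo:InformationResource"],
    "vivo:InformationResource": [],
}


def ancestors(cls):
    """Return cls plus every superclass reachable from it."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUPERCLASSES.get(current, []))
    return seen


def is_shown(property_domain, resource_class):
    """Show the property when the resource's class falls under the property's domain."""
    return property_domain in ancestors(resource_class)


# Domain vivo:Dataset: shown on datasets, hidden on images.
shown_on_dataset = is_shown("vivo:Dataset", "vivo:Dataset")
shown_on_image = is_shown("vivo:Dataset", "ex:Image")
# Broader domain: shown on every information resource, which is the UI impact to plan for.
shown_on_image_broad = is_shown("vivo:InformationResource", "ex:Image")
```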

mrpatrickwest commented 8 years ago

InformationResource is also a superclass of publications / documents. Does it make sense to have data types for publications as well?

And if so do we want to call it a ResourceType and a ResourceParameter?

I'd also like to see an example of a DataType for a video or image.

zednis commented 8 years ago

The definition of a datatype we have provided is: A data type is the representation of particular qualities or features that a group of datasets share.

If you replace 'datasets' with 'information resources' in the definition (just an example), then I think you could have datatypes for images, videos, and other resources.

Datatypes are really just a way to provide a classifying value to a resource without modifying the OWL/RDFS class hierarchy; this makes it easier for users to contribute their own types.
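
A small sketch of that point, assuming triples as plain tuples of prefixed names. The `ex:` resources and helper names are hypothetical. Attaching a user-contributed data type only adds instance triples, so the set of classes in use stays the same.

```python
# Made-up illustration data: triples as (subject, predicate, object) tuples.
triples = {
    ("ex:dataset1", "rdf:type", "vivo:Dataset"),
    ("ex:image7", "rdf:type", "vivo:InformationResource"),
    ("ex:thermo", "rdf:type", "dco:DataType"),
    ("ex:dataset1", "dco:hasDataType", "ex:thermo"),
}


def classes_used(graph):
    """Return every class that appears as the object of rdf:type."""
    return {o for _, p, o in graph if p == "rdf:type"}


def add_user_data_type(graph, resource, data_type):
    """Return a new triple set with a user-contributed data type attached to a resource."""
    return graph | {
        (data_type, "rdf:type", "dco:DataType"),
        (resource, "dco:hasDataType", data_type),
    }


def resources_with_data_type(graph, data_type):
    """Return the resources annotated with the given data type, sorted by name."""
    return sorted(s for s, p, o in graph if p == "dco:hasDataType" and o == data_type)


updated = add_user_data_type(triples, "ex:image7", "ex:userType")
```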

xgmachina commented 8 years ago

@mrpatrickwest, for images, JPEG, PNG, GIF, or SVG can be examples of data types; for videos, mp4, rm, or mpeg. +1 @zednis, the instances of dco:DataType are used as annotations on information resources, similar to how instances of skos:Concept are used. The VIVO UI restriction caused by the domain and range of properties is really a pain.

zednis commented 8 years ago

@xgmachina shouldn't datatypes go beyond format though? We already have properties for format.

edit: on that note I think basic formats are not good examples of datatypes and not something we should use as examples. RDF is not a good example of a datatype. I think datatype should describe a quality or feature of the resource, not the format that is used for representation of the resource.

For example, in our current datatype browser "Thermodynamics of chemicals and minerals" is what I think of as a datatype. Everything else (RDF, KML, JSON-LD, etc) is not and I think should be removed.

mrpatrickwest commented 8 years ago

A DataType represents a set of parameters that exist within a Dataset, not necessarily all of them. So someone can come in and search for Datasets that contain a particular DataType knowing that the Dataset will contain at least that set of parameters.

So can this exist for an image, or a video?
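
Under that reading, the search described above is a subset check: a dataset matches a DataType when its parameters include at least the DataType's parameters. A minimal sketch, with made-up dataset and parameter names:

```python
def datasets_with_data_type(datasets, data_type_parameters):
    """Return names of datasets whose parameters include every parameter of the data type."""
    required = set(data_type_parameters)
    return sorted(
        name for name, parameters in datasets.items() if required <= set(parameters)
    )


# Hypothetical datasets and the parameters each one contains.
datasets = {
    "ex:datasetA": {"temperature", "pressure", "entropy"},
    "ex:datasetB": {"temperature"},
    "ex:datasetC": {"pressure", "temperature"},
}

matches = datasets_with_data_type(datasets, {"temperature", "pressure"})
```

A dataset may carry extra parameters (like `entropy` here) and still match, which is the "not necessarily all of them" part of the definition.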

zednis commented 8 years ago

@mrpatrickwest I am not sure I agree with that definition of a datatype; the definition we currently have specifies shared 'qualities or features' but does not specify parameters.

mrpatrickwest commented 8 years ago

@zednis @xgmachina That is the definition that Marshall and I discussed last time I was in Troy. So we need to decide what the definition is before we can actually represent it in the ontology.

If it is 'qualities or features', then do we need to change the property and class from Parameter to Feature? Is a Parameter a Feature?

mrpatrickwest commented 8 years ago

According to the ontology definition for DataType:

A data type is often treated as a purely syntactic label associated with a variable when it is declared, such as integer, float, boolean, character, and string, etc. Such definitions of 'data type' do not give any semantic meaning to the data types. In the context of this ontology, a data type can include more meaning beyond the syntax, such as who created the data type, the source standard that the data type derives from, the operations that can be done on datasets of that data type, and typical scientific domains, software programs and/or instruments that use the data type, etc. The intent is that any humans or machines that when they encounter that data type can quickly 'understand' or be in a situation to at least 'process' details within the dataset without even downloading that dataset.

And that definition doesn't really define what a DataType is; it describes what properties have been added to give a DataType more meaning.

@xgmachina What documents can I look at to see the definition of a DataType?

In this document: http://tw.rpi.edu/media/latest/RDA_Adoption_Proposal_RPITWC2015_public.pdf the only "definition" that I see is that the "DTR explores ways to enable data creators to record and make known the implicit assumptions of a dataset."

zednis commented 8 years ago

@xgmachina which definitions included in our ontology are from the RDA materials and which (if any) are our own formulation? Are there any examples of data types from the RDA literature?

xgmachina commented 8 years ago

The most recent RDA DTR report is accessible at: https://rd-alliance.org/group/data-type-registries-wg/post/draft-output-document.html I did not see a specific definition of data type in that document, although the point is stated that people want to add domain specific meaning to a data type record. I tried to collect a few definitions when writing that data type paper draft:

mrpatrickwest commented 8 years ago

Then I suggest that we change DataTypeParameter to DataTypeFeature, and the property hasDataTypeParameter to hasDataTypeFeature.

xgmachina commented 8 years ago

Feature has a broader meaning than parameter, but I do not suggest using it to replace parameter. The whole schema, with all the properties for data type, constitutes a description of the features of that data type. Parameter is something we specifically want for the 'spreadsheet' style dataset.

zednis commented 8 years ago

Would "hurricane wind fields" be a valid datatype?

xgmachina commented 8 years ago

@zednis It can be.

xgmachina commented 8 years ago

Are we ok to merge this branch to master now?

mrpatrickwest commented 8 years ago

@zednis what do you think? Even though much of this might change, I say we go ahead and merge the changes and work in the other branch.

zednis commented 8 years ago

:+1: with the idea that we move all these into their own namespace soon. Moving these into a datatype-specific namespace will resolve most of my issues with general-sounding properties having very specific usages (e.g. dco:hasUnit being only for datatype parameters).

mrpatrickwest commented 8 years ago

:+1: :shipit: