USGCRP / gcis-ontology

Ontology for the Global Change Information System

E8 The use of http://purl.org/spar/cito/cites (cito:cites) #81

Closed xgmachina closed 9 years ago

xgmachina commented 9 years ago

We may not be using http://purl.org/spar/cito/cites (cito:cites) correctly, e.g. https://data.globalchange.gov/report/nca3/chapter/our-changing-climate/figure/projected-change-in-average-annual-precipitation.thtml Specifically, its use in the phrase “When citing this figure, please reference NOAA NCDC / CICS-NC.”

https://data.globalchange.gov/report/nca3/chapter/our-changing-climate/figure/variation-of-storm-frequency-and-intensity-during-the-cold-season-november--march

Should we therefore leave cito:cites as is for this case? Or is gcis:Cites more appropriate here? https://data.globalchange.gov/gcis.owl#d4e67

xgmachina commented 9 years ago

Property: cito:cites

URI: http://purl.org/spar/cito/cites

cites - A statement that the citing entity cites the cited entity, either directly and explicitly (as in the reference list of a journal article), indirectly (e.g. by citing a more recent paper by the same group on the same topic), or implicitly (e.g. as in artistic quotations or parodies, or in cases of plagiarism).

sub-property-of: http://purl.org/swan/2.0/discourse-relationships/refersTo

xgmachina commented 9 years ago

In GCIS ontology:

gcis:cites a owl:ObjectProperty ;
    rdfs:label "Cites" ;
    rdfs:comment "A person or a publication mentions another publication or person as a reason or an example, or in order to support the statement of itself." ;
    rdfs:subPropertyOf bibo:cites ; 
    rdfs:subPropertyOf dcterms:references .

In bibo ontology (See: http://vocab.ox.ac.uk/cito):

<owl:ObjectProperty rdf:about="cites">
<rdfs:label xml:lang="en">cites</rdfs:label>
<rdfs:isDefinedBy rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">http://purl.org/ontology/bibo/</rdfs:isDefinedBy>
<rdfs:comment xml:lang="en">Relates a document to another document that is cited
by the first document as reference, comment, review, quotation or for
another purpose.</rdfs:comment>
<ns:term_status>unstable</ns:term_status>
<rdfs:domain rdf:resource="Document"/>
<rdfs:range rdf:resource="Document"/>
<rdfs:subPropertyOf rdf:resource="http://purl.org/dc/terms/references"/>
</owl:ObjectProperty>

xgmachina commented 9 years ago

Definition of swan:refersTo (See: http://swan-ontology.googlecode.com/svn/trunk/discourse-relationships.owl):

<owl:ObjectProperty rdf:about="refersTo">
    <rdfs:label>refersTo</rdfs:label>
    <rdfs:comment rdf:datatype="&xsd;string">It connects an entity with another entity in an unidirectional way</rdfs:comment>
    <rdfs:subPropertyOf rdf:resource="relatesTo"/>
</owl:ObjectProperty>

And definition of swan:relatesTo:

    <owl:ObjectProperty rdf:about="relatesTo">
        <rdfs:label>relatesTo</rdfs:label>
        <rdfs:comment rdf:datatype="&xsd;string">The most generic relationship: it expresses connection between two resources without specifying the nature of such connection</rdfs:comment>
    </owl:ObjectProperty>

xgmachina commented 9 years ago

Seems cito:cites has a fuzzier meaning than bibo:cites.

zednis commented 9 years ago

I agree that we are using it incorrectly.

  1. It is an object property and we are using it as a datatype property
  2. This is not an instructional comment on how to cite this entity but a statement that this entity cites another entity (publication or person)

I think we should determine what we want to say here (description to user on how to cite this entity or statement that this entity cites another entity) and then we can determine the best course.
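For illustration, the two patterns in question look roughly like this in Turtle. This is only a sketch: the literal text is taken from the phrase quoted at the top of this thread rather than from the actual instance data, and the article IRI is a hypothetical placeholder.

```turtle
@prefix cito: <http://purl.org/spar/cito/> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Pattern A: an object property used with a literal value (the problematic use).
# The literal text here is illustrative only.
<http://data.globalchange.gov/report/nca3/chapter/our-changing-climate/figure/projected-change-in-average-annual-precipitation>
    cito:cites "When citing this figure, please reference NOAA NCDC / CICS-NC."^^xsd:string .

# Pattern B: the same property pointing at a document resource (the intended use).
# <http://example.org/article/some-cited-article> is a hypothetical IRI.
<http://data.globalchange.gov/report/nca3/chapter/our-changing-climate/figure/projected-change-in-average-annual-precipitation>
    cito:cites <http://example.org/article/some-cited-article> .
```

Pattern A reads as an instruction to the user, while pattern B is a statement that one resource cites another; only B matches the definition of cito:cites quoted above.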

justgo129 commented 9 years ago

The latter, a statement that says that one entity cites another: either through a formal in-text citation or through an indirect one like "(source: NCDC)", "source: NASA / GSFC," or "Updated from NASA (2006)", or something even less formal.

justgo129 commented 9 years ago

Examples: "Source: Ma and Zednik (2009)" "(source: NCDC)" "source: NASA / GSFC," "Updated from NASA (2006)" "NOAA NCDC / CICS-NC"

As is evident, sometimes the word "source:" appears in the instance triples, sometimes not.

zednis commented 9 years ago

This is a much more relaxed description than is intended with dcterms:references, which gcis:cites extends.

from http://wiki.dublincore.org/index.php/User_Guide/Publishing_Metadata#dcterms:references

ex:myArticle dcterms:references _:articlesReference .
_:articlesReference dc:creator "Black, Carl" ;
                    dc:contributor "White, Stuart" ;
                    dc:title "Black and White" ;
                    dc:date "1988"^^dcterms:W3CDTF .

The examples you have provided show a much more relaxed and human-readable description of a citation rather than triples that semantically represent the citation itself.

justgo129 commented 9 years ago

Nice example. Is there any predicate that can be used informally? I found dbpedia:cites. http://dbpedia.org/ontology/cites

It's a datatype property. "A document cited by this work. Like OntologyProperty:dct:references, but as a datatype property." @rewolfe or @zednis please confirm that this would work and we can finish out this ticket.

justgo129 commented 9 years ago

Let's go with the more relaxed description. @zednis, do you have any suggestions? Would dbpedia:cites suffice?

zednis commented 9 years ago

1) Most of the time when we use cito:cites we are using it correctly (as an object-property). Only occasionally do we use it with literal values.

I think we should keep using cito:cites in the standard case where it is being used to reference a document resource.

2) When we do use it with a literal value, there is a great amount of variety in what we say. Sometimes we reference a document, sometimes an organization, sometimes a dataset, sometimes what is there is more of a comment...

I am not sure if there is a single property that covers the variety of our literal values (which themselves are edge cases in our instance data)

| figure | citation value (literal) |
| --- | --- |
| http://data.globalchange.gov/report/nca3/chapter/water-resources/figure/projected-changes-in-water-withdrawals | "Brown et al. 2013" |
| http://data.globalchange.gov/report/nca3/chapter/agriculture/figure/crop-yield-response-to-warming-in-californias-central-valley | "adapted from Lee et al. 2011" |
| http://data.globalchange.gov/report/nca3/chapter/human-health/figure/projected-climate-change-worsens-asthma | "Sheffield et al. 2011" |
| http://data.globalchange.gov/report/nca3/chapter/human-health/figure/heavy-downpours-disease | "NOAA NCDC / CICS-NC" |
| http://data.globalchange.gov/report/nca3/chapter/transportation/figure/possible-future-flood-depths-in-mobile-al-with-rising-sea-level | "U.S. Department of Transportation 2012 " |
| http://data.globalchange.gov/report/nca3/chapter/human-health/figure/wildfire-smoke-has-widespread-health-effects | "Moderate Resolution Imaging Spectroradiometer (MODIS) instrument on the Terra satellite, Land Rapid Response Team, NASA/GSFC" |
| http://data.globalchange.gov/report/nca3/chapter/our-changing-climate/figure/separating-human-and-natural-influences-on-climate | "adapted from Huber and Knutti" |
| http://data.globalchange.gov/report/nca3/chapter/land-use-land-cover-change/figure/building-loss-by-fires-at-california-wildlandurban-interfaces | "Stephens et al. 2009" |
| http://data.globalchange.gov/report/nca3/chapter/northwest/figure/observed-shifts-in-streamflow-timing | "adapted from Fritze et al. 2011" |
| http://data.globalchange.gov/report/nca3/chapter/our-changing-climate/figure/ten-indicators-of-a-warming-world | "NOAA NCDC based on data updated from Kennedy et al. 2010" |

As for http://dbpedia.org/ontology/cites - the property definition states that the literal value should represent a document. I am not sure that this property encompasses the variety of literal values we are currently using with cito:cites in the instance data.

justgo129 commented 9 years ago

Thanks, @zednis. Since the above are edge-cases anyway coupled with the lack of a single property, I'm all for leaving this alone without action. If @rewolfe agrees, I'll close #81.

zednis commented 9 years ago

edit - I just ran a count and the balance between objects and literals as the value of cito:cites statements is much closer than I thought.

literal value: 258 IRI: 265

zednis commented 9 years ago

@justgo129 I think we should act to stop using cito:cites with literal values. From recent ticket conversations I think it is time for us to consider leveraging basic reasoning, and we should fix all cases where object properties are used with literal values if we want to do reasoning.

justgo129 commented 9 years ago

I don't disagree. @bduggan @rewolfe @aulenbac I will defer to you going forward on this ticket given the implications.

bduggan commented 9 years ago

Sounds good, no cito:cites. The examples are basically text that should be included in documents which cite the particular resource, i.e. a "recommended citation", in the same category as the text in the "data citation" portion of this page: http://nsidc.org/data/G02135

Sounds like cito:cites is not the right way to convey this.

zednis commented 9 years ago

To be more clear on my recommendations.

1) I think we should continue to use cito:cites with links to document resources in cases where what we are trying to say is that resource A cites document B (i.e. not a recommended citation)
2) I think we should discontinue using cito:cites with literal values
3) We should consider defining our own datatype property (gcis:recommendedCitation?) for recommended citation text so we can ensure the definition is broad enough to cover the variety of uses observed in our instance data

note - When looking at the instance data, it is not always clear to me whether a current use of cito:cites is a recommendation on how to cite the given resource or a statement of fact that the current resource cites a document.

rewolfe commented 9 years ago

+1

On Tue, Aug 11, 2015 at 12:59 PM, Stephan Zednik notifications@github.com wrote:

To be more clear on my recommendations.

1) I think we should continue to use cito:cites with links to document resources in cases where what we are trying to say is that resource A cites document B (e.g. not a recommended citation)
2) I think we should discontinue using cito:cites with literal values
3) We should consider defining our own datatype property (gcis:recommendedCitation?) for recommended citations so we can ensure the definition is broad enough to cover the variety of uses observed in our instance data

Robert Wolfe, NASA GSFC @ USGCRP, o: 202-419-3470, m: 301-257-6966

justgo129 commented 9 years ago

+1 for recommendation 3. I'd honestly be uncomfortable manually changing the literal objects, and that would be outside the scope of GCIS IMO. Can we just do a gcis:recommendedCitation for everything, literals and URIs alike?

zednis commented 9 years ago

As far as I can tell we are currently using the property for two different purposes

1) to say that the subject cites a document resource
2) to provide a suggestion/recommendation on how to cite the current subject

I think we should continue to use cito:cites for case 1.

I think it would be reasonable to define an annotation property for case 2.

gcis:suggestedCitation
  a owl:AnnotationProperty ;
  rdfs:label "suggested citation" .

justgo129 commented 9 years ago

How would one code case 2 within the turtle templates? Would it be something like the following pseudocode?

% if object ! literal, then predicate = ; % } % { else predicate = ; % }

bduggan commented 9 years ago

On Tuesday, August 11, Stephan Zednik wrote:

As far as I can tell we are currently using the property for two different purposes

1) to say that the subject cites a document resource
2) to provide a suggestion/recommendation on how to cite the current subject

I think we should continue to use cito:cites for case 1.

I think it would be reasonable to define an annotation property for case 2.

Yes, sounds good. cito for (GCIS) URIs but not for literals.

Brian

justgo129 commented 9 years ago

+1. Note that on: http://data.globalchange.gov/report/nca3/chapter/water-resources/figure/projected-changes-in-water-withdrawals.thtml

"cito:cites" is invoked twice. I ran a SPARQL query to investigate the use of cito:cites and it came up with many URIs for webpages with this double invocation. Is there a way to limit the SPARQL query to the upper portion of the turtle, i.e. through "a gcis:Figure ."? I'd like to quantify the number of times that a literal is called for the first triple using "cito:cites" rather than the second, since the second will always call a URI - see line 20 of https://github.com/USGCRP/gcis/blob/master/lib/Tuba/files/templates/prov.ttl.tut .

justgo129 commented 9 years ago

I retract that. Line 14 of https://github.com/USGCRP/gcis/blob/master/lib/Tuba/files/templates/figure/object.ttl.tut, which is:

cito:cites "<%= no_tbibs($figure->source_citation) %>"^^xsd:string;

is the problematic use of cito:cites. The second use of cito:cites, line 20 of https://github.com/USGCRP/gcis/blob/master/lib/Tuba/files/templates/prov.ttl.tut will always call a URI.

In short, we'll need to use a different object for the first of the two triples using cito:cites, but maintain its use in the second. Looping, etc. which I suggested earlier is unnecessary. Sorry that this just came to me now.

justgo129 commented 9 years ago

I meant "predicate." Apologies for any confusion.

justgo129 commented 9 years ago

Let's go with something along the lines of a property not named "cito:cites" for literals. I wouldn't call it a gcis:recommendedCitation but maybe something like gcis:sourceAttribution.

justgo129 commented 9 years ago

Seeing no objection, please feel free to make the aforementioned changes, @zednis @xgmachina.

justgo129 commented 9 years ago

Would dcterms:source (http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#terms-source) work for case 2? @zednis if that won't work, please create a new predicate for the "Group 2" instances mentioned above. We'll then close #81.

zednis commented 9 years ago

@justgo129 dcterms:source is intended to be used with non-literal values.

http://wiki.dublincore.org/index.php/User_Guide/Publishing_Metadata#dcterms:source

It also is intended to represent a derivation relationship between two resources.

Definition: A related resource from which the described resource is derived.
Comment: The described resource may be derived from the related resource in whole or in part. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.

I do not think it makes sense to use dcterms:source for case 2, i.e. for when we want to use a literal value to provide a description to the user on how we would like the current resource to be cited.

I will create a pull request adding the gcis:suggestedCitation literal property.
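As a rough sketch of how a figure's triples might look after the change, assuming the property keeps the name gcis:suggestedCitation and the annotation-property definition proposed earlier in this thread. The gcis: namespace IRI is inferred from the gcis.owl link above, the cited-article IRI is a hypothetical placeholder, and the definition that actually gets merged may differ.

```turtle
@prefix cito: <http://purl.org/spar/cito/> .
@prefix gcis: <http://data.globalchange.gov/gcis.owl#> .   # assumed namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Property as proposed earlier in the thread (case 2: literal citation text).
gcis:suggestedCitation
    a owl:AnnotationProperty ;
    rdfs:label "suggested citation" .

# The literal moves to the new predicate; cito:cites is kept only for
# IRI-valued statements (case 1).
<http://data.globalchange.gov/report/nca3/chapter/water-resources/figure/projected-changes-in-water-withdrawals>
    gcis:suggestedCitation "Brown et al. 2013" ;
    cito:cites <http://example.org/article/brown-2013> .   # hypothetical IRI
```

With this split, every remaining cito:cites statement has a resource as its object, which is what basic reasoning over an object property expects.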

justgo129 commented 9 years ago

Resolved #81 through merged #156.