Request for new TERN attribute - inferred date

miekeGR commented 2 years ago

This issue is following up from https://github.com/ternaustralia/ontology_tern/discussions/161.

Observations need a tern:resultDateTime; however, the data we have doesn't always say when an observation was made. For example, dateIdentified may be missing, and the identification could have been made on the day the organism was observed in the field, or later using special techniques or resources. To include the identification observation, we may be forced to infer the observation date from the supplied eventDate, but we need to make it obvious that we have made this inference. Do you think this could be done with a new attribute applied to the identification observation? E.g. something like this:

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://example.org/inferred-date> a skos:Concept ;
    skos:definition "An inferred date of observation when the source data has not explicitly suggested a different date for the activity compared to the main event."@en ;
    skos:inScheme <http://linked.data.gov.au/def/tern-cv/dd085299-ae86-4371-ae15-61dfa432f924> ;
    skos:prefLabel "inferred observation date based on eventDate"@en ;
    skos:topConceptOf <http://linked.data.gov.au/def/tern-cv/dd085299-ae86-4371-ae15-61dfa432f924> .

edmondchuc commented 2 years ago

Hi @miekeGR, are you able to provide an example of how to use this as an attribute? For example, is the value of this attribute a boolean?

miekeGR commented 2 years ago

Hi @edmondchuc , I was leaning towards the Boolean option, but only adding the attribute if the Value was TRUE. How do you do it with AusPlots Rangelands and CORVEG? Do you have a best practice?

edmondchuc commented 2 years ago

We don't provide any information on when we infer the time on certain observations and samplings. Probably the best thing to do is to put some information in the dataset metadata.

See @habacucfm's response in https://github.com/ternaustralia/ontology_tern/discussions/161#discussioncomment-2460168.

I guess if time really did matter at this level of granularity, then the source data would provide it. But we need to keep in mind that we infer time in AusPlots Rangelands and CORVEG only because the data we collect is to understand how the environments in Australia change over a long period of time. I wouldn't recommend inferring any time information if it mattered down to the hour, minute or second.

miekeGR commented 2 years ago

That's the thing, you are right, if the difference in time (date) meant something, the data source would provide the distinction. However, the sticky situation is, observations require a date and if we do not have one directly supplied (i.e. the supplier did not record a distinction between the events) and we infer the date because of how the ontology works vs flat data, then we may be accused of changing data by populating a date field that does not appear in the source.

I do like your suggestion, though, of adding the inference at the dataset level, since if the required date is missing, it is likely to be missing in all cases within a dataset rather than just some. Even better would be a statement along the lines of "Where observation dates are not explicitly distinguished from the eventDate in the source data, dates are inferred as being concurrent".

I'm now searching for a good way to communicate this statement at the RDFDataset level that already exists in the TERN ontology.

edmondchuc commented 2 years ago

> I'm now searching for a good way to communicate this statement at the RDFDataset level that already exists in the TERN ontology.

I think the most suitable property available in VoID to communicate this is dcterms:description. VoID's other properties are not really suitable [void].
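
A minimal sketch of what such a dataset-level note could look like. The dataset IRI is hypothetical, the TERN namespace IRI is an assumption, and the description text simply reuses the wording proposed earlier in this thread:

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
# Assumed namespace for the TERN Ontology.
@prefix tern: <https://w3id.org/tern/ontologies/tern/> .

# Hypothetical dataset IRI, for illustration only.
<https://example.org/dataset/example-dataset> a tern:RDFDataset ;
    dcterms:description "Where observation dates are not explicitly distinguished from the eventDate in the source data, dates are inferred as being concurrent."@en ;
.
```

Because this is free text, consumers would need to read the description rather than query for a dedicated flag.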

The DQV vocabulary looks interesting too [VOCAB-DQV].

Another thought I had was to have an attribute on the observation describing where the result date time came from in the source data.

miekeGR commented 2 years ago

Thanks @edmondchuc. I did consider dcterms:description, but thought that it might get lost within free text from data submitters. The attribute on the observation comes back to my original idea, but it sounds like you are thinking more of a text value rather than a boolean? The DQV vocab does look interesting too, for sure.

edmondchuc commented 2 years ago

> The attribute on the observation comes back to my original idea, but sounds like you are thinking more of a text Value rather than a boolean?

Yeah, I think if you are willing to annotate the observations with details concerned with the quality of the result date time, then it may be useful to directly express where the value came from in the source data.

It also functions the same as your boolean attribute, since you can query the data and see whether this specific attribute exists on the observation or not. You do have to be consistent, though, and only add this attribute when the result date time was inferred.
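
A rough sketch of that convention, not a confirmed TERN pattern. It assumes the TERN namespace IRI below and that attributes are attached with tern:hasAttribute, tern:Attribute, tern:attribute and tern:hasSimpleValue; the observation IRIs, the second date and the text value "eventDate" are hypothetical:

```turtle
@base <https://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# Assumed namespace and attribute properties for the TERN Ontology.
@prefix tern: <https://w3id.org/tern/ontologies/tern/> .

# Date was inferred: the attribute is present and names the source field.
<example-observation-1> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
    tern:hasAttribute [
        a tern:Attribute ;
        tern:attribute <http://example.org/inferred-date> ;
        tern:hasSimpleValue "eventDate" ;
    ] ;
.

# Date was supplied directly by the source: the attribute is omitted entirely.
<example-observation-2> a tern:Observation ;
    tern:resultDateTime "2022-04-03"^^xsd:date ;
.
```

Under this convention, the presence of the attribute plays the role of the boolean TRUE, and its value carries the extra detail about where the date came from.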

miekeGR commented 2 years ago

Hi @edmondchuc , I hope you had a lovely weekend.

I would like to update my suggestion for a new inferred date attribute. What do you think of:

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .

<http://example.org/inferred-date> a skos:Concept ;
    skos:definition "An inferred date for an observation when the source data has not explicitly suggested a different date for the activity compared to the main event. The value should describe the source of the inferred date"@en ;
    skos:inScheme <http://linked.data.gov.au/def/tern-cv/dd085299-ae86-4371-ae15-61dfa432f924> ;
    skos:prefLabel "inferred observation date"@en ;
    skos:topConceptOf <http://linked.data.gov.au/def/tern-cv/dd085299-ae86-4371-ae15-61dfa432f924> .

edmondchuc commented 2 years ago

Hi @miekeGR, yes I enjoyed my long weekend and visited some family :) Hope you had a relaxing weekend too.

I've created inferred observation date as per your request. Please see http://linked.data.gov.au/def/tern-cv/13677931-2e3f-4c34-9fc7-2a1b4dd8d325. Cheers.

miekeGR commented 2 years ago

Wonderful. Thanks for that @edmondchuc I will close this issue now. Cheers.

edmondchuc commented 2 years ago

Hi @miekeGR, the current label "inferred observation date" may be misleading, as it sounds like the value is a date rather than text describing where the date value came from.

Do you think we can change the label to something else that better represents the attribute?

Further to this, I would like your thoughts @miekeGR @dr-shorthair @nicholascar on annotating some property value with additional information. Is this useful?

<example-observation> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
    tern:qualifiedValue [
        rdf:value tern:resultDateTime ;
        rdfs:comment "This was inferred from the site visit date" ;
    ] ;
.

miekeGR commented 2 years ago

Ah yes, very good point. I like the alternative you have suggested @edmondchuc

edmondchuc commented 2 years ago

I'd like to get feedback on whether tern:qualifiedValue is a good name for the property and whether we are happy with using rdf:value and rdfs:comment in the example.

miekeGR commented 2 years ago

Hi @edmondchuc, I have been playing with this a little and discovered that when using tern:qualifiedValue in the scenario I have mentioned for Observations, it is also necessary to qualify the sosa:phenomenonTime. Does this look right?

<example-observation> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
    sosa:phenomenonTime [ a time:Instant ;
            time:inXSDDate "2022-04-01"^^xsd:date ] ;
    tern:qualifiedValue [
        rdf:value tern:resultDateTime,
        sosa:phenomenonTime ;
        rdfs:comment "Date inferred from the eventDate" ] ;
.

edmondchuc commented 2 years ago

@miekeGR yes, that's what I would do based on the previous example.

I'm wondering if there are better terms than tern:qualifiedValue and rdf:value.

And also noting here some other modelling patterns such as reification, e.g.

<example-observation> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
.

[] a rdf:Statement ;
    rdf:subject <example-observation> ;
    rdf:predicate tern:resultDateTime ;
    rdf:object "2022-04-01"^^xsd:date ;
    rdfs:comment "This was inferred from the site visit date" ;
.

And another example using RDF* (rdf-star)

<example-observation> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
.

<< <example-observation> tern:resultDateTime "2022-04-01"^^xsd:date >> 
    rdfs:comment "This was inferred from the site visit date" ;
.

These two new examples really depend on the implementation details of the BDR system and if they support reification or rdf-star. Just adding these examples as notes here and not as suggestions. Not all systems and application libraries support rdf-star yet.

miekeGR commented 2 years ago

These are some interesting examples. We ran into a bit of a problem with the initial implementation of tern:qualifiedValue in that there is no rdf:type for the blank node so this does need a bit of tweaking.

The rdf-star example looks quite nice and concise. However, I tested your rdf-star example in the Surround TERN validator and it seemed to expect a '.' or '}' or ']' at the end of << <example-observation> tern:resultDateTime "2022-04-01"^^xsd:date >>. Perhaps that type of syntax is not recognised yet.

If we were to use the reification method, could it cover both the tern:resultDateTime and sosa:phenomenonTime in the one statement and from more than one observation or is this crazy talk?

Something like this:

<example-observation-1> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
    sosa:phenomenonTime [
            a tern:Instant ;
            time:inXSDDate "2022-04-01"^^xsd:date
        ] ;
.

<example-observation-2> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
    sosa:phenomenonTime [
            a tern:Instant ;
            time:inXSDDate "2022-04-01"^^xsd:date
        ] ;
.

[] a rdf:Statement ;
    rdf:subject <example-observation-1>,
        <example-observation-2> ;
    rdf:predicate tern:resultDateTime,
        sosa:phenomenonTime ;
    rdf:object "2022-04-01"^^xsd:date ;
    rdfs:comment "This was inferred from the site visit date" ;
.

edmondchuc commented 2 years ago

> If we were to use the reification method, could it cover both the tern:resultDateTime and sosa:phenomenonTime in the one statement and from more than one observation or is this crazy talk?

Hi @miekeGR, I'm not 100% sure. I've only ever seen reified statements where there's one value each for rdf:subject, rdf:predicate and rdf:object. There's nothing stopping you from creating reified statements with multiple rdf:predicate values, but I guess there needs to be consensus with the BDR system if you were to do it like that.
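
For comparison, a sketch of the conventional form with one rdf:Statement per annotated triple. The TERN namespace IRI, the base IRI and the IRI given to the time instant are assumptions made for illustration. The instant is named here so that the second statement can point at the actual object of the sosa:phenomenonTime triple, which is the instant node rather than the date literal:

```turtle
@base <https://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sosa: <http://www.w3.org/ns/sosa/> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# Assumed namespace for the TERN Ontology.
@prefix tern: <https://w3id.org/tern/ontologies/tern/> .

<example-observation-1> a tern:Observation ;
    tern:resultDateTime "2022-04-01"^^xsd:date ;
    sosa:phenomenonTime <example-observation-1/phenomenon-time> ;
.

# Hypothetical IRI for the instant, so it can be referenced below.
<example-observation-1/phenomenon-time> a time:Instant ;
    time:inXSDDate "2022-04-01"^^xsd:date ;
.

# Reified statement for the tern:resultDateTime triple.
[] a rdf:Statement ;
    rdf:subject <example-observation-1> ;
    rdf:predicate tern:resultDateTime ;
    rdf:object "2022-04-01"^^xsd:date ;
    rdfs:comment "This was inferred from the site visit date" ;
.

# Reified statement for the sosa:phenomenonTime triple.
[] a rdf:Statement ;
    rdf:subject <example-observation-1> ;
    rdf:predicate sosa:phenomenonTime ;
    rdf:object <example-observation-1/phenomenon-time> ;
    rdfs:comment "This was inferred from the site visit date" ;
.
```

This is more verbose than a single statement listing several subjects and predicates, but each rdf:Statement then maps to exactly one triple in the data, and the same pair of statements would be repeated for each further observation.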

miekeGR commented 2 years ago

That's good to know, thanks @edmondchuc . Do the reified statements have an rdf:type?

edmondchuc commented 2 years ago

> Do the reified statements have an rdf:type?

Yes they do. They have the type rdf:Statement.

dr-shorthair commented 2 years ago

rdf:Statement

miekeGR commented 2 years ago

Snap ;-) Thanks @edmondchuc and @dr-shorthair

miekeGR commented 2 years ago

@edmondchuc I think it's ok to close this issue now. How do you feel?

edmondchuc commented 2 years ago

Before we close, I'd like to update the attribute concept we created earlier.

Can we change the label from "inferred observation date" to "observation date comment"?

miekeGR commented 2 years ago

Actually, I would prefer it if we called it tern:qualifiedValue. I am also using it to give spatiality to samplings where we don't know where they occurred, e.g. sub-sampling to create a specimen may have happened in the field or it may have happened in a lab.

edmondchuc commented 2 years ago

I was actually referring to http://linked.data.gov.au/def/tern-cv/13677931-2e3f-4c34-9fc7-2a1b4dd8d325.

edmondchuc commented 2 years ago

Let's also keep this issue open until I've added tern:qualifiedValue into the TERN Ontology.

miekeGR commented 2 years ago

Oh, I totally forgot about the attribute. I thought we didn't need it anymore if we have the tern:qualifiedValue property to use directly.

edmondchuc commented 2 years ago

Yeah, we don't need it anymore if you're happy with using tern:qualifiedValue. Should I just delete it from the system then?

miekeGR commented 2 years ago

I don't think we need the attribute so I am happy for you to delete it.

ternaustralia / ontology_tern

Request for new TERN attribute - inferred date #176