chin-rcip / collections-model

Linked Open Data Development at the Canadian Heritage Information Network - Développement en données ouvertes et liées au Réseau canadien d'information sur le patrimoine
Creative Commons Zero v1.0 Universal

Adding some dating "logic" into the system to infer new data #9

Open chin-rcip opened 4 years ago

chin-rcip commented 4 years ago

Heather Dunn 16.08.2019 I was wondering how much “logic” can be built into the system to infer new data from existing data. For example, we know an object was created in 1912 and we know the manufacturer’s name, but we don’t know the dates of the manufacturer’s incorporation or dissolution. Can we infer that, since the object was created in 1912, the manufacturer must have been incorporated “before 1912” and dissolved “after 1912”? And what happens when the object was created “circa 1912”: does the inference about the manufacturer’s dates become too messy?

Stephen Hart 16.08.2019 On the logic of those dates, I’m now wondering whether we could add “occurred before” and “occurred during” properties between the Events (for instance, stating that the production event occurred after the birth/formation and before the death/dissolution of the actor). With those connections between the events, I think we could add some semantics without too much difficulty.

Philippe Michon 19.08.2019 I don’t think this will be managed at the ontology level; it will be more efficient to build some rules over the model. Something like: if you don’t find a birthdate, use the earliest life event with the “before” qualifier. I don’t think we need to automatically add a bunch of new statements for this purpose.
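A rule of the kind Philippe describes could be sketched roughly as follows. This is only an illustration, not CHIN's implementation: the `estimate_birth` function, the `(qualifier, date)` return shape, and the `"exact"` / `"before"` labels are all hypothetical.

```python
from datetime import date
from typing import Iterable, Optional, Tuple


def estimate_birth(
    birth: Optional[date], life_events: Iterable[date]
) -> Optional[Tuple[str, date]]:
    """Return (qualifier, date) for an actor's birth, or None if nothing is known."""
    if birth is not None:
        # A recorded birthdate always wins over any inference.
        return ("exact", birth)
    events = sorted(life_events)
    if not events:
        return None
    # No birthdate recorded: the actor was born before the earliest known life event.
    return ("before", events[0])
```

Because the result is computed on demand (for search or display), no new statements need to be written back into the data.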

Heather Dunn 29.08.2019 What you say about managing the date logic outside of the model (e.g. as part of the search or display) makes sense, especially since we will likely want to tinker with the logic over time, e.g. to adjust how many years +/- we mean by “circa” in different eras.
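An era-dependent “circa” could be kept in a small lookup table outside the model, so it can be adjusted without touching the data. The sketch below assumes year-level precision; the table name, the function names, and above all the tolerance values are invented placeholders, not agreed figures.

```python
from datetime import date
from typing import Tuple

# Hypothetical tolerances: (first year of the era, +/- years meant by "circa").
# Ordered from the most recent era to the oldest.
CIRCA_TOLERANCE_BY_ERA = [(1900, 2), (1700, 10), (1000, 25)]


def circa_tolerance(year: int) -> int:
    """Look up the +/- tolerance for a year; older years reuse the oldest era's value."""
    for era_start, tolerance in CIRCA_TOLERANCE_BY_ERA:
        if year >= era_start:
            return tolerance
    return CIRCA_TOLERANCE_BY_ERA[-1][1]


def circa_bounds(year: int) -> Tuple[date, date]:
    """Return (earliest, latest) outer bounds for "circa <year>".

    Note: datetime.date only supports years 1 to 9999.
    """
    tolerance = circa_tolerance(year)
    return date(year - tolerance, 1, 1), date(year + tolerance, 12, 31)
```

With these placeholder values, `circa_bounds(1912)` gives 1910-01-01 to 1914-12-31, i.e. a window of two years on either side.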

Stephen Hart 1 day ago To reply to Philippe and Heather, I think adding some semantics between the events within the data model won't complicate it too much, and would help if we want to create timeline visualizations, even if we don't have dates recorded.

KarineLeonardBrouillet commented 4 years ago

This is somewhat unrelated to what Heather initially mentioned, but I am wondering how such inferences could be useful when it comes to licenses and copyrights, and whether they should be handled in the model or outside of it. For example, what if our agreement with an institution stipulates that certain data (let us say an address, just for this example) can only be published 10 years after the creator's death? I am sorry if this does not make sense; I am still trying to figure out the contours of what this would involve for us. In this case, the death date would be published well before the address, and the status of the address would later have to be switched from non-publishable to publishable. I think we intended to rely on the museum to update its data in this situation, but there might be recurrent patterns that we could consider inferring directly.
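If such a rule were handled outside the model, it might look something like the sketch below. Everything here is an assumption made for illustration: the function names, the reading of "10 years after" as "on or after the tenth anniversary of the death", and the choice to keep data withheld when no death date is recorded.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a non-leap target year: fall forward to 1 March.
        return date(d.year + years, 3, 1)


def is_publishable(death: Optional[date], embargo_years: int, today: date) -> bool:
    """True once the embargo period after the creator's death has elapsed."""
    if death is None:
        # No recorded death date: assume the embargo still applies.
        return False
    return today >= add_years(death, embargo_years)
```

Evaluating the status at query time means nothing has to be "switched" in the data when the embargo expires; only the death date needs to be recorded.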

KarineLeonardBrouillet commented 4 years ago

Also, if we use such rules to manage the inferences, we should consider how we would inform the data provider about the inferred data that has been generated.

VladimirAlexiev commented 4 years ago
illip commented 4 years ago

Thanks @VladimirAlexiev!

  • CRM has a full complement of Allen temporal logic primitives (before, after, during, starts, finishes...)

CHIN is aware of those properties. However, we are not sure where we should use them. I don't think CIDOC CRM has explicit logic between the birth and the death events for instance. Would it be a good idea to implement this kind of logic in our TM?

I would think that most museums do not record the time span of an event by association with another one. Might need to investigate a little more. So if I'm right, we should use those properties only between already-defined entities of our TM.

  • CRM's P82a,b and P81a,b provide the fields to record the results of such computations (started no earlier than, no later than, etc)

We are aware of those properties too; in fact, we are using P82a,b systematically in our model. For the moment, we prefer P82 to P81, since most of our data follow the outer-bounds pattern.

  • I would also record the provenance of calculation, eg in E13 Attribute Assignment

I'm not sure I understand what you mean by "calculation". Is it related to the rights statement validity issue?

  • CRM doesn't have defined classes for Life and Period of Operation. A class for Pursuit (floruit) has been added

Yes, this is something we are investigating, especially in issue #11.

  • Karine, what you said about licenses makes a lot of sense, there was a whole Europeana related project to compute Out of Copyright status based on dates and different country laws

@stephenhart8 Should we start a new issue to keep track of this question?

VladimirAlexiev commented 4 years ago
### Manufacturer
<manufacturer> a crm:E40_Legal_Body;
  crm:P11i_participated_in <existence>;
  crm:P95i_was_formed_by <formation>;
  crm:P99i_was_dissolved_by <dissolution>. # only if we know it doesn't exist anymore
<existence> a frbroo:F51_Pursuit;
  crm:P4_has_time-span <existence/date>.
<formation> a crm:E66_Formation;
  crm:P4_has_time-span <formation/date>.
<dissolution> a crm:E68_Dissolution; 
  crm:P4_has_time-span <dissolution/date>.
# No info about the timespans yet
<existence/date> a crm:E52_Time-Span.
<formation/date> a crm:E52_Time-Span.
<dissolution/date> a crm:E52_Time-Span.

### Artwork
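<artwork>      a crm:E22_Man-Made_Object;
  crm:P108i_was_produced_by <production>. # link added: the original statement ended in ";" with no object link
<production>   a crm:E12_Production;
  crm:P4_has_time-span <production/date>; crm:P14_carried_out_by <manufacturer>.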
<production/date> a crm:E52_Time-Span;
  rdfs:label "circa 1912"; # assuming +-2 years tolerance
  crm:P82a_begin_of_the_begin "1910-01-01"^^xsd:date;
  crm:P82b_end_of_the_end "1914-12-31"^^xsd:date.

### Allen logic
<formation> crm:P116_starts <existence>.
<dissolution> crm:P115_finishes <existence>.
<production> crm:P117_occurs_during <existence>.

### THEREFORE
<existence/date>
  crm:P81a_end_of_the_begin "1914-12-31"^^xsd:date; # began no later than the latest possible production date
  crm:P81b_begin_of_the_end "1910-01-01"^^xsd:date. # ended no earlier than the earliest possible production date
<formation/date>
  crm:P82b_end_of_the_end "1914-12-31"^^xsd:date.   # formed no later than the latest possible production date
<dissolution/date>
  crm:P82a_begin_of_the_begin "1910-01-01"^^xsd:date. # dissolved no earlier than earliest production

### Provenance

<existence/start/provenance> a crm:E13_Attribute_Assignment;
  crm:P140_assigned_attribute_to <existence/date>;
  crmx:property   crm:P81a_end_of_the_begin; # EXTENSION to specify which prop was assigned
  crm:P2_has_type crm:P81a_end_of_the_begin; # OR use P2 to specify which prop
  crm:P141_assigned "1914-12-31"^^xsd:date;
  crm:P17_was_motivated_by <production/date>.

<existence/finish/provenance> a crm:E13_Attribute_Assignment;
  crm:P140_assigned_attribute_to <existence/date>;
  crmx:property   crm:P81b_begin_of_the_end; # EXTENSION to specify which prop was assigned
  crm:P2_has_type crm:P81b_begin_of_the_end; # OR use P2 to specify which prop
  crm:P141_assigned "1910-01-01"^^xsd:date;
  crm:P17_was_motivated_by <production/date>.
VladimirAlexiev commented 4 years ago

@illip @stephenhart8 above I've given an example of "date reasoning" based on Heather's example. It includes Allen relations, reasoning from <production> to <existence> (which is not quite intuitive for P82/P81), and recording provenance with E13. Cheers!
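For readers who prefer to see the reasoning as a rule rather than as triples, the step from `<production>` to `<existence>`, `<formation>` and `<dissolution>` can be sketched as below. This simply mirrors the THEREFORE section of the example; the function name and the nested-dictionary output are hypothetical, and only the CRM property names come from the example itself.

```python
from datetime import date
from typing import Dict


def infer_actor_bounds(
    prod_begin_of_the_begin: date, prod_end_of_the_end: date
) -> Dict[str, Dict[str, date]]:
    """Derive bounds on an actor's time-spans from a production's P82a/P82b bounds.

    Assumes the production occurs during the actor's existence, which is
    started by the formation and finished by the dissolution.
    """
    if prod_begin_of_the_begin > prod_end_of_the_end:
        raise ValueError("production bounds are inverted")
    return {
        "existence": {
            # Existence began no later than the latest possible production date...
            "P81a_end_of_the_begin": prod_end_of_the_end,
            # ...and ended no earlier than the earliest possible production date.
            "P81b_begin_of_the_end": prod_begin_of_the_begin,
        },
        "formation": {"P82b_end_of_the_end": prod_end_of_the_end},
        "dissolution": {"P82a_begin_of_the_begin": prod_begin_of_the_begin},
    }
```

Called with the "circa 1912" bounds (1910-01-01 and 1914-12-31), it yields the same four values as the example.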

illip commented 3 years ago

Dear @VladimirAlexiev,

Sorry for responding a year after your post; this issue slipped under the radar. I reread your entire proposal and I believe that this is exactly the type of process that will have to be implemented, both to increase the possible inferences and to provide an additional mechanism for detecting errors.
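As a rough idea of what error detection could mean here, the same temporal constraints can be turned around to flag records whose dates contradict each other. The sketch below is an assumption about how such a check might look, not something specified in this thread; the function name and the message texts are hypothetical.

```python
from datetime import date
from typing import List, Optional


def find_date_conflicts(
    prod_earliest: date,
    prod_latest: date,
    formation: Optional[date] = None,
    dissolution: Optional[date] = None,
) -> List[str]:
    """List contradictions between a production window and an actor's recorded dates."""
    problems = []
    if formation is not None and prod_latest < formation:
        # The whole production window lies before the actor was formed.
        problems.append("production window ends before the recorded formation")
    if dissolution is not None and prod_earliest > dissolution:
        # The whole production window lies after the actor was dissolved.
        problems.append("production window starts after the recorded dissolution")
    if formation is not None and dissolution is not None and dissolution < formation:
        problems.append("recorded dissolution precedes recorded formation")
    return problems
```

An empty list means no contradiction was found, which is not the same as the dates being confirmed correct.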

Since then, a property has been added to identify the targeted property of an E13_Attribute_Assignment, which is good news: P177_assigned_property_of_type

On our side, we must:

  1. Develop the Objects facet in order to have a general portrait of the temporal inferences to be defined.
  2. Better understand how to automatically generate these triples in 3M.

A huge thank you for your contribution and sorry for the delay once again.