Closed rtroncy closed 3 years ago
We should better represent when the production time is uncertain. A typical word used in the record is circa.
Regex used for uncertain dates:
To those cases, we may want to add "or later", "or earlier", "before" and "after".
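As a rough illustration of how these qualifiers could be detected alongside circa, here is a minimal Python sketch. The patterns and the `detect_qualifier` helper are assumptions made for this example; they are not the regex actually used by the converter, which is not shown in this thread.

```python
import re

# Hypothetical qualifier patterns (illustrative only, not the project's regex).
QUALIFIERS = [
    ("or_later", re.compile(r"\bor later\b", re.IGNORECASE)),
    ("or_earlier", re.compile(r"\bor earlier\b", re.IGNORECASE)),
    ("before", re.compile(r"\bbefore\b", re.IGNORECASE)),
    ("after", re.compile(r"\bafter\b", re.IGNORECASE)),
    # "circa", "ca", "ca." or "c." followed by a space, a digit, or end of text
    ("circa", re.compile(r"\b(?:circa|ca\.?|c\.)(?=\s|\d|$)", re.IGNORECASE)),
]


def detect_qualifier(text):
    """Return (qualifier, cleaned_text); qualifier is None for plain dates."""
    for name, pattern in QUALIFIERS:
        if pattern.search(text):
            # Drop the qualifier and normalise the remaining whitespace.
            cleaned = " ".join(pattern.sub("", text).split())
            return name, cleaned
    return None, text.strip()


print(detect_qualifier("circa 1850"))
print(detect_qualifier("1850 or later"))
```

The cleaned text can then go through the usual date parsing, while the qualifier is kept separately for whatever annotation we end up choosing.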
Using CIDOC-CRM, we can use the property ecrm:P79_beginning_is_qualified_by attached to the time span, as in DOREMUS.
A big difference between DOREMUS and SILKNOW is the sharing of timespans among different objects. If the uncertainty is attached to a shared timespan (e.g. with ecrm:P79_beginning_is_qualified_by), we cannot then distinguish which production it relates to.
Possible solutions:
1. Stop sharing uncertain timespans and annotate each one with ecrm:P79_beginning_is_qualified_by (improperly), P2 has type, or any other annotating property.
2. Use dedicated beginning/end properties such as hasPossibleBeginning, hutime:hasReliableBeginning, etc. See also this paper. This solution also implies 1.
3. Keep P4 has time-span for certain dates and link uncertain dates through other properties. Unfortunately, I am not able to find appropriate properties. We could extend CIDOC with has uncertain time-span and (if needed) before time-span, after time-span, in or after time-span, in or before time-span (all sub-properties of P4).
Timespans to be deleted:
New cases to be parsed (among all unparsed ts):
"or" usually combines consecutive ts. Use the parts as start/end (e.g. 1920s_or_1930s).
Other:
How to deal with productions with different dates? E.g. "Designed 1786; Woven 1787–91"
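For the "or" case listed above, using the parts as start/end could look like the following Python sketch. The `part_range` and `or_range` helpers are hypothetical and only cover decade labels (such as 1920s) and plain four-digit years; any other part is rejected rather than guessed.

```python
import re

DECADE = re.compile(r"^(\d{3})0s$")  # e.g. "1920s"
YEAR = re.compile(r"^(\d{4})$")      # e.g. "1786"


def part_range(part):
    """Return (start_year, end_year) for a single part of the label."""
    m = DECADE.match(part)
    if m:
        start = int(m.group(1)) * 10
        return start, start + 9
    m = YEAR.match(part)
    if m:
        year = int(m.group(1))
        return year, year
    raise ValueError(f"unsupported timespan part: {part!r}")


def or_range(label):
    """Use the first part as start and the last part as end of an 'A_or_B' label."""
    ranges = [part_range(p) for p in label.split("_or_")]
    return ranges[0][0], ranges[-1][1]


print(or_range("1920s_or_1930s"))
```

This assumes the parts appear in chronological order, which matches the observation that "or" usually joins consecutive timespans.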
Thanks a lot @pasqLisena for this comprehensive investigation, as well as the sketches of possible solutions. I also did more investigation. A relevant pointer, first, is https://www.loc.gov/standards/datetime/ (the so-called new EDTF format).
I searched a lot in the LinkedArts community, since they inherit from CIDOC-CRM practices. The LinkedArts suggested model for timespans is described at https://linked.art/model/base/#time-span-details. Read also:
At the moment, I'm hesitating between:
P82a_begin_of_the_begin, P82b_end_of_the_end, P81_ongoing_throughout, P82_at_some_time_within, etc.
Thoughts?
My thoughts:
P82a_begin_of_the_begin, P82b_end_of_the_end, etc. will make us crazy when querying. I would keep the current start-end structure.
OK, I also agree to keep Solution 1 and to create "more" URIs identifying timespans, and basically a new URI each time we encounter a fuzzy timespan. I also agree to keep using the simpler start-end time of the timespan and NOT to use the newest P82a and P82b properties. I would use the EDTF notation in the label of the timespan.
> basically a new URI each time we encounter a fuzzy timespan
How do we create this?
I think that we can use EDTF, or better a transposition of it, in order to avoid special URI symbols (like '?' or '/'). Examples
Otherwise, we can return to UUIDs, using the EDTF string as seed (but this would again invalidate the ts URIs).
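To make the two options concrete, here is a small Python sketch. The symbol-to-token mapping is purely hypothetical (it is not the transposition agreed in this thread), and the UUID variant simply assumes a name-based UUID (version 5) seeded with the EDTF string.

```python
import uuid

BASE = "https://data.silknow.org/timespan/"

# Hypothetical transposition of EDTF symbols into URI-safe tokens.
# ".." is replaced first so that it is handled as one symbol.
REPLACEMENTS = [
    ("..", "open"),
    ("/", "_"),
    ("?", "-uncertain"),
    ("~", "-approx"),
]


def edtf_to_slug(edtf):
    """Option 1: readable slug derived from the EDTF string."""
    slug = edtf
    for symbol, token in REPLACEMENTS:
        slug = slug.replace(symbol, token)
    return slug


def edtf_to_uuid_uri(edtf):
    """Option 2: deterministic UUID (version 5) seeded with the EDTF string."""
    return BASE + str(uuid.uuid5(uuid.NAMESPACE_URL, BASE + edtf))


print(BASE + edtf_to_slug("1970/1979"))
print(BASE + edtf_to_slug("1850~"))
print(edtf_to_uuid_uri("1850~"))
```

With this mapping a certain interval such as 1970/1979 keeps its current 1970_1979 form, so only fuzzy timespans would receive new URIs; the UUID option is stable too, but its URIs are opaque to a human reader.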
> I would use the EDTF notation in the label of the timespan.
At the moment, we use skos:prefLabel ("clean" unique label) and ecrm:P78_is_identified_by (all encountered labels).
https://data.silknow.org/timespan/1970_1979
I propose to keep these two and add an rdfs:label with EDTF.
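A minimal Python sketch of how such an EDTF label might be derived from the existing start/end years plus a qualifier follows. The `edtf_label` helper and its qualifier names are assumptions for illustration; it only uses the EDTF interval separator `/`, the approximate marker `~`, and the open-ended marker `..`, and it assumes years between 0 and 9999.

```python
def edtf_label(start, end=None, qualifier=None):
    """Build an EDTF string from integer years and an optional qualifier."""
    first = f"{start:04d}"  # EDTF years are written with four digits
    if qualifier == "before":
        # Open start: some time up to the given year (an approximation of "before").
        return f"../{first}"
    if qualifier == "after":
        # Open end: some time from the given year onwards.
        return f"{first}/.."
    mark = "~" if qualifier == "circa" else ""
    if end is None or end == start:
        return f"{first}{mark}"
    return f"{first}{mark}/{end:04d}{mark}"


print(edtf_label(1970, 1979))
print(edtf_label(1850, qualifier="circa"))
print(edtf_label(1700, qualifier="before"))
```

The resulting string would go in rdfs:label, next to the existing skos:prefLabel and ecrm:P78_is_identified_by values.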
I'm wondering if we could not have a mixed approach for generating URIs for timespans, namely:
The advantage of this solution is that a human would also immediately see whether it is certain or not by looking at the URI, but maybe this is over complicated to implement?
+1 for your labeling proposal for the timespan.
> The advantage of this solution is that a human would also immediately see whether it is certain or not by looking at the URI, but maybe this is over complicated to implement?
It's not complicated. I'll proceed to implement this.
We now systematically create time spans for productions, and we try to interpret and define those time spans in terms of beginning and end, attaching them to centuries when possible. This slide gives a good account of the situation per museum.
We should better represent when the production time is uncertain. A typical word used in the record is circa. Using CIDOC-CRM, we can use the property ecrm:P79_beginning_is_qualified_by attached to the time span, as in DOREMUS.
As a quality check, we should count the number of objects that do NOT have a timespan attached to the production, and the number of objects that have a timespan but do not have a time:hasBeginning, which means we do not know yet how to interpret those time spans.
We should get rid of the timespan https://data.silknow.org/timespan/null_null (present, e.g., in http://data.silknow.org/graph/paris-musees).
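The two counts of the quality check described above can be sketched in plain Python over a list of (subject, predicate, object) triples. The predicate names linking object, production and timespan are assumptions for this example, and the sample data is invented; in practice the same logic would more likely be a SPARQL query against the endpoint.

```python
# Assumed predicate names; the converter's actual links may differ.
PRODUCED_BY = "ecrm:P108i_was_produced_by"
HAS_TIMESPAN = "ecrm:P4_has_time-span"
HAS_BEGINNING = "time:hasBeginning"


def quality_counts(triples, objects):
    """Return (objects without timespan, objects with only uninterpreted timespans)."""
    by_sp = {}
    for s, p, o in triples:
        by_sp.setdefault((s, p), []).append(o)

    no_timespan = 0
    uninterpreted = 0
    for obj in objects:
        timespans = [
            ts
            for prod in by_sp.get((obj, PRODUCED_BY), [])
            for ts in by_sp.get((prod, HAS_TIMESPAN), [])
        ]
        if not timespans:
            no_timespan += 1
        elif not any(by_sp.get((ts, HAS_BEGINNING)) for ts in timespans):
            # A timespan exists, but none of them has a beginning.
            uninterpreted += 1
    return no_timespan, uninterpreted


# Invented sample data for illustration.
triples = [
    ("obj1", PRODUCED_BY, "prod1"),
    ("prod1", HAS_TIMESPAN, "ts/1970_1979"),
    ("ts/1970_1979", HAS_BEGINNING, "1970"),
    ("obj2", PRODUCED_BY, "prod2"),
    ("prod2", HAS_TIMESPAN, "ts/null_null"),
    ("obj3", PRODUCED_BY, "prod3"),
]
print(quality_counts(triples, ["obj1", "obj2", "obj3", "obj4"]))
```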
The following query is useful because it lists all the timespans we have created for each museum (results):
Sub tasks:
null-ish timespans
(the last point does not include MAD, which is to be handled in #5)