schemaorg / suggestions-questions-brainstorming

Suggestions, questions, and brainstorming

Add mechanism for describing uncertain dates #219

Open danbri opened 9 years ago

danbri commented 9 years ago

Migrated in from https://www.w3.org/2011/webschema/track/issues/16

Currently only exact dates and date ranges are possible. There are situations where something vaguer is needed to match existing data and apps:

    Use cases.

    1. Cultural heritage information (Museums, Libraries, Archives)

    " The current processing rules in the specification do not handle many valid ISO8601 dates. As dates and ambiguity about dates is important for describing cultural heritage materials, hopefully the HTML5 processing rules can be adjusted to handle more valid ISO8601 dates. It seems as if the WHATWG has accepted a proposal to support year only dates, which is a start." [...]
    "Another kind of property a cultural heritage organization might like to add to a landmark or building like the Memorial Tower are the events related to the building. In this case the cornerstone was laid in 1922 and the tower dedicated on November 11, 1949. Other buildings could have events in their history like the dates they were designed or dates of renovations, derived from the drawings and project records. Museums may be interested in various events in the history of a painting including provenance and restorations. History museums and historical societies may also want to refer to various historical events that relate to their exhibits.", 
    Jason Ronallo in "HTML5 Microdata and Schema.org", http://journal.code4lib.org/articles/6400 (Code4Lib Journal, Issue 16, 2012-02-03, ISSN 1940-5758).

    2. Historical data (e.g. genealogy)
    "Schema.org, Microformats, and Microdata all rely on the ISO 8601 date format, which does not support approximate dates. Review the Library of Congress's Extended Date and Time Format, which extends ISO 8601 and allows uncertain and approximate dates." [...]
    http://historical-data.org/

    Design options

    The historical-data.org initiative (whose schema is a strong candidate for inclusion at schema.org once such integration issues are resolved, see also http://www.w3.org/wiki/WebSchemas/HistoricalDataSchema ) suggests use of the Library of Congress "Extended Date and Time Format".

    See http://www.loc.gov/standards/datetime/

    Review comments on the suitability of this approach for schema.org's handling of imprecise dates would be useful.
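
To make the EDTF option more concrete, here is a minimal Python sketch that separates a date string from a trailing qualifier. It assumes the qualifier characters used in later revisions of the Library of Congress EDTF specification (`?` for uncertain, `~` for approximate, `%` for both). The name `parse_qualified_date` is invented for this illustration, and the pattern covers only a very small subset of EDTF.

```python
import re
from typing import NamedTuple


class QualifiedDate(NamedTuple):
    value: str         # the plain ISO 8601 part, e.g. "1922" or "1949-11-11"
    uncertain: bool    # the source is unsure the date is correct
    approximate: bool  # the date is an estimate ("circa")


# Year, year-month, or year-month-day, optionally followed by one qualifier.
_PATTERN = re.compile(r"^(-?\d{4}(?:-\d{2}(?:-\d{2})?)?)([?~%]?)$")


def parse_qualified_date(text: str) -> QualifiedDate:
    """Split a date string into its ISO 8601 part and its qualifier flags."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a supported date string: {text!r}")
    value, flag = match.groups()
    return QualifiedDate(
        value=value,
        uncertain=flag in ("?", "%"),
        approximate=flag in ("~", "%"),
    )


circa_1655 = parse_qualified_date("1655~")
```

A consumer that only understands plain ISO 8601 could still use `circa_1655.value` and ignore the flags, which is one argument for a suffix-based notation.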

danbri commented 9 years ago

Early discussion - https://lists.w3.org/Archives/Public/public-vocabs/2012May/0069.html

Recently - https://lists.w3.org/Archives/Public/public-vocabs/2015Feb/0122.html

ppKrauss commented 9 years ago

Another use case and sources...

Perhaps the most common demand for this kind of date is the bibliographic citation. In this context, and particularly in science, the most widely used standard for scientific literature today (and a big "aggregator of standards") is JATS.

The JATS recommendations give a good definition of year and its "bibliographic use" (which can also be reused for other metadata),

and there is also a related essay:

In some books or older manuscripts, the lower case “c” could also stand for “circa”, meaning approximate. Similar information might be indicated by the prefix or suffix “approx.” or the prefix “between”. Such terms should be preserved similarly; they should be left in the text for mixed citations and placed in comments for element citations

(see JATS mixed citations)


For more specialized (and rarer) uses, as in archaeological works and museums, a different approach is perhaps the best practice. Here is an example of an artefact record from a museum database:

https://finds.org.uk/database/artefacts/record/id/586009

the source of date ranges (template),

Date from: Circa AD 75; Date from certainty: Certain; Date to: Circa AD 125

is

  <!-- see https://finds.org.uk/database/artefacts/record/id/586009/format/xml
  -->
   <broadperiod>ROMAN</broadperiod>
   <numdate1>75</numdate1>
   <numdate2>125</numdate2>
   <dateFromCertainty>1</dateFromCertainty>
    <dateToCertainty>1</dateToCertainty>

so, the database schema uses a date range (not a single year), "broadperiod" and "dateCertainty"... The "archaeological/geological specialized approach" is thus to add more properties.
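
As an illustration of how a consumer might read such a record, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The `<record>` wrapper element and the function name `read_date_range` are assumptions made for this example; the real export may be structured differently, and the certainty codes are passed through without interpretation.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the record above, wrapped in a hypothetical <record>
# root element so that it parses as a standalone document.
SAMPLE = """
<record>
  <broadperiod>ROMAN</broadperiod>
  <numdate1>75</numdate1>
  <numdate2>125</numdate2>
  <dateFromCertainty>1</dateFromCertainty>
  <dateToCertainty>1</dateToCertainty>
</record>
"""


def read_date_range(xml_text: str) -> dict:
    """Extract the period, the year range and the raw certainty codes."""
    root = ET.fromstring(xml_text.strip())
    return {
        "period": root.findtext("broadperiod"),
        "from_year": int(root.findtext("numdate1")),
        "to_year": int(root.findtext("numdate2")),
        # Assumption: these are opaque codes; they are not decoded here.
        "from_certainty": root.findtext("dateFromCertainty"),
        "to_certainty": root.findtext("dateToCertainty"),
    }


record = read_date_range(SAMPLE)
```

The point of the sketch is that the range and its certainty live in separate fields, so a simple consumer can use the years alone and ignore the rest.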

Dataliberate commented 9 years ago

I suspected that raising this again would open up the can of worms that is dates!

I hoped that my lightweight suggestion may introduce a pragmatic possibility for, as Wes put it, fuzzyDates whilst the community continued to admire the much larger problem of dates for all periods and specialities.

From my point of view if we try to go for a solution that will solve both the basic dates (birthdate of your favourite TV star) through to geological time periods, taking in bibliographic/museum uses and no doubt Klingon Stardates on the way, we will end up with something so complicated that 'normal' folks will never get their heads around it. That is if we ever got agreement.

My suggestion was for a simple subtype of Date to get the 'Circa' qualities I was looking for. Maybe that could be an approach we could build upon to segment, and therefore simplify, the large number of concerns addressed in this thread.

So, in addition to my suggestion, we could create other Date subtypes to handle such specialisations, e.g. geologicalDate, julianDate, chineseDate, astronomicalDate, etc. I see a few benefits from such an approach: 1. It would work wherever Date is found now. 2. It would allow the delegation of specialised date description to groups that understand it and are passionate about it. 3. It would enable a flexible approach to implementation, removing the need for everyone to agree on everything before moving forward in this area.

thadguidry commented 9 years ago

+1 for Richard's suggestion of a simple subtype of Date for now to get the 'Circa' qualities on uncertain and approximate dates.

ppKrauss commented 9 years ago

Yes, reuse and simplicity also have my vote (!).

I have just sent to the list (a minute ago!) the new topic "suggestion of something like geoDate and geoDuration", to separate "general use circa-dates" (do they have a name? fuzzyDate or circaDate?) from specialized ones. The geological specialized ones also need a geologicalDuration.

I don't understand how Schema.org is organized (it looks a little messy to me), but I think it is very important to keep the use cases in mind. With use cases we can discuss while "looking for consensus", and the use cases are starting points for semantic formalization...

ppKrauss commented 9 years ago

Trying to organize: Jeff's examples, copied, pasted and formatted here.

Use cases for the proposed circa property

schema:circa
    a rdf:Property;
    rdfs:comment "Approximate date";
    schema:domainIncludes schema:Thing;
    schema:rangeIncludes schema:Date, schema:Duration, schema:Event.

Here are some specific examples of how schema:circa and schema:Role could be combined to indicate approximations of other date properties in schema:

# A0 (museum use case)
:A0
                a schema:Painting;
                schema:name "Saint Praxedis";
                schema:circa [
                                a schema:Role;
                                schema:roleName schema:dateCreated;
                                schema:circa "1655";
                ];
                schema:sameAs entity:Q94006;
                .

# A1 (common and History use cases)
:A1
                a schema:Person;
                schema:name "Socrates";
                schema:circa [
                                a schema:Role;
                                schema:roleName schema:birthDate;
                                schema:circa "-469/-468";
                ];
                schema:deathDate "-399";
                schema:sameAs entity:Q913;
                .

# A2 (common and History use cases)
:A2
                a schema:Organization;
                schema:name "Trade union";
                schema:circa [
                                a schema:Role;
                                schema:roleName schema:foundingDate;
                                schema:circa :A3;
                ].

# A3 (common and History use cases)
:A3
                a schema:Event;
                schema:name "18th century in Great Britain";
                schema:sameAs entity:Q6418949;
                .

# A4 (archeological use case)
:A4
                a schema:Painting;
                schema:name "Rock art, The Great Fishing God of Sefar";
                schema:circa [
                                a schema:Role;
                                # "/archeological" is informal notation for a user specialization
                                schema:roleName schema:dateCreated/archeological;
                                schema:circa "5th millennium BC";
                ];
                rdfs:subClassOf entity:Q1211146;
                .

# A5 (archeological use case)
:A5
                a schema:Organization;  # ? civilization, old city?
                schema:name "Natufian culture";
                schema:circa [
                                a schema:Role;
                                schema:roleName schema:Duration/archeological;
                                # from 13,000 to 9,800 B.C., diff=3200=3.2k
                                schema:circa "3.2kyr";
                ];
                schema:sameAs entity:Q733489;
                .
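
For readers who publish JSON-LD rather than Turtle, the A0 use case could be sketched roughly as follows in Python. Note that `schema:circa` is only the property proposed in this thread and is not part of schema.org; the expansion of the `entity:` prefix to the Wikidata entity namespace is also an assumption, since the thread does not define it.

```python
import json


def circa_role(approximated_property: str, circa_value: str) -> dict:
    """Build the Role node that carries an approximate value."""
    return {
        "@type": "schema:Role",
        # Which existing property the approximation stands in for.
        "schema:roleName": {"@id": approximated_property},
        # Proposed property from this thread, not a schema.org term.
        "schema:circa": circa_value,
    }


painting = {
    "@context": {
        "schema": "http://schema.org/",
        # Assumption: the thread's "entity:" prefix refers to Wikidata.
        "entity": "http://www.wikidata.org/entity/",
    },
    "@type": "schema:Painting",
    "schema:name": "Saint Praxedis",
    "schema:circa": circa_role("schema:dateCreated", "1655"),
    "schema:sameAs": {"@id": "entity:Q94006"},
}

document = json.dumps(painting, indent=2)
```

Writing the Role through a helper keeps the nesting (`circa` on the thing, `circa` again inside the Role) explicit, which is the part of the proposal most likely to confuse publishers.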

Please EDIT HERE to add/change/etc. the use cases.

Version history:

dr-shorthair commented 9 years ago

A4 and A5 are not 'geological', rather 'archeological'.

ppKrauss commented 9 years ago

Thanks @dr-shorthair , corrected to "archeological" (!).

ppKrauss commented 9 years ago

Also corrected in the v1.1.1 proposal, adding user specializations, schema:dateCreated/archeological and schema:Duration/archeological... but the suggestion here (?!) is to unify geo, arch, etc. (scientific areas) into a single specialized type for "long long time" representation. So the correct course is to choose new types here.

Less informal suggestion for a v1.2, a specialization proposal (see A4 and A5 use cases):


Date/Archeological is not a specialization of Date (!), because specialized Date values must be a subset of Date values. As Date values must be "in ISO 8601 date format", the geological and archeological times really need a new datatype... See schemaorg/schemaorg#371 suggestion.

ppKrauss commented 9 years ago

Can we use this proposal? http://wiki.goodrelations-vocabulary.org/Documentation/Quantitative_values

mfhepp commented 9 years ago

Off the top of my head, I would not suggest using schema:QuantitativeValue for this. Instead, let's fix the general handling of date information, including recurring dates, and maybe implement a similar value/minValue/maxValue pattern there instead of extending the range of value/minValue/maxValue to non-numeric datatypes.
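
One way to picture a value/minValue/maxValue pattern applied to dates is the following Python sketch. The class `DateValue` and its field names are hypothetical and only mirror the QuantitativeValue naming; nothing like it exists in schema.org. Python's `datetime.date` only covers years 1 to 9999, so this particular sketch cannot express BCE dates such as the Socrates example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DateValue:
    """Hypothetical date counterpart of value/minValue/maxValue."""

    value: Optional[date] = None      # best single estimate, if any
    min_value: Optional[date] = None  # earliest plausible date
    max_value: Optional[date] = None  # latest plausible date

    def could_be(self, candidate: date) -> bool:
        """True if candidate is consistent with what is recorded."""
        if self.min_value is None and self.max_value is None:
            # No bounds recorded: only the exact value can match.
            return self.value == candidate
        lower_ok = self.min_value is None or self.min_value <= candidate
        upper_ok = self.max_value is None or candidate <= self.max_value
        return lower_ok and upper_ok


# "The cornerstone was laid in 1922": known only to the year.
cornerstone = DateValue(min_value=date(1922, 1, 1), max_value=date(1922, 12, 31))
# "Dedicated on November 11, 1949": known exactly.
dedication = DateValue(value=date(1949, 11, 11))
```

Exact dates and vague ones share a single shape here, which is the appeal of the pattern; the cost is that every consumer has to handle the bounded case.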

dr-shorthair commented 9 years ago

Need to be careful not to confuse the question of handling temporal position (date/time values) with the largely separate issue of temporal topology (intervals, recurrence). There are different kinds of uncertainty for each of these. Allen's interval algebra proposes that instants are just zero-length intervals, and OWL-Time provides a basic RDF-compatible treatment. Inside OWL-Time there is a requirement for denoting temporal position, and OWL-Time takes the obvious option of re-using the XSD temporal primitives, but therefore strictly does not handle non-Gregorian date/time values - e.g. archeological and geological values. I have a paper in press at Semantic Web Journal [1] that proposes a small refactoring of OWL-Time to resolve this, and it is likely that OWL-Time will be revised in the context of the W3C/OGC Spatial Data on the Web working group.

[1] http://www.semantic-web-journal.net/content/time-ontology-extended-non-gregorian-calendar-applications-0
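
To illustrate the topology side of this distinction, here is a short Python sketch that names the Allen relation between two intervals given as numeric (start, end) pairs. It is a simplification: it only accepts proper intervals with start < end, so it sidesteps the question of how instants should be treated, and the function name and relation labels are choices made for this example rather than OWL-Time terms.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end), with start < end


def allen_relation(a: Interval, b: Interval) -> str:
    """Name the Allen relation that holds from interval a to interval b."""
    a1, a2 = a
    b1, b2 = b
    if not (a1 < a2 and b1 < b2):
        raise ValueError("both intervals must have start < end")
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    # From here on the two intervals share more than a single endpoint.
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"


# The AD 75 to AD 125 find from earlier in the thread, compared with a
# made-up wider span of years.
relation = allen_relation((75, 125), (1, 200))
```

None of this says anything about how the endpoints themselves are written down or how certain they are, which is the separate "temporal position" question.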

elf-pavlik commented 9 years ago

:+1:

seeAlso

ppKrauss commented 9 years ago

Regarding @elf-pavlik's useful contextualization:

... In some contexts the "circa point" can be expressed/rounded as a precise interval. So many examples and use cases can be "rounded" to a precise interval description, avoiding the fuzzy description.

jjmhtp commented 6 years ago

See also the solutions with different properties of the CIDOC Conceptual Reference Model, e.g. crm:P117_occurs_during!

chaals commented 6 years ago

@ppKrauss :

Date/Archeological is not a specialization of Date (!), because specialized Date values must be a sub-set [...] "in ISO 8601 date format", the geological and archeological times need a really new datatype... See schemaorg/schemaorg#371 suggestion.

Yup. Actually it seems more like we need a supertype for dateInformation that includes both Date as we have it, and things like "Ming Dynasty in China", "Cambrian period", "the first 24 nanoseconds after the big bang", "during the decline of the Weimar Republic" and "a few weeks before I wrote this", or "when I was young". How to express these vague dates is tricky, since we want them to be useful enough for people who work with them all the time in a formal way (archeology, history, physics, paleography, astronomy, etc) and probably normal people who are talking about things they did without wanting to name a specific date or date range as the defining temporal anchor.

thadguidry commented 6 years ago

Let me just preface my reply to all the preceding detours on this smallish issue that seems to grow as time goes on... I've done most of the research already for this issue, just for everyone's information. Time representation was a bit dear to my heart. BUT I am not an authority ! What I did find is that all authorities seem to differ on a few lower level concepts in their Time Reference Systems (TRS).

@jjmhtp Not just CIDOC, it's others as well. This is all handled at the highest level with the concept of a "Temporal Entity" https://www.wikidata.org/wiki/Q26907166

This issue of "uncertain date" or "fuzzy date" is a solved problem within other TRS's (Temporal Reference Systems). (Incidentally, I'm the one who actually did the linking of W3C Time, CIDOC's Temporal Entity, and CSIRO's Temporal Object in Wikidata for Temporal Entity.)

@chaals Those are all known in most TRS's as a "period", and a few place emphasis on them as an "era", go figure. Here's a bunch of properties already in Wikidata that deal with "period" and are already loosely connected in the graph to Temporal Entity https://www.wikidata.org/w/index.php?title=Special:Search&limit=500&offset=0&ns120=1&search=period (but I and others have many more connections to apply)

I agree with @dr-shorthair that Time in general all gets tricky sometimes because some TRS's use different terms to say the same thing and sometimes overlap. Like "period" and others sometimes say "interval"... but in a realist view those are 2 different things.

ANYWAYS. Let's handle 1 issue at a time, because there are tons of issues with Time and tons of opinions, hence the need for many different TRS's which all have different domain opinions.

For "uncertain dates", the parent class is definitely Temporal Position as @dr-shorthair states.

danbri commented 6 years ago

It appears that ISO 8601 is evolving to address many of the issues covered by the earlier EDTF Library of Congress note. There is a comparison with the (apparently nearly ratified) work at https://github.com/plk/biblatex/issues/656

danbri commented 5 years ago

Looking at this again, I would like Schema.org to tentatively endorse the pattern "1982-11/.." for meaning "// End datetime open." as seen in https://github.com/JohnLukeBentley/open-datetime-standard-bootstrap/issues/2 while we await ISO's official update to ISO 8601. ISO make it difficult for us by hiding their specs away and asking LOC to take down the EDTF note, but it seems that for the specific case of open-ended date ranges (needed e.g. for Dataset description), "1982-11/.." looks like the best current guess. This is not quite the "uncertain dates" issue but a related smaller part.
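
A consumer-side reading of the "1982-11/.." pattern might look like the following Python sketch. The function `parse_range` and the `DateRange` type are invented for this example; the sketch only splits the string and does not validate the date parts, and it rejects an empty side rather than guessing what it means.

```python
from typing import NamedTuple, Optional


class DateRange(NamedTuple):
    start: Optional[str]  # None means the start is open
    end: Optional[str]    # None means the end is open


def parse_range(text: str) -> DateRange:
    """Split a 'start/end' string, treating '..' as an open end."""
    parts = text.strip().split("/")
    if len(parts) != 2:
        raise ValueError(f"expected exactly one '/': {text!r}")
    if not all(parts):
        raise ValueError(f"empty side in range: {text!r}")
    start, end = (None if part == ".." else part for part in parts)
    return DateRange(start, end)


coverage = parse_range("1982-11/..")
```

Here `coverage.end` is `None`, so the open end stays distinguishable from a range whose end is simply missing or malformed.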

danbri commented 5 years ago

Actually I see I commented in the wrong issue, this would have been more appropriate in https://github.com/schemaorg/schemaorg/issues/1365 - I will copy my comments over there.

FabianCretton commented 4 years ago

Hi all, I came across this interesting thread recently, and I am wondering if anything has come out of the discussion or has been implemented in the current schema.org? I did not find any sub-type of schema:Date (as proposed above), for instance. Thank you for any update. Fabian

RichardWallis commented 4 years ago

This did lead to an update of the description of temporalCoverage which addressed open-ended date ranges.

FabianCretton commented 4 years ago

Thank you Richard for that clarification. So do I correctly understand that "uncertain dates", which seem to be very well analyzed in this thread, were in the end not really solved, apart from the very specific "open-ended" date range? Was the rich analysis presented here simply abandoned, or do some people still work on that? If so, it seems that wikibase [1] is the schema that has currently gone the furthest in addressing all kinds of situations. I am currently analyzing how to facilitate querying all those date representations (incomplete, uncertain, etc.) in SPARQL, hence my questions here. Thank you for any further input.

[1] https://www.mediawiki.org/wiki/Wikibase/DataModel#Dates_and_times

dr-shorthair commented 4 years ago

Note that in the time interval/period since this discussion began, the OWL-Time Ontology was revised and made into a formal W3C Recommendation. Open-ended intervals are easy - you just omit the end or beginning.

For precision: in the Geologic Timescale we solved this with an additional property :positionalUncertainty whose rdfs:range is time:Duration - e.g. http://resource.geosciml.org/classifier/ics/ischart/BaseFamennianTime http://resource.geosciml.org/classifier/ics/ischart/BaseFamennianUncertainty
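
The idea of a position plus a separate uncertainty duration can be sketched in Python as below. The class and field names are hypothetical and are not taken from the Geologic Timescale vocabulary, and the ages used are made-up round numbers, not values from the ICS chart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeologicPosition:
    """A boundary age with a symmetric uncertainty, both in millions of years."""

    age_ma: float          # nominal position, millions of years ago
    uncertainty_ma: float  # plus-or-minus duration around that position

    def bounds(self) -> tuple:
        """Return the (oldest, youngest) plausible ages in Ma."""
        return (self.age_ma + self.uncertainty_ma,
                self.age_ma - self.uncertainty_ma)

    def may_coincide_with(self, other: "GeologicPosition") -> bool:
        """True if the two uncertainty windows overlap."""
        oldest, youngest = self.bounds()
        other_oldest, other_youngest = other.bounds()
        return youngest <= other_oldest and other_youngest <= oldest


# Illustrative placeholder values only.
first = GeologicPosition(age_ma=100.0, uncertainty_ma=2.0)
second = GeologicPosition(age_ma=103.0, uncertainty_ma=1.5)
third = GeologicPosition(age_ma=110.0, uncertainty_ma=1.0)
```

Keeping the uncertainty as its own value, rather than folding it into the date string, lets consumers that do not care about it read `age_ma` directly.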

github-actions[bot] commented 3 years ago

This issue is being tagged as Stale due to inactivity.

danbri commented 3 years ago

No driving use case to guide us, so moving to the brainstorming repo.

RichardWallis commented 3 years ago

See issue #7 for the context of the move from the main Schema.org issue tracker to this repository.