msDesc / consolidated-tei-schema

TEI Manuscript Description ODD Customisation
https://raw.githubusercontent.com/msdesc/consolidated-tei-schema/master/msdesc.rng
BSD 2-Clause "Simplified" License

Improving use of TEI dating attributes #63

Open adunning opened 1 year ago

adunning commented 1 year ago

At present the schema requires @when, @notBefore, or @notAfter on most dating elements. These are part of att.datable.w3c, i.e. using the W3C XML Schema datatype, which is based on ISO 8601 but with some important differences.

I've encountered a number of problems with our approach:

In the ISO 8601 specification, one can write @when-iso="18" to represent the 19th century (a date between 1800 and 1899), or 196 to indicate the 1960s. It also allows intervals such as 2020/2022 for 2020–22 or 181/185 for the 1810s × 1850s.
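As a rough illustration of the reduced-precision values mentioned above, the following Python sketch expands a two-, three- or four-digit year value, or a `start/end` pair of them, into the earliest and latest years it can cover. The function names are made up for this example, and it assumes positive years without months or days; it is not a full ISO 8601 parser.

```python
from __future__ import annotations


def expand_year_value(value: str) -> tuple[int, int]:
    """Return (first_year, last_year) for a 2-, 3- or 4-digit year value."""
    if not (value.isascii() and value.isdigit() and 2 <= len(value) <= 4):
        raise ValueError(f"unsupported year value: {value!r}")
    # 100 years for a century ("18"), 10 for a decade ("196"), 1 for a year.
    span = 10 ** (4 - len(value))
    first = int(value) * span
    return first, first + span - 1


def expand_interval(value: str) -> tuple[int, int]:
    """Expand 'start' or 'start/end' to the outermost years it can cover."""
    start, sep, end = value.partition("/")
    first, last = expand_year_value(start)
    if sep:
        # For an interval, the latest year comes from the end value.
        last = expand_year_value(end)[1]
    return first, last


for example in ("18", "196", "2020/2022", "181/185"):
    print(example, expand_interval(example))
```

Under these assumptions, `18` covers 1800 to 1899 and `181/185` covers 1810 to 1859, matching the readings given in the comment.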

ISO 8601-2:2019 helps further with extensions to improve the syntax of imprecise dates, allowing for example 18XX for an unspecified point in the 19th century. It also gives machine-readable equivalents to 'circa' etc, based on the Extended Date/Time Format (EDTF) Specification. TEI have planned support for this in @when-iso.
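To make the `18XX` notation concrete, here is a minimal Python sketch that turns a four-character year with unspecified trailing digits into a range of years. It assumes only the simplest case: `X` characters on the right of a positive four-digit year. The qualifiers for 'circa' and the other ISO 8601-2 extensions are not handled, and the function name is hypothetical.

```python
import re


def expand_unspecified_year(value):
    """Return (first_year, last_year) for values such as '18XX' or '196X'."""
    # Assumption: digits first, then zero to three 'X' placeholders, 4 chars total.
    if len(value) != 4 or not re.fullmatch(r"[0-9]{1,4}X{0,3}", value):
        raise ValueError(f"unsupported value: {value!r}")
    # The earliest year fills every X with 0, the latest fills every X with 9.
    return int(value.replace("X", "0")), int(value.replace("X", "9"))


for example in ("18XX", "196X", "1985"):
    print(example, expand_unspecified_year(example))
```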

The radical solution would be to normalize all date attributes to @when-iso: it would be easier to write, more precise, and more interoperable (e.g. by avoiding the problem of different definitions of centuries). Whether that is realistic is another question, and we would need to develop guidance on translating parts of centuries into ISO notation; I noted some of these problems in https://github.com/bodleian/medieval-mss/issues/623.

Note that ISO 8601-1:2019+A1:2022 and ISO 8601-2:2019 are available through British Standards Online (requires the university VPN).

holfordm commented 1 year ago

My understanding is that @from and @to are used for continuous periods, as in the example `<date from="1863-05-28" to="1863-06-01">28 May through 1 June 1863</date>` [⚓︎](https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.datable.w3c.html#index-egXML-d54e5495) and also `<p>Those five years — <date from="1918" to="1923">1918 to 1923</date> — had been, he suspected, somehow very important.</p>` (https://tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#CONADA). So it would only be correct to use those attributes where it was known that a manuscript had been written over a continuous period. It is correct (again as I understand it) to use @notBefore and @notAfter for the earliest and latest possible dates of a discrete event, e.g. writing a manuscript, whether those are known exactly (1457 x 1460) or not (15th century); our guidelines recommend using the @cert attribute to distinguish the former cases.
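The distinction drawn here can be sketched in Python: a continuous period gets @from/@to, while a discrete event with earliest and latest possible dates gets @notBefore/@notAfter. The helper `date_element` and its `continuous` flag are invented for this illustration and are not part of the schema or any tooling; @cert is left out.

```python
import xml.etree.ElementTree as ET


def date_element(text, earliest, latest, continuous):
    """Build a <date> element, choosing attributes by the kind of dating."""
    if continuous:
        # Something known to have lasted across the whole period.
        attrs = {"from": earliest, "to": latest}
    else:
        # A discrete event that happened at some point within the bounds.
        attrs = {"notBefore": earliest, "notAfter": latest}
    element = ET.Element("date", attrs)
    element.text = text
    return element


period = date_element("28 May through 1 June 1863", "1863-05-28", "1863-06-01", continuous=True)
event = date_element("1457 x 1460", "1457", "1460", continuous=False)
print(ET.tostring(period, encoding="unicode"))
print(ET.tostring(event, encoding="unicode"))
```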

holfordm commented 1 year ago

(It is easy to confuse the values of @notBefore and @notAfter, but there is a Schematron rule in place to flag when this occurs.)
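The kind of check described can be approximated in Python as follows. This is not the project's Schematron rule, only a sketch of the same idea: it flags any element whose @notBefore is later than its @notAfter. It assumes positive years in `YYYY`, `YYYY-MM` or `YYYY-MM-DD` form, and the sample fragment and function names are made up.

```python
import xml.etree.ElementTree as ET


def _parts(value):
    """Split 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into a tuple of integers."""
    return tuple(int(part) for part in value.split("-"))


def swapped_dates(xml_text):
    """Return (element name, notBefore, notAfter) where the bounds are reversed."""
    flagged = []
    for element in ET.fromstring(xml_text).iter():
        not_before, not_after = element.get("notBefore"), element.get("notAfter")
        if not_before is None or not_after is None:
            continue
        before, after = _parts(not_before), _parts(not_after)
        # Compare only at the precision both values share (e.g. year vs year).
        shared = min(len(before), len(after))
        if before[:shared] > after[:shared]:
            name = element.tag.rsplit("}", 1)[-1]  # drop the namespace prefix
            flagged.append((name, not_before, not_after))
    return flagged


sample = """<origin xmlns="http://www.tei-c.org/ns/1.0">
  <origDate notBefore="1457" notAfter="1460">1457 x 1460</origDate>
  <origDate notBefore="1460" notAfter="1457">1457 x 1460</origDate>
</origin>"""

print(swapped_dates(sample))
```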

adunning commented 1 year ago

That is my understanding as well: which I think means that we should technically be using from/to on <provenance> in most cases? Whether it is a helpful distinction or not is another question!

holfordm commented 1 year ago

Indeed, but even with provenance, @from and @to would only be appropriate where the exact duration of a provenance event is known. That is what our guidelines recommend (https://msdesc.github.io/consolidated-tei-schema/msdesc.html#provenance), and it has been adopted in many cases. Incorrect uses of @notAfter and @notBefore could certainly be corrected, although I would agree that this wouldn't be a high priority at the moment.