Date fields are currently not very well explained in the documentation. Not sure where this should go exactly; probably the object model documentation rather than the Getting started guide?
Rough draft:
Date fields
The nature of the data in the collections makes handling dates especially challenging. Dates may be completely unknown, only partially known, or otherwise fuzzy. Date fields often refer to a date range rather than a single day. The exact notation used in some (date-describing) text fields varies, and human error introduces mistakes.
Date fields in the Collections API are exposed in various ways:
Verbatim dates
Full details in facetable date fields
Reduced details in nested fields
Verbatim date fields
Verbatim date fields usually contain the data as extracted from the collection management system. Good examples are verbatimBirthDate and verbatimDeathDate:
"verbatimBirthDate": "11 June 1865",
"verbatimDeathDate": "13 April 1935",
These are often of good quality and accompanied by "parsed" date fields containing an ISO date string:
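As a minimal sketch of how a verbatim value relates to an ISO date string, the following Python parses the "day month year" notation used in the example above. The helper name verbatim_to_iso is hypothetical, and the exact format the API uses for its parsed fields is an assumption; this only illustrates the idea and does not describe how the API derives them.

```python
from datetime import datetime
from typing import Optional


def verbatim_to_iso(verbatim: str) -> Optional[str]:
    """Return an ISO 8601 date for a 'day month year' string, or None.

    Only the full-date notation from the example is handled; partial or
    fuzzy values (for instance just a year) are deliberately not guessed.
    """
    try:
        # %B matches the full English month name under the default C locale.
        parsed = datetime.strptime(verbatim.strip(), "%d %B %Y")
    except ValueError:
        return None
    return parsed.date().isoformat()


record = {
    "verbatimBirthDate": "11 June 1865",
    "verbatimDeathDate": "13 April 1935",
}
print(verbatim_to_iso(record["verbatimBirthDate"]))  # 1865-06-11
print(verbatim_to_iso(record["verbatimDeathDate"]))  # 1935-04-13
print(verbatim_to_iso("1906"))                       # None
```

Returning None for anything that is not a full date keeps the sketch from inventing precision that the verbatim value does not have.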
Facetable date fields
Facetable date fields are primarily designed to be used as facets. A facetable date field contains sub-fields describing aspects of the date, such as century, dayOfWeek, etc. Sometimes these values are labelled Unknown or are just approximations. Facetable dates are usually accompanied by a verbatim date field.
Facetable date fields are still experimental, and you should not rely on them to represent the "truth" about a date. Use the verbatim date fields if you need to convey the true date to a user; use the facetable date fields to assist in data analysis or approximate categorisation.
An example is the production.facetCreatedDate field:
Here the original value from the collection management system is verbatimCreatedDate: 1906. The createdDate field is an ISO date approximation of that value. The approximation is not always good, especially when the original data lacks precision. The facetCreatedDate field contains facetable values for that date, along with date range approximations:
century - The century that the original date falls into, here 20th century.
dayOfWeek - The day of the week. Use with caution, as this is currently derived from the temporal representation, which does not always have the correct precision level. In this case a more accurate label for the day of the week would have been Unknown.
decadeOfCentury - The decade of the century, e.g. 1900s.
era - The era, either Common Era (CE) or Before Common Era (BCE).
monthOfYear - The month of the year. Use with caution, as this is currently derived from the temporal representation. See dayOfWeek.
temporal - Usually equal to the "parsed" version of a date, e.g. birthDate. Use with caution.
verbatim - Unlike the verbatim date fields described above, this is actually a combination of two date fields that represent a date range: the "earliest" and "latest" date. In our example, the production date was determined to be between 01 Jan 1906 and 31 Dec 1906.
year - Approximation of the year.
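The advice above (show the verbatim value to users, use facet sub-fields only for grouping) can be sketched in Python as follows. The record fragment is hypothetical: the nesting and the exact value formats are assumptions, and only the field names and labels are taken from the description above. The helpers display_date and facet_bucket are illustrative names, not part of the API.

```python
from typing import Optional

# Hypothetical record fragment for the 1906 example. The nesting and the
# exact value formats are assumptions; only the labels come from the text.
production = {
    "verbatimCreatedDate": "1906",
    "facetCreatedDate": {
        "century": "20th century",
        "decadeOfCentury": "1900s",
        "era": "Common Era (CE)",
        # Set to the label the text says would be more accurate here.
        "dayOfWeek": "Unknown",
    },
}


def display_date(prod: dict) -> Optional[str]:
    """Value to show to a user: always the verbatim field."""
    return prod.get("verbatimCreatedDate")


def facet_bucket(prod: dict, sub_field: str) -> Optional[str]:
    """Value to group by, or None when missing or labelled Unknown."""
    value = prod.get("facetCreatedDate", {}).get(sub_field)
    if value is None or value == "Unknown":
        return None
    return value


print(display_date(production))               # 1906
print(facet_bucket(production, "century"))    # 20th century
print(facet_bucket(production, "dayOfWeek"))  # None
```

Treating Unknown the same as a missing sub-field means approximate categorisation never has to special-case the label.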
Note that interpreting dates is a fairly complex problem. We do not claim to return entirely reliable values for the facetable date fields within the Collections API, but hope to offer a useful (and machine-readable) addition to the regular date fields. By making date aspects facetable, we hope to assist users in exploring the wealth of data.
Through the advanced search interface you can ask for all available facets on a date:
This returns a list of all facetable sub-fields of that date, in the result set:
Nested data fields
Nested objects expose a lower level of detail. For dates, this usually shows as the omission of facetable date fields. You can usually retrieve those by requesting the root-level entity directly.