Closed tombaker closed 6 years ago
My comments:
See note_date.md
The comment for http://purl.org/dc/terms/date currently reads:
Date may be used to express temporal information at any level of
granularity. Recommended best practice is to use an encoding scheme, such
as the W3CDTF profile of ISO 8601 [W3CDTF].
Proposed wording:
Date may be used to express temporal information at any level of
granularity. Recommended practice is to express the date, time, or period
according to the YYYY-MM-DD format specified in ISO 8601.
If the full date is unknown, month and year (YYYY-MM) or just year (YYYY)
may be used.
Date ranges may be specified using ISO 8601 period of time specification in
which start and end dates are separated by a “/” (slash) character. Either
the start or end date may be missing.
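To illustrate the proposed wording, here is a minimal sketch of a check that accepts YYYY, YYYY-MM, YYYY-MM-DD and slash-separated ranges with an optional start or end. The function names are hypothetical and not part of any DCMI or ISO tooling. It deliberately ignores timestamps and the rest of ISO 8601.

```python
import re
from datetime import date

# Hypothetical helper: covers only YYYY, YYYY-MM and YYYY-MM-DD.
_DATE_RE = re.compile(r"([0-9]{4})(?:-([0-9]{2})(?:-([0-9]{2}))?)?")


def is_valid_date(value: str) -> bool:
    """Return True for YYYY, YYYY-MM or YYYY-MM-DD naming a real calendar date."""
    match = _DATE_RE.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # A missing month or day defaults to 1, so only the given parts are checked.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


def is_valid_date_or_range(value: str) -> bool:
    """Accept a single date, or start/end where either side may be missing."""
    if "/" not in value:
        return is_valid_date(value)
    start, _, end = value.partition("/")
    if "/" in end or (not start and not end):
        return False
    # Validate only the sides that are present.
    return all(is_valid_date(part) for part in (start, end) if part)
```

With this sketch, `1968/2015` and `2006/` are accepted, while a hyphenated range such as `1945-2018` is rejected.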
@jneubert In #15, were you suggesting that this extended usage guidance seems to rule out other perfectly good practices, such as the use of timestamps?
The suggested definition would restrict the current definition: Whereas W3CDTF allows date and timestamp values ("at any level of granularity"), timestamps would be precluded by the suggested wording.
Well, I'm a sloppy reader: The wording includes a reference to "time", but the examples don't. So perhaps add an example of a date value, too?
May we extend the definition text to (see revised suggestion in the next comment):
Date may be used to express temporal information at any level of
granularity. Recommended practice is to express the date, time, or period
according to the YYYY-MM-DD format specified in ISO 8601 [W3CDTF],
which may include a formal time indication.
I think it makes sense to introduce "the YYYY-MM-DD format" and also the more coarse-grained month and year values, because it spares readers who are only interested in these values from checking the ISO standard. But it may be misleading in that, at first glance, it seems not to include times (which seems to match the property name). The wording "formal time indication" suggested here may be taken as a hint to look into ISO 8601 for those who want to add timestamps.
I'd keep a link to [W3CDTF], because the ISO 8601 text is not freely available.
The introduction of [W3CDTF] mentions that
ISO 8601 describes a large number of date/time formats. For example it defines Basic Format, without punctuation, and Extended Format, with punctuation, and it allows elements to be omitted. This profile defines a restricted range of formats, all of which are valid ISO 8601 dates and times. The aim is to simplify the use of ISO 8601 in World Wide Web-related standards, and to avoid the need for the developers and users of these standards to obtain copies of ISO 8601 itself.
https://en.wikipedia.org/wiki/ISO_8601 gives lots of examples of values and required parsing rules we would not want to impose on DCT users.
Therefore, I'd suggest a wording like
Date may be used to express temporal information at any level of
granularity. Recommended practice is to express the date, time, or period
according to the YYYY-MM-DD format specified in ISO 8601 and narrowed
down in [W3CDTF], which may include a formal time indication.
I think the current definition is fine. Additional information should be included in a guidance document.
I am not sure why you wouldn't include the time in the comment. In many cases that I have seen, particularly in Datasets, time is a very important aspect of publication and coverage. It could maybe say:
Recommended practice is to express the date, time, or period according to the YYYY-MM-DDThh:mm:ss.sTZD format specified in ISO 8601 and narrowed down in [W3CDTF].
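As a rough illustration of what the W3CDTF profile permits, the sketch below matches the six granularities listed in the W3C note (year, year-month, full date, and a date with hh:mm, hh:mm:ss or hh:mm:ss.s plus a mandatory time zone designator). The names `W3CDTF_RE` and `w3cdtf_granularity` are made up for this example, and the check is purely syntactic: it does not verify that month, day or hour values are in range.

```python
import re

# Syntactic pattern for the W3CDTF granularities; TZD is "Z" or +hh:mm / -hh:mm.
W3CDTF_RE = re.compile(
    r"(?P<year>[0-9]{4})"
    r"(?:-(?P<month>[0-9]{2})"
    r"(?:-(?P<day>[0-9]{2})"
    r"(?:T(?P<hour>[0-9]{2}):(?P<minute>[0-9]{2})"
    r"(?::(?P<second>[0-9]{2})(?:\.(?P<fraction>[0-9]+))?)?"
    r"(?P<tzd>Z|[+-][0-9]{2}:[0-9]{2})"
    r")?)?)?"
)


def w3cdtf_granularity(value):
    """Return the finest component present, or None if the value does not match."""
    match = W3CDTF_RE.fullmatch(value)
    if match is None:
        return None
    # Check components from finest to coarsest.
    for name in ("fraction", "second", "minute", "day", "month"):
        if match.group(name) is not None:
            return name
    return "year"
```

Note that a timestamp without a time zone designator, such as `2017-11-05T08:15:30`, does not match, and neither does a range such as `1968/2015`.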
@jneubert @makxdekkers @kcoyle I believe the ISO WG wanted to ensure that ranges were explicitly covered. W3CDTF covers timestamps but not date ranges.
Of the proposals above, I think @makxdekkers goes in the right direction:
Recommended practice is to express the date, time, or period according to the
YYYY-MM-DDThh:mm:ss.sTZD format specified in ISO 8601 and narrowed
down in [W3CDTF].
However, it does not show an example of ranges, which seems like an opportunity missed because
they can be explained and illustrated so concisely with two examples (1968/2015
and `2006/` - see note_date.md) in a situation where ISO 8601 is not readily available to be consulted.
I interpret "Date may be used to express temporal information at any level of granularity." to include timestamps as well, as fine-granular dates. However, I agree that this should be made more explicit by including time specifically and also giving an example of how to represent a timestamp. And then, consequently, we should also have examples for ranges (with dates and timestamps?).
From the DCUB mailing list in an email by Juha Hakala:
Currently Date specification references W3C note (http://www.w3.org/TR/NOTE-datetime) which is based on ISO 8601:1988. This first edition of ISO 8601 allowed the century to be omitted from years (encoding YY-MM-DD), and the W3C profile forbids that and gives examples of how date and time are encoded with century. However, in the current version of the ISO standard (ISO 8601:2004) year is always presented as YYYY so the main reason to reference the W3C note has disappeared. On the other hand, the W3C profile has become a Procrustean bed since ISO 8601:2004 is an extended version of the first edition. These extensions are not included in the W3C profile, so it limits the scope of the ISO standard a lot. The current standard allows for instance date ranges and uncertain or unknown dates, which are often needed in cataloguing. Therefore these ISO 8601 features should be explicitly allowed in Dublin Core Date as well.
Alas, DCMI has its own, non ISO 8601 -compliant solution for date ranges. DCMI Period encoding scheme (http://www.dublincore.org/documents/dcmi-period/) was published in 2006, but AFAIK it has not been popular. The recommended encoding
name=The Great Depression; start=1929; end=1939;
name=Perth International Arts Festival, 2000; start=2000-01-26; end=2000-02-20;
uses DCSV syntax, which is not widely used or supported, and data in this form might not be easy to convert to and from other metadata formats. From an interoperability point of view it would be better to recommend the use of ISO 8601 for temporal coverage as well. Currently Coverage does not mention ISO 8601 at all, so from an ISO 8601 point of view Date is too strict and Coverage too liberal.
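For the simple cases quoted above, converting a DCMI Period value into an ISO 8601 style start/end interval can be sketched as follows. The function name is hypothetical. The sketch assumes that the start and end values are already ISO 8601 dates, drops the name component, and ignores DCSV escaping, so it is not a full DCSV parser.

```python
def dcmi_period_to_interval(period: str) -> str:
    """Turn 'name=...; start=...; end=...;' into 'start/end' (simplified)."""
    fields = {}
    for component in period.split(";"):
        key, sep, value = component.partition("=")
        if sep:  # skip empty trailing components
            fields[key.strip()] = value.strip()
    start = fields.get("start", "")
    end = fields.get("end", "")
    if not start and not end:
        raise ValueError("period has neither start nor end")
    # An open-ended period yields 'start/' or '/end'.
    return f"{start}/{end}"
```

Applied to the two examples, this yields `1929/1939` and `2000-01-26/2000-02-20`.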
Note that ISO is developing a new and extended version of ISO 8601, which is again a major extension to the current standard. Metadata standards using ISO 8601 may need to provide guidelines on how to provide date and time information. A simple solution is to refer just the first part of the future standard, since it covers basic features. Extensions in ISO 8601-2 are probably too exotic for most Dublin Core users.
As an aside, DC Usage guide (http://www.dublincore.org/documents/usageguide/elements/) does not follow ISO 8601 to present date ranges in all examples. 1995-1996 is OK, but example:
Coverage="17th century"
may look OK for Anglo-American users, but is not compliant with ISO 8601, and there are still a lot of people (and applications) which would not understand such data.
I'd keep a link to [W3CDTF], because the ISO 8601 text is not freely available.
This would prevent DC from referring any ISO standard :-(.
The problem with the W3C profile is that it is based on an ancient version of ISO 8601. And it covers only a very small part of the standard. As you can see from
https://en.wikipedia.org/wiki/ISO_8601
there are a lot of features which can be used to describe dates, times and date ranges. All the functionality in ISO 8601 may be too much for a single user, but collectively the needs may even surpass the standard. And if UB wants to specify limits to ISO 8601 usage, the challenge is to agree on features which are relevant. I guess each UB member might produce a different list. Be that as it may, I am pretty sure that there is a need to express ranges and uncertain / unknown dates on DC Date, and currently the standard does not explicitly allow this.
Please note that ISO 8601 ranges are different from what we are used to: 1945/2018 instead of more common 1945-2018. This is a challenge with no simple solution.
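The general problem has no simple solution, but the narrow year-to-year case can at least be illustrated. The sketch below, with a hypothetical function name, rewrites only values of the form YYYY-YYYY, which cannot be confused with YYYY-MM because the second part has four digits. Anything else is left for manual review.

```python
import re

_YEAR_RANGE_RE = re.compile(r"([0-9]{4})\s*-\s*([0-9]{4})")


def hyphen_year_range_to_iso(value):
    """Return 'YYYY/YYYY' for a hyphenated year range, else None."""
    match = _YEAR_RANGE_RE.fullmatch(value.strip())
    if match is None:
        return None
    start, end = match.groups()
    if int(start) > int(end):
        # Reversed ranges are suspicious; do not guess.
        return None
    return f"{start}/{end}"
```

For example, `1945-2018` becomes `1945/2018`, while `2016-03` is not touched because it is a year and month rather than a range.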
Anyway: I believe that Date is a good example of an element where the DCMI definition should be made more liberal, since cataloguers have not limited themselves to the W3C profile. And with temporal coverage it might be a good idea to go in the opposite direction and make the term definition more precise. It is a bit hard to justify why the guidelines on how to provide date information vary so much from one DC element to another.
As an aside, some of the ideas behind the new ISO 8601 as explained here
https://www.iso.org/news/2017/02/Ref2164.html
by the former chair of the ISO WG.
@juhahakala Thank you very much for your rich and thoughtful contributions. My suggestion would be to discuss the reference to ISO 8601 for dct:date first.
A stricter definition for dct:coverage would, in my eyes, deserve a separate issue. I'm not sure however if we should touch it more than formally (as done in #9) - the data will most probably stay a LocationPeriodOrJurisdiction hodgepodge.
To keep track of your valuable hint about the DCMI Period encoding scheme, I'll create a separate issue. I suppose a solution to that issue can be postponed.
@ all members of DC UB: Tom has added a directory (password-protected) with the drafts for ISO 8601-1 and 8601-2 (as of 2016-12-30, still work in progress), and the current standard ISO 8601:2004.
Currently, the definition for dct:date is very open:
A point or period of time associated with an event in the lifecycle of the resource.
At least formally, that is only slightly narrowed down in the comment:
Date may be used to express temporal information at any level of granularity. Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601 [W3CDTF].
With the wording "such as", any encoding scheme complies with that recommendation (any other subset of ISO 8601 in whatever version, or even something completely different, e.g. an exotic set of cataloging rules). Of course this is also true for the upcoming full ISO 8601-1/2.
So I think the question is: How can we make the recommendation most helpful for most of the people who are seeking advice on how to encode their data (producers)? And how can we make it most useful for those who want to get an idea about the shape of the data they are going for (consumers)?
For me, that leads to two formal requirements:
The contents of the recommendation must be short and concise (otherwise it will not be used by most of the people with "simple and straightforward" questions).
The contents of the recommendation must be available online and open (otherwise it can be used only by very few).
In my eyes, that rules out a plain reference to ISO 8601. W3CDTF was a great workaround, covering the most relevant cases, with some formal standing (everybody could trust that it's a valid subset of ISO 8601), and open.
Now, as Juha points out, ISO has evolved, and I tend to agree with his point that ranges and uncertain/unknown values are so common that they should be explicitly covered even in a brief recommendation. But what could we use as a replacement for W3CDTF?
For anything beyond the recommendation, we should refer to ISO 8601-1/2, and of course double-check that the recommendation is in line with its latest version. (BTW - @juhahakala I could not find uncertain/unknown values in ISO 8601:2004. If these were still under discussion, that would make it even more difficult).
@ all members of DC UB: Tom has added a directory (password-protected) with the drafts for ISO 8601-1 and 8601-2 (as of 2016-12-30, still work in progress), and the current standard ISO 8601:2004.
I added a `WARNING.md` file in that directory to say that the ISO drafts are not to be shared with anyone.
Since ISO 8601:2004 did not meet the requirements of MARC cataloguing, the Library of Congress created the Extended Date/Time Format, EDTF (http://www.loc.gov/standards/datetime/pre-submission.html), which enriched ISO 8601 with properties libraries needed.
As of this writing it is likely that most if not all the additional features in the EDTF profile will be included in the next version of ISO 8601. But for the time being the only way to explicitly allow for instance uncertain dates is to refer to the EDTF specification.
EDTF and W3CDTF are compatible, although the former is much richer. So both profiles could be mentioned in order to cover all date/time related needs DC users may have.
I agree with Makx that precise encoding of time has become more important due to e.g. description of research data sets, so either the format or user guide (or even both) should include appropriate examples.
@juhahakala 's suggestion
EDTF and W3CDTF are compatible, although the former is much richer. So both profiles could be mentioned in order to cover all date/time related needs DC users may have.
sounds like the way to go. So the wording in the comment could recommend the use of "a published profile of ISO 8601", while the examples section could link to W3CDTF and EDTF as examples for such profiles (the latter perhaps with a hint about the preliminary status in regard to the ISO 8601 process) - besides literal examples for the most relevant cases, of course.
The examples section can be changed more easily later on, without touching the normative sections. E.g., when the new version of ISO 8601 is published and (hopefully) the EDTF profile is updated in accordance to it, the EDTF link in the examples section can be updated, too.
After the publication of ISO 8601-2, we could perhaps add a sentence referencing Annex B (in the draft), that every community is entitled to publish some profile(s) of ISO 8601, without formal approval.
More formal suggestion, based on the discussion above:
Date may be used to express temporal information at any level of granularity.
Recommended practice is to express the date, time, or period
according to a published profile of ISO 8601.
Examples would be two-fold, and link to further guidance:
EXAMPLES
VALUES:
2018
2016-03-04
2017-11-05T08:15:30-05:00
1968/2015
2006/
PUBLISHED PROFILES:
[W3CDTF](https://www.w3.org/TR/NOTE-datetime) W3C Note on Date and Time Formats
[EDTF](http://www.loc.gov/standards/datetime/pre-submission.html) Extended Date/Time Format
REMARKS
- The extended features of EDTF 1.0, such as uncertain and approximate values, have
been incorporated, with some syntactic changes, into the ISO-8601-2 draft.
- The publication of an ISO 8601 profile is not subject to any formal requirements.
- See [ISO 8601 on Wikipedia](https://en.wikipedia.org/wiki/ISO_8601) for more information.
Are there frequent cases which should be covered by the values examples also? In particular, should we add uncertain (`1985-04-12?`) and approximate (`1985-04-12~`) date examples, which are not codified by ISO yet?
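To show how such values could be handled by a consumer, here is a minimal sketch that separates a trailing `?` or `~` from the date itself. The function name is hypothetical, and the sketch covers only the two single-character qualifiers mentioned here, not the full EDTF syntax.

```python
def split_edtf_qualifier(value: str):
    """Return (date_part, qualifier) where qualifier is a label or None."""
    qualifiers = {"?": "uncertain", "~": "approximate"}
    if value and value[-1] in qualifiers:
        return value[:-1], qualifiers[value[-1]]
    return value, None
```

The date part returned by this helper can then be checked like any unqualified date.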
I've edited the comment above to turn it into a formal suggestion for voting, according to today's telecon.
I generally approve the substance of the proposal above, but it is unclear to me whether the REMARKS section is intended to be part of the ISO standard. If so, the text should not say too much about the ISO-8601-2 draft, or about ISO 8601 generally, as the text will quickly go out of date.
The comment for http://purl.org/dc/terms/date currently reads:
Date may be used to express temporal information at any level of
granularity. Recommended best practice is to use an encoding scheme, such
as the W3CDTF profile of ISO 8601 [W3CDTF].
It would be less verbose if the citations were folded back into the text proposed by Joachim:
Date may be used to express temporal information at any level of
granularity. Recommended practice is to express the date, time, or period
according to a published profile of ISO 8601 [ISO 8601], such as the W3C Note on Date
and Time Formats [W3CDTF] or the (proposed) Extended Date/Time Format
[EDTF].
EXAMPLES: 2018
2016-03-04
2017-11-05T08:15:30-05:00
1968/2015
2006/
Where the document would point to:
Note that this more concise version would omit:
@juhahakala How should we cite ISO 8601 in our footnotes?
If ISO standard is cited without version number, the latest version of the standard is intended. So ISO 8601 means at the moment ISO 8601:2004.
With ISO 8601 things get a little bit complicated, since the next version of the standard will be published in two parts. So it is not clear what ISO 8601 means. It may be necessary to cite both ISO 8601-1 and ISO 8601-2 if there is a need to use the extended features in DC as well. But a simple solution to the problem is to cite ISO 8601 and provide link to
https://www.iso.org/iso-8601-date-and-time-format.html
which specifies the status of the standard. Then you can cite also
https://en.wikipedia.org/wiki/ISO_8601
or EDTF. The Wikipedia page describes the content of the standard at a general level. Not all details are there, but the text is quite stable and readable. Anyone who needs to know even more can consult EDTF.
W3CDTF link can be kept for backward compatibility, but since ISO 8601 started to require full year (YYYY) the W3C profile no longer provides any added value although people still routinely refer to it.
If you need to cite the current ISO 8601 drafts, you have to use the form
ISO/DIS 8601-1:2017 and ISO/DIS 8601-2:2017
but I would rather cite just EDTF since it contains all the new ISO 8601 features DC users may need, and it will be freely available permanently.
@juhahakala Thank you for clarifying that!
Closing.
@jneubert I agree with the substance of your proposal, as do Joachim, Stefanie, Osma, Antoine, and Karen, but think it is stylistically out of line with the rest of DCMIMT. I suggest that we fold the `REMARKS` into the comments, and Tom, Kai, Osma, and Antoine agree. Since it is currently somewhat confusing what is on the table, I suggest you re-post the proposal on which you would like us to vote.
In the current ISO draft, the comment and examples for dct:date read:
Note 1 to entry: Date may be used to express temporal information at
any level of granularity. Recommended practice is to express the date,
time, or period according to a published profile of ISO 8601 [ISO
8601], such as the W3C Note on Date and Time Formats [W3CDTF] or the
(proposed) Extended Date/Time Format [EDTF].
Note 2 to entry: If the full date is unknown, month and year (YYYY-MM)
or just year (YYYY) may be used.
In terms of substance, this is in line with [Joachim's proposal](https://github.com/dcmi/usage/issues/16#issuecomment-4006675050), and Joachim, Stefanie, Osma, Antoine, and Karen approve.
In terms of style, it is in line with Tom's proposed amendment, and Tom, Kai, Osma, and Antoine agree.
As of July 19, we seem close to agreement, but there are three issues:
I don't think that Note 2 is needed. Obviously one only includes the level of detail of the date that is known.
If we drop Note 2, we are left with:
Note 1 to entry: Date may be used to express temporal information at
any level of granularity. Recommended practice is to express the date,
time, or period according to a published profile of ISO 8601 [ISO
8601], such as the W3C Note on Date and Time Formats [W3CDTF] or the
(proposed) Extended Date/Time Format [EDTF].
In terms of substance, this is in line with previous proposals approved by Joachim, Stefanie, Osma, Antoine, Tom, Karen, and Kai.
Apologies, I'm late to the party...
The most common confusion points I come across (often because people are only familiar with XSD date/dateTime) are:
I'm a little uncomfortable pointing to a profile (EDTF) that is (a) in draft (b) has been deprecated. It's unclear if implementing it will provide future data interoperability and/or result in non-conformant data when the new ISO 8601s come out?
I feel W3CDTF is still a reasonable starting point (though it requires people to 'ask around' or do lots of research, including buying the standard, to figure out the other confusion points). We did start looking at defining our own profile, but that didn't work out. So we can only point to what is currently available, though I think we should point to something as a starting point.
I agree this guidance shouldn't be in the specification, but I think it is preferable to be nearby.
Which is a long way of saying... I agree that Note 1 gives readers a solid starting point (with a concern over EDTF). Maybe further guidance could be given elsewhere on the DCMI site - Using Dublin Core Guide - The Elements already mentions date precision, maybe date ranges could be added too?
"I agree this guidance shouldn't be in the specification, but I think it is preferable to be nearby." @staplegun
+1, although so far we don't have an up-to-date "nearby".
Note 1 to entry: Date may be used to express temporal information at
any level of granularity. Recommended practice is to express the date,
time, or period according to a published profile of ISO 8601 [ISO
8601], such as the W3C Note on Date and Time Formats [W3CDTF] or the
(proposed) Extended Date/Time Format [EDTF].
This was approved as the only "Note to entry" for Date.
EXAMPLES: 2018
2016-03-04
2017-11-05T08:15:30-05:00
1968/2015
2006/
Where the document would point to:
PLEASE VOTE ON COMMENT BELOW
See note_date.md
Add comments for http://purl.org/dc/terms/date:
Note:
Currently, the comment reads: