Open cormacrelf opened 3 years ago
Yes, and basic 8601 dates are still valid.
Little thing: I've never understood the uncertain/approximate distinction, at least as it applies here. Do you?
I think it boils down to the words themselves:
If anything "circa" should be for approximation, not uncertainty.
So then what should a CSL processor do with an uncertain date?
I had wondered if it should treat both as circa, but I guess we can treat them separately in the spec as well, so that a style could output "1521?" or "c. 1521", or even "c. 1521?"?
Some styles might treat it as meaning the same, but in general I think ca. vs ? sounds reasonable.
Right; so we definitely need to support both explicitly for input (as in edtf) and styles, and of course feature edtf in general prominently in the documentation, once we figure out our plan.
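A minimal sketch of how the two markers could be rendered separately, producing the "1521?", "c. 1521" and "c. 1521?" forms mentioned above. The function names and the default `c. ` and `?` strings are assumptions for illustration only; a real processor would take them from locale terms.

```python
# Hypothetical helpers, not part of any existing CSL processor.
# EDTF level 1 qualifier suffixes: "?" uncertain, "~" approximate, "%" both.
QUALIFIERS = {
    "?": (False, True),   # (approximate, uncertain)
    "~": (True, False),
    "%": (True, True),
}


def split_qualifier(edtf: str):
    """Return (bare_date, approximate, uncertain) for a single EDTF date."""
    if edtf and edtf[-1] in QUALIFIERS:
        approximate, uncertain = QUALIFIERS[edtf[-1]]
        return edtf[:-1], approximate, uncertain
    return edtf, False, False


def render(edtf: str, circa_term: str = "c. ", uncertain_mark: str = "?") -> str:
    """Prefix the circa term for approximate dates, suffix a mark for uncertain ones."""
    bare, approximate, uncertain = split_qualifier(edtf)
    prefix = circa_term if approximate else ""
    suffix = uncertain_mark if uncertain else ""
    return prefix + bare + suffix
```

A style that treats the two as the same thing could simply pass the same term for both.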
The issue with circa is if you make it synonymous with "approximate", you are left to deal with `is-uncertain-date` having to be backwards.
So just to make sure I understand, @cormacrelf:
> The issue with circa is if you make it synonymous with "approximate", you are left to deal with `is-uncertain-date` having to be backwards.
You are saying that we should stop using `is-uncertain-date` for circa, add `is-approximate-date` to CSL, and update all existing styles to use the latter instead? Obviously that could be a little painful, but not that big a problem (converting the styles is just a simple replacement).
I had thought we had already implemented approximate and uncertain?
Would be good to clarify. @cormacrelf?
Upcoming CSL-JSON changes include support for EDTF as a date input format. I recently implemented EDTF, and I have some thoughts about how we can make use of its features in CSL.
What EDTF has that we don't
EDTF is a great format for CSL, because we have supported date ranges since forever, and some of the unofficial date formats we use resemble EDTF already. However it adds three new things we did not have before.
- […] the `X` character to blot them out.
- […] `2019-07-16T01:57:29Z`.
Unspecified date parts / `1999-XX` and friends
You might think that we could just add terms for `month-unspecified` and `day-unspecified` and call it a day. But I think we'd be missing out -- the spec doesn't advertise it very well, but the feature is more expressive than that.
There are a few different variations on the `XX` in EDTF level 1. In my opinion the spec should have named them like so: `19XX` => century, `199X` => decade, `1999-XX` => month of year, `1999-XX-XX` => day of year, `1999-07-XX` => day of month. Styles/locales could render `19XX` as "20th century" or "1900s" if they so wished! However, given this is academic citation, I'm not sure how useful that would be. If anyone can point to a style that might want special rendering for any of these forms, then it's something we can definitely do.
Approximate
We currently have `is-uncertain-date`, the `circa` term, and `"circa": true` in CSL-JSON. For reference, EDTF encodes its uncertainties as `?` => uncertain, `~` => approximate, `%` => both.
On a basic level, you could add terms for `approximate` and `approximate-uncertain`, and also add `is-approximate-date="issued"` as a conditional test.
One complication is that EDTF makes approx/uncertain a property of each end of a date range, i.e. you can have `1999?/2003` meaning (uncertain 1999) to 2003. Our current model is insufficient for that; it can only work with a date as a whole. You could therefore add a `certainty` date part as well, which simply renders one of the three terms or nothing, in either the single date or on each end of the range. This would be an improvement over the existing syntax even ignoring the approximate addition.
Date time representation
My favourite citation style, AGLC4, now supports citing tweets/forum posts/videos, and requires a timestamp as well as a date. It renders them like so:
I don't think this will be the only one out there. We don't currently support times at all, and I think we should.
A couple of notes about this:
- […] `<date-part>`s for each one, but alternatively you could have only one new `<date-part name="time" format="..." />` and just tell styles/locales to supply a time format string and reference one of the popular encodings for that.
- […] `<date-part name="timezone" />` as well.
- […] `Z` (= UTC) or a +/- UTC offset in hours or hours:minutes. They are really just offsets, not zones.
- […] `Australia/Melbourne` that's probably enough info to query a list of known abbreviations for that tz at that time of year, DST-wise (but the abbreviations are not nearly as standardised as the tz names).
- […] `+03:00[Africa/Nairobi]`. Not sure if we'd want that (complicates edtf parsing, is technically a completely new format if we bolt it on after a valid EDTF, so no thanks) but maybe some JSON way of specifying this would help.
A defined calendar
AFAIK CSL has never operated within a specific calendar; it just renders what you put in. EDTF uses the ISO 8601 calendar; see my notes here on what that means: https://docs.rs/edtf/0.2.0/edtf/#notes-on-edtf-and-the-iso-8601-calendar-system. (Obviously you would generally render these in Gregorian style, i.e. 0000 renders as 1 BC, -0099 as 100 BC.) For modern dates, that's the same as we would normally write them, but in some places dates weren't written in the modern Gregorian calendar until the early 1900s (e.g. Russia, 1918). The UK only switched in 1752. That's really not that long ago, especially since some case law/legislation from before then is still cited fairly frequently.
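The year mapping in the parenthetical above is small enough to sketch. ISO 8601 year numbering includes a year zero, so every year at or below 0000 shifts by one when written with an era label. The function name and the "BC" suffix are assumptions for illustration; a real style would use locale terms.

```python
def render_year(iso_year: int) -> str:
    """Render an ISO 8601 (astronomical) year number in Gregorian era style.

    Hypothetical helper: ISO year 0 is 1 BC, ISO year -99 is 100 BC, and so on.
    Positive years are returned unchanged, without an era label.
    """
    if iso_year <= 0:
        return f"{1 - iso_year} BC"
    return str(iso_year)


# EDTF writes such years zero-padded with an optional sign; int() accepts that form.
print(render_year(int("0000")))   # 1 BC
print(render_year(int("-0099")))  # 100 BC
```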
Idea 1: Accuracy of old dates
I don't think you'll find any citation styles which dictate what calendar to write dates in, but that isn't to say that the problem doesn't exist; in fact it is probably part of the problem for historians, since nobody is forcing anyone else to write what kind of date something is. We could tip the scales with a very simple feature: a configuration in a style or a locale (?) which sets the start of the modern era for dates. Any date before this could be rendered with a term for new style dates (e.g. `(n.s.)`), thus forcing people to check that it actually is a new style date.
A much more complex feature would be the configurable rendering of dates in other calendars. I'm pretty sure @fbennett had a feature for rendering the oddities of Japanese calendars, but I'm not sure we should require every CSL implementation to do complex calendar maths. It could be an optional thing. If we wanted such a feature, we could make the Unicode CLDR calendars optional. (Although, CLDR does not include Julian! How did they manage to omit it???)
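The simpler of the two features, the modern-era threshold, could look roughly like the sketch below. Everything here is an assumption for illustration: the function name, the `modern_era_start` setting, and the use of 1752-09-14 (the first Gregorian day in Britain) as a sample value. Python's `datetime.date` is proleptic Gregorian, which matches what EDTF assumes for years 1 to 9999.

```python
from datetime import date

# Sample value only: a style or locale would supply its own threshold.
UK_MODERN_ERA_START = date(1752, 9, 14)


def render_with_style_marker(
    d: date,
    modern_era_start: date = UK_MODERN_ERA_START,
    new_style_term: str = "(n.s.)",
) -> str:
    """Append the new-style term to any date before the configured threshold."""
    text = d.isoformat()
    if d < modern_era_start:
        return f"{text} {new_style_term}"
    return text
```

The marker adds no calendar maths; it only flags dates that someone should double-check.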
Idea 2: Days of the week
Again, I don't know if any styles demand this, but until now it has not been technically possible to know which day of the week something is, because CSL didn't define a calendar. If you make CSL calendar aware, you get days of the week for free.
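As a rough illustration of "for free": once a date is fixed to the ISO 8601 (proleptic Gregorian) calendar, any standard date library can answer the question. The sketch below uses Python's `datetime`, which only covers years 1 to 9999; the function name and the English day names are assumptions, and a real processor would use locale terms.

```python
from datetime import date

DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_of_week(iso_date: str) -> str:
    """Return the weekday name for a complete YYYY-MM-DD date."""
    # date.weekday() counts from Monday = 0 in the proleptic Gregorian calendar.
    return DAY_NAMES[date.fromisoformat(iso_date).weekday()]


print(day_of_week("2019-07-16"))  # Tuesday
```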
In summary
EDTF opens up a couple of new opportunities that are worth considering. The most obviously valuable one appears to be datetimes, but there are a lot of possibilities.
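To make the naming of unspecified date parts suggested earlier in this issue concrete, here is a minimal sketch that classifies the five level 1 forms and renders the century form as "1900s". The names follow this issue's proposal rather than the EDTF spec, and both functions are hypothetical.

```python
import re

# (pattern, name) pairs using the naming proposed in this issue, not the EDTF spec.
UNSPECIFIED_FORMS = [
    (re.compile(r"\d{2}XX"), "century"),
    (re.compile(r"\d{3}X"), "decade"),
    (re.compile(r"\d{4}-XX"), "month of year"),
    (re.compile(r"\d{4}-XX-XX"), "day of year"),
    (re.compile(r"\d{4}-\d{2}-XX"), "day of month"),
]


def classify_unspecified(edtf: str):
    """Return the proposed name for an unspecified-part date, or None."""
    for pattern, name in UNSPECIFIED_FORMS:
        # fullmatch keeps "1999-XX" from also matching "1999-XX-XX".
        if pattern.fullmatch(edtf):
            return name
    return None


def render_century(edtf: str) -> str:
    """Render a `19XX`-style date as e.g. "1900s"."""
    if classify_unspecified(edtf) != "century":
        raise ValueError(f"not a century form: {edtf!r}")
    return edtf[:2] + "00s"
```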