Closed JohnLukeBentley closed 7 years ago
With the latest additions, something is severely broken. I get
! Undefined control sequence.
l.1230 \DeclareLabelalphaNameTemplate
{
?
! Undefined control sequence.
l.1231 \namepart
[use=true, base=true, strwidth=1]{prefix}
?
! LaTeX Error: Missing \begin{document}.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.1231 \namepart[
use=true, base=true, strwidth=1]{prefix}
?
! Undefined control sequence.
l.1232 \namepart
[base=true]{family}
?
! Undefined control sequence.
l.1233 \namepart
{given}
?
Will it allow [a] date like “13th century”?
If you are asking about output:
- year rather than the date field, it is rendered as is (IIRC, earlier versions of the biblatex manual used to contain a hint to that effect).
- 12uu could be rendered as “1200s” or “13th century”.
- 123u as “1230s” or “1230–1239”.
- 120u though, this must not be rendered as “1200s”, the only option I see is indeed “1200–1209”.
- 1999-uu, 1999-01-uu, 1999-uu-uu, biblatex could simply drop any “unspecified” parts.
@simifilm - I'm in the middle of quite large changes to the labelalpha mechanism - the github source isn't guaranteed to be stable but you can just comment out the \DeclareLabelalphaNameTemplate declaration in biblatex.def for now.
After sleeping on it I think, in my previous large post, rather than trying to illustrate:
Human readability and understandability being distinct.
E.g. date = {-0279}
is quite human readable, in that the number can be read to be minus two seventy nine. But it's not readily understandable, especially by those not familiar with date standards, as being equivalent to the year two hundred and eighty before the common era. At least, I mean, this is a plausible way of speaking about the example.
By contrast datetime = {2004-01-01T10:10:10+05:00}
is understandable, but not (relative to alternatives) human readable. That is, even folk not familiar with datetime standards could understand, through making some assumptions, what this string means. However, reading it is a bit of a strain given the lack of space delimiters.
So it's with the criteria of human readability and understandability that we might weigh candidate datetime standards. That is, in conjunction with other criteria.
For the next post, or next few posts, I'd like to put aside the issue of "Should there be a colloquial format?" by addressing:
Whether or not there is to be a colloquial input format, which should be the strict format: EDTF or iso8601:2004?
And given the renewed enthusiasm, from Nick and Philip, of EDTF: I'll look at the matter with EDTF as the leading candidate. That'll entail revisiting some of the issues I've previously mentioned, as well as raising new issues.
Again, this is just an issue of input formats. ...
I'm revisiting this issue partly because I want to make sure I get the two different standards right.
ISO8601:2004 allows the "+" sign for years. I previously quoted "3.4.2 Characters used in place of digits or signs". So ISO8601:2004 allows both of the following formats:
+0001
+0000
-0001

0001
0000
-0001
EDTF. I previously didn't provide evidence of what EDTF says on this issue. So ...
From http://www.loc.gov/standards/datetime/pre-submission.html#bnf
date = year | yearMonth | yearMonthDay
...
year = positiveYear | negativeYear | "0000"
positiveYear =
positiveDigit digit digit digit
| digit positiveDigit digit digit
| digit digit positiveDigit digit
| digit digit digit positiveDigit
negativeYear = "-" positiveYear
...
digit = positiveDigit | "0"
positiveDigit = "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
So EDTF doesn't allow "+". It enforces a scheme like:
0001
0000
-0001
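As a rough illustration of the year rule in the BNF quoted above, here is a small Python sketch. It is not biblatex or biber code, and the name is_edtf_year is made up for this example.

```python
import re

# year = positiveYear | negativeYear | "0000"
# positiveYear: four digits with at least one non-zero digit
# negativeYear: "-" followed by a positiveYear (so "-0000" is not allowed)
_POSITIVE_YEAR = r"(?!0000)[0-9]{4}"
_EDTF_YEAR = re.compile(r"(?:-?" + _POSITIVE_YEAR + r"|0000)")


def is_edtf_year(text):
    """Return True if text matches the EDTF year rule quoted above."""
    return _EDTF_YEAR.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["0001", "0000", "-0001", "+0001", "-0000", "-321"]:
        print(candidate, is_edtf_year(candidate))
```

Under this reading a "+" prefix is rejected, and so is any year that does not have exactly four digits.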
As I previously argued, allowing the "+" sign for years would be helpful in a scenario where you wanted to line up years (and dates) in a column (as previously exemplified when using Zotero and Zotero-better-bibtex, which export to biblatex).
On the other hand, that EDTF enforces only one scheme, with respect to allowing "+" (it doesn't), may make it attractive.
Therefore I would agree that this issue causes no significant impediment to using EDTF over ISO8601 for biblatex.
On the presumption that Biblatex will support datetimes to some degree (as Philip had indicated it would), or at least there's a strong possibility it might in the future ...
In ISO8601:2004 changeable precision times are allowed:
4.2.2.2 Complete representations
...
Basic format: hhmmss Example: 232050
Extended format: hh:mm:ss Example: 23:20:50
4.2.2.3 Representations with reduced accuracy
If the degree of accuracy required permits, either two or four digits may be omitted from the representation in 4.2.2.2.
a) A specific hour and minute
Basic format: hhmm Example: 2320
Extended format: hh:mm Example: 23:20
b) A specific hour
Basic format: hh Example: 23
Extended format: not applicable
In EDTF changeable precision times are not allowed:
5.1.2 Date and Time
A date/time string MUST be composed according to one of three representations as illustrated in the following three examples:
2001-02-03T09:30:01
2004-01-01T10:10:10Z
2004-01-01T10:10:10+05:00
- BNF
time = baseTime zoneOffset?
baseTime = hour ":" minute ":" second | "24:00:00"
It is somewhat surprising to find EDTF allowing changeable precision in dates, even going so far as to provide several ways to express that variable precision, while affording no such flexibility for times.
Is this a problem?
An example way in which bibliographic data might be fed into biblatex is via reference management software (like Zotero) that, in turn, extracts metadata from a website. There are a plethora of (X)HTML embedded metadata schemes. Two of the more popular ones are Dublin Core and (the emerging) JSON-LD with Schema.org.
There are many (overly complex) ways to express Dublin Core metadata in (X)HTML5, but one (https://wiki.whatwg.org/wiki/MetaExtensions conforming) paradigmatic example is:
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms" />
<meta name="DCTERMS.title" content="Services to Government" />
<meta name="DCTERMS.modified" scheme="DCTERMS.W3CDTF" content="2016-06-10T20:00:09+1000" />
Note the W3CDTF standard that Dublin Core often uses for datetimes, https://www.w3.org/TR/NOTE-datetime (See http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#terms-W3CDTF). This is W3C's Date and Time Formats Note of 1998-08-27 (which EDTF mentions).
The W3CDTF subsets ("profiles") ISO8601, as does EDTF. But unlike EDTF, W3CDTF allows for times that drop seconds. That is, W3CDTF allows:
YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00; or 1997-07-16T18:20Z)
So metadata coming into Biblatex that ultimately derives from a Dublin Core Metadata element like ...
<meta name="DCTERMS.created" scheme="DCTERMS.W3CDTF" content="1997-07-16T19:20+01:00" />
... will break if Biblatex uses (and enforces) EDTF as its strict format.
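To make the difference concrete, here is a minimal Python sketch of a checker for the EDTF date-and-time form, written from the BNF fragments quoted in this thread. Assumptions are mine: the function name is hypothetical, the hour is taken as 00 to 23, zoneOffsetHour as 01 to 13, and the date part is only checked for its YYYY-MM-DD shape.

```python
import re

# Date part: shape check only (month and day ranges are not validated here).
_DATE = r"-?[0-9]{4}-[0-9]{2}-[0-9]{2}"
_HOUR = r"(?:[01][0-9]|2[0-3])"          # assumption: 00-23
_MINSEC = r"[0-5][0-9]"
# baseTime = hour ":" minute ":" second | "24:00:00"
_BASE_TIME = r"(?:" + _HOUR + ":" + _MINSEC + ":" + _MINSEC + r"|24:00:00)"
_ONE_TO_59 = r"(?:0[1-9]|[1-5][0-9])"
# zoneOffset = "Z" | ("+" | "-") (zoneOffsetHour (":" minute)? | "14:00" | "00:" oneThru59)
_ZONE = (
    r"(?:Z|[+-](?:(?:0[1-9]|1[0-3])(?::" + _MINSEC + r")?"  # assumption: hour 01-13
    r"|14:00|00:" + _ONE_TO_59 + r"))"
)
_DATE_AND_TIME = re.compile(_DATE + "T" + _BASE_TIME + "(?:" + _ZONE + ")?")


def is_edtf_datetime(text):
    """Return True if text has the full hh:mm:ss time that EDTF requires."""
    return _DATE_AND_TIME.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_edtf_datetime("2004-01-01T10:10:10+05:00"))  # long form
    print(is_edtf_datetime("1997-07-16T19:20+01:00"))     # W3CDTF short form
```

A front end could use such a check to decide between a warning and a fatal error for the short form.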
However, looking at an almost randomly chosen production website, a national news site which appears to implement Dublin Core Metadata well, they use times in the long form. From http://www.abc.net.au/news/2016-06-10/barnaby-joyce-denies-telling-woman-to-piss-off-in-tamworth-pub/7501090 ...
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DCTERMS.issued" scheme="DCTERMS.W3CDTF" content="2016-06-10T20:00:09+1000"/>
<meta name="DCTERMS.modified" scheme="DCTERMS.W3CDTF" content="2016-06-10T23:44:31+1000"/>
... There'd be a major website that uses the short time format in Dublin Core, but I can't find one on a quick search.
Google is promoting JSON-LD for metadata in (X)HTML.
You provide structured data markup in your HTML ... pages... JSON-LD is the recommended format. Google is in the process of adding JSON-LD support for all markup-powered features. The table below lists the exceptions to this. We recommend using JSON-LD where possible. https://developers.google.com/search/docs/guides/intro-structured-data
Google notes that when using JSON-LD you typically use:
the schema.org vocabulary — an open community effort to promote standard structured data in a variety of online applications.
Schema.org appears to enforce times in long form only:
A combination of date and time of day in the form [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm](see Chapter 5.4 of ISO 8601). https://schema.org/DateTime
And here's another almost randomly chosen production example, http://www.bbc.com/news/science-environment-36505748 , that implements JSON-LD with a schema.org time in long form :
"datePublished": "2016-06-11T09:05:50+01:00"
So that's an example of a standard, and use of a standard, in the wild that conforms to EDTF in terms of time precision (the long time is enforced).
I'll show a random sampling of the top newspapers in the US, looking at their datetime metadata (regardless of what scheme it conforms to).
Wall Street Journal. A full time format. http://www.wsj.com/articles/imf-warns-china-of-risks-of-mounting-corporate-debt-1465613146
<meta name="article.published" content="2016-06-11T02:45:00.000Z" />
New York Times. They are all over the shop ... http://www.nytimes.com/2016/06/12/magazine/what-if-ptsd-is-more-physical-than-psychological.html?hp&action=click&pgtype=Homepage&clickSource=story-heading&module=photo-spot-region&region=top-news&WT.nav=top-news
<meta name="pdate" content="20160610" />
<meta name="utime" content="20160610235833" />
<meta name="ptime" content="20160610050026" />
<meta name="DISPLAYDATE" content="June 10, 2016" />
LA Times. Another full time format. http://www.latimes.com/politics/la-na-pol-democrats-unity-20160611-snap-story.html
<meta itemprop="datePublished" content="2016-06-11T03:00:00-0700" data-meta-updatable />
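The New York Times utime and ptime values above look like the ISO 8601 basic format without separators. Assuming that reading is right, a tool in front of biblatex could rewrite them into the extended form. This is only a sketch in Python; the function name is hypothetical and the time zone of these values is unknown, so none is added.

```python
from datetime import datetime


def basic_to_extended(stamp):
    """Convert a YYYYMMDDhhmmss string to YYYY-MM-DDThh:mm:ss.

    Assumes the input is a 14 digit basic-format timestamp; raises
    ValueError if it cannot be parsed that way.
    """
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").isoformat()


if __name__ == "__main__":
    print(basic_to_extended("20160610235833"))
```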
A summary of the facts:
On this issue the options for Biblatex seem to be:
Do I get it right that if EDTF was used in biblatex it must WARN, not produce a fatal error, for short datetime (e.g. "1997-07-16T19:20+01:00") formats? Would the need for a WARN, rather than fatal error, for short datetime formats critically count against using EDTF in Biblatex?
Should we have an input standard that allows a space between date and time, as in
2004-01-01 10:10:10+05:00
And even a space between time and time zone?, as in ...
2004-01-01 10:10:10 +05:00
I'm revisiting this issue partly to ensure I'm referencing the standards right; and it might count as the most critical impediment to EDTF adoption.
As mentioned ISO8601:2004 allows a space between date and time ...
By mutual agreement of the partners in information interchange, the character [T] may be omitted in applications where there is no risk of confusing a date and time of day representation with others defined in this International Standard. (Under "4.3 Date and time of day > 4.3.2 Complete representations")
ISO8601:2004 forbids a space between time and timezone ...
4.2.4 UTC of day
To express UTC of day the representations specified in 4.2.2.2 through 4.2.2.4 shall be used, followed immediately, without space, by the UTC designator [Z]
4.2.5.2 Local time and the difference from UTC
When it is required to indicate local time and the difference between the time scale of local time and UTC, the representation of the difference shall be appended to the representation of the local time following immediately, without space, the lowest order (extreme right-hand) ...
EDTF forbids a space between date and time (From EDTF "8. BNF"), and forbids a space between time and timezone.
dateAndTime = date "T" time
time = baseTime zoneOffset?
baseTime = hour ":" minute ":" second | "24:00:00"
zoneOffset = "Z"
| ("+" | "-")
(zoneOffsetHour (":" minute)?
| "14:00"
| "00:" oneThru59 )
Given that both standards forbid a space between time and timezone, we don't need to consider that factor.
So ISO8601:2004 allows, and EDTF forbids:
2004-01-01 10:10:10+05:00
I imagine all of us would agree that with the space the above format is more human readable than ...
2004-01-01T10:10:10+05:00
Either format is equally "understandable": even folk unfamiliar with datetime standards could guess at the meaning of "T". So "understandability" is not a factor.
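If a space were tolerated on input while EDTF stayed the strict format, the conversion would be mechanical. A sketch in Python, with a hypothetical function name; it only touches a single space between the date and the time and leaves everything else alone.

```python
import re

# A date (optionally negative year) followed by one space and the start of a time.
_DATE_SPACE_TIME = re.compile(r"^(-?[0-9]{4}-[0-9]{2}-[0-9]{2}) ([0-9]{2}:)")


def space_to_t(text):
    """Replace the space between date and time with the "T" designator."""
    return _DATE_SPACE_TIME.sub(r"\1T\2", text)


if __name__ == "__main__":
    print(space_to_t("2004-01-01 10:10:10+05:00"))
```

A space before the time zone is deliberately not handled, since both standards forbid it.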
So I think the relevant question is:
Is the relative lack of human readability in an EDTF enforced format like 2004-01-01T10:10:10+05:00
critical enough to dismiss EDTF as the strict format for biblatex (and especially in the light of my prior emphasis of the possible future importance of human readability)?
In summary it would be great if you all, Philip, Nick, Simon (if interested), or anyone else would address the question ...
Whether or not there is to be colloquial input format, which should be the strict format: EDTF or iso8601:2004?
... by answering ...
On allowing "+" sign (for years):
On changeable precision times:
On allowing a space between date and time:
Is the relative lack of human readability in an EDTF enforced format like 2004-01-01T10:10:10+05:00 critical enough to dismiss EDTF as the strict format for biblatex (and especially in the light of my prior emphasis of the possible future importance of human readability)?
I mean, for Nick, that may well essentially entail a repetition of previous answers. But I hope, at least, such repetition could be made with an increased confidence, or a new willingness to bear previously unseen difficulties, having taken into account the issues I raise.
I don't see any particular problem with any of these against EDTF. I would prefer to implement strict EDTF and no colloquial support in the core. Having said that, of course \DeclareSourcemap
can essentially massage anything into strict EDTF and so I think this is a nice solution. Whether or not there are any driver level mappings to do this (that is, ones which come with biblatex) is another matter. I am more of a mind to leave this to style level mappings for areas which want to coerce date formats for their users just as we only support in core general style concerns and not domain-specific ones.
I will have a think about the 5.2.2 level 1 things - this might become very complicated and not worth the bother.
Not sure whether this is still work in progress, but at the moment, ifdateera only works if datelabel=edtf is set, although this option doesn't seem to exist (anymore) according to the manual.
@simifilm - should be stable now and is uploaded. That issue is fixed.
I am not convinced about EDTF 5.2.2 as it says "Precision for a date whose string includes the 'u' syntax assumes that the unspecified portion will eventually be supplied." and, apart from it not being clear what that means, it's arguably meaningless in a bibliography context which is usually a publishing context where nothing will be "eventually supplied". The current dev version implements strict EDTF without 5.2.2 and parses times. Most of the internals for time support are also complete. We have to decide on the core output formats for times as for dates ("long", "short", "comp" etc.).
On the other hand, I'm open to parsing 5.2.2 level 1 and putting suitable fields in the .bbl to indicate the "unspecified" status but I doubt it makes any sense to do anything with this information in standard styles as such information applies to more specialist applications such as, perhaps, archival materials.
@plk AFAICS this still needs the option datelabel=edtf which currently is not documented. The manual only mentions iso8601, but according to biblatex.sty iso8601 is deprecated. And I am probably missing something, but as I said earlier, I think two things get mixed up here. ifdateera is only available when datelabel is set to iso8601/edtf. But in my understanding, datelabel defines the output. So this means that I can't use ifdateera if I want date=long for example.
Philip:
I don't see any particular problem with any of these against EDTF.
Yes I'm inclined to agree. Or, in other words, the problems don't outweigh the advantages of EDTF for biblatex purposes (providing a convention for approximate and uncertain dates). Specifically, ...
On the human readability of 2004-01-01T10:10:10+05:00
- well I hate reading it. But in virtue of it being "understandable" that's no great impediment in the scenario I hope biblatex can find use (as a front end format).
On handling EDTF illegal datetimes, because the time is too short, e.g. 1997-07-16T19:20+01:00
... I'll be curious to see what you come up with. But there are a few options. Probably a matter left to decide once you have your hands on the code.
I would prefer to implement strict EDTF and no colloquial support in the core.
Noted. I will press the argument for colloquial support in the core, in addition to the strict (now EDTF) format, in a subsequent post. But I'll hold off for now in order to give Nick a chance to catch up. Except to say that everything I want to be expressed by a colloquial format should be expressed by the EDTF implementation. That is, to give folk the ability to ignore the colloquial format if they want.
On the issue of the conformance levels to implement for EDTF. The spec states
The specification defines three levels:
- Level 0: Features supported by 8601
- Level 1: Level 0 plus level 1 extensions
- Level 2: Level 1 plus level 2 extensions
An implementation of this specification MUST support Level 0, and MUST state which (if either) additional level (1 or 2) is supported.
So there doesn't seem to be formal scope for a partial implementation of a level.
However, I too am not sure what we need in Level 1 beyond "5.2.1 Uncertain/Approximate". And if we don't, then I don't think there's a particular problem with claiming a conformance like "EDTF level 0 plus Level 1:5.2.1 Uncertain/Approximate".
If a work was published "some time in the 13th century" it could be encoded as 12uu
as Nick suggests (from "5.2.2 Unspecified"). But that would seem to violate the rule "the 'u' syntax assumes that the unspecified portion will eventually be supplied", as you suggest. That is, if we imagine work where the lack of precision about the date of publication was established by the scholarship: there may be no expectation that the unspecified portion will be "eventually supplied".
However, "5.2.3. Extended Interval (L1)" seems able to encode "some time in the 13th century" as 1200/1299.
"5.2.3. Extended Interval (L1)" would seem to be necessary for taking care of one of the cases I first mentioned: Da Vinci, Leonardo. c. 1487–1490. Codex Trivulzianus. That is, as 1487~/1490~.
This seems unnecessary. The earliest writing occurred in 3200 BCE (-3199). We have a long way to go before authors need to reference works with 5 digit positive years (10000). So it seems we can get away entirely with 4 digit years. Are bibliographies sometimes exploited for listing specific fossils?
I'm not sure that seasons are necessary. Journals sometimes have "Spring" edition but always (?) can be referenced via year, optional volume number, and optional issue number.
Even if seasons do need to be expressed, a format like "2001-21" looks like it might have the potential to be confused as expressing an ordinal day or week in the year.
As a matter of workload it might be easier to implement "EDTF level 0 plus Level 1:5.2.1 Uncertain/Approximate" and a colloquial format (if I can convince you of this and after I suggest some modifications to the colloquial format) as a first iteration. Then do some debugging. Then have a look at the other sections in level 1, in a subsequent iteration.
Simon. In my judgement the issues you raise are roof implementation details which can't properly be attended to until the foundations are sorted out. The foundations are currently in flux.
For example it's unclear what level of EDTF ought be supported (I've expressed a view above). When that becomes clear that might have impact on the names for values. E.g. datelabel=edtflevel1
.
On the other hand Philip seems generally open to addressing these sort of issues you raise - and this is entirely a matter for Philip - the person who's coding it. So until Philip says otherwise I'd say keep those sort of suggestions coming. I just thought I'd let you know why I am not responding to them.
On EDTF 5.2.2: I read that passage differently. The main definition appears in EDTF 4.: “Unspecified: The value is unstated. It could be because the date (or part of the date) has not (yet) been assigned (it might be assigned in the future), or because it is classified, or unknown, or for any other reason.” (my emph.)
The passage “Precision for a date whose string includes the 'u' syntax assumes that the unspecified portion will eventually be supplied. Thus 199u and 19uu have year precision, 1999-uu has month precision, and 1999-01-uu and 1999-uu-uu have day precision.” on the other hand merely seems to focus on the precision that is to be ascribed to a date that contains one or more ‘u’s.
Hence I continue to feel that 19uu and 199u could very well be used as shorthands for century and decade, both of which have their role in bibliographies.
But that’s a minor issue. Apart from that: Great news.
Ok, I will look at the 5.2.2 things, there is a case for it.
@simifilm - I can't reproduce what you're seeing here - I have a test doc using dateera
with date=long
etc. and it's all fine. The doc should also have that option correctly - perhaps you have an out of date version? The latest pushed git and bundled DEV versions should have all of this - if not, let me know. Don't forget that authoryear* citations only use labelyear and so are controlled by datelabel
.
Note on current state - 3.5/2.6 currently implement all of EDTF level 1 apart from 5.2.2 - see 96-dates.tex
example file and PDF doc.
@plk I see that my .docs were out of date, but the rest should be ok.
This example does not give me negative dates:
\documentclass[a4paper]{article}
\usepackage{fontspec}
\usepackage[american]{babel}
\usepackage{csquotes}
\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@book{buch,
  author = {Wurm, Tom},
  title = {Das Buch},
  date = {-2988},
  location = {Die Stadt},
  publisher = {Der Verlag}}
\end{filecontents}
\usepackage[style=authoryear,%
datelabel=long,
%dateuncertain=true,%
%datecirca=true,
backend=biber]{biblatex}
\addbibresource{\jobname.bib}
\begin{document}
\cite{buch}
\printbibliography
\end{document}
EDIT: If I change datelabel
to edtf
, it works.
Something else I noticed: with negative dates, biblatex
seems to insist on 4-digits years. Something like year=-321
is not accepted.
On 5.2.2 Unspecified
Nick, I read the two passages you quoted as identifying a contradiction in the EDTF spec. One can't both stipulate "unspecified" to mean:
However, we could just ignore the contradiction and choose to interpret "u" for unspecified according to the first passage (yet to be assigned, "classified, or unknown, or for any other reason").
Essentially agreeing with you that ...
Hence ... 19uu and 199u could very well be used as shorthands for century and decade, [in addition to the other range of imprecisions specified under "5.2.2 Unspecified"]
One or more of us might find it valuable to participate in the EDTF listserve.
Nick, any thoughts on my post containing "In summary it would be great if you all, Philip, Nick, Simon (if interested), or anyone else would address the question ..."?
I propose that 5.2.2 is dealt with by expanding such strings into the appropriate date range and also marking it with a field like Xdateunspecified{Y}
where X is the date type (event, orig etc) and Y is the unspecified granularity (day, month, year, decade).
@simifilm - EDTF mandates 4 digit dates. The example you give is correct at the moment as only EDTF output has an "era neutral" output format. Other output needs to specify the era style - do you think that there should be a default?
I think a negative date should never be just simply printed as a positive date without any indication that it's actually something else.
Philip, for outputting 5.2.2 you mean something like ...
author = {Conradus Saxo}
origdate = {12uu}
title={Speculum Beatæ Mariæ Virginis}
...
=>
(Saxo 1200/1299)
% ... and so ...
199u => 1990/1999
19uu => 1900/1999
1999-uu => 1999-01/1999-12
1999-01-uu => 1999-01-01/1999-01-31
1999-uu-uu => 1999-01-01/1999-12-31
?
That looks like a consistent output scheme. If there was to be other output schemes at least the one you are suggesting looks like one that might be highly desirable. So at least yours seems worth implementing.
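The expansion listed above can be sketched in a few lines of Python. This is only an illustration of the mapping, not how biber does or will do it; the function name is hypothetical, and negative years and the internal "unspecified" marker fields are left out.

```python
import calendar
import re

# Year with optional trailing 'u's, then optional month and day (digits or 'uu').
_PATTERN = re.compile(
    r"([0-9]{4}|[0-9]{3}u|[0-9]{2}uu)(?:-([0-9]{2}|uu)(?:-([0-9]{2}|uu))?)?"
)


def expand_unspecified(date):
    """Expand an EDTF 5.2.2 'u' date into an explicit start/end range."""
    match = _PATTERN.fullmatch(date)
    if match is None:
        raise ValueError("unsupported date string: " + date)
    year, month, day = match.groups()
    if "u" in year:
        if month is not None:
            raise ValueError("unspecified year with month is not handled: " + date)
        # 199u -> 1990/1999, 19uu -> 1900/1999
        return year.replace("u", "0") + "/" + year.replace("u", "9")
    if month == "uu":
        if day is None:
            return year + "-01/" + year + "-12"
        if day == "uu":
            return year + "-01-01/" + year + "-12-31"
        raise ValueError("unspecified month with a specific day: " + date)
    if day == "uu":
        # Last day of the month, taking leap years into account.
        last = calendar.monthrange(int(year), int(month))[1]
        return "%s-%s-01/%s-%s-%02d" % (year, month, year, month, last)
    return date  # nothing unspecified, leave as is


if __name__ == "__main__":
    for value in ["12uu", "199u", "1999-uu", "1999-01-uu", "1999-uu-uu"]:
        print(value, "=>", expand_unspecified(value))
```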
On outputting negative dates. I think we'd want to allow ...
-0279 (e.g. for `alldates=edtf`; and `alldates=iso8601`)
0280 BCE (for `alldates=colloquial`, or whatever the value would be).
That is, leading zero to make for 4 digits.
280 BCE (with an option for those who hate leading zeros).
Default: iso8601.
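To pin down the three renderings of a negative year listed above, here is a small Python sketch. The style labels "edtf" and "colloquial" and the zeros flag are placeholders taken from this post, not existing biblatex options.

```python
def format_year(edtf_year, style="edtf", zeros=True):
    """Render a four digit EDTF year string in one of the proposed styles.

    style="edtf" returns the input unchanged (e.g. -0279).
    style="colloquial" turns years <= 0 into a BCE year (e.g. 0280 BCE).
    zeros=False drops leading zeros in the colloquial style (e.g. 280 BCE).
    """
    if style == "edtf":
        return edtf_year
    year = int(edtf_year)
    if year <= 0:
        # Astronomical year numbering: 0000 is 1 BCE, -0279 is 280 BCE.
        number, suffix = 1 - year, " BCE"
    else:
        number, suffix = year, ""
    text = "%04d" % number if zeros else str(number)
    return text + suffix


if __name__ == "__main__":
    print(format_year("-0279"))
    print(format_year("-0279", style="colloquial"))
    print(format_year("-0279", style="colloquial", zeros=False))
```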
I think of the iso8601 output format as supporting space delimiters, while the EDTF output format would not. Something that only comes into play with datetimes, but might be selected for bibliographies containing both date times and BCE (negative) dates. For example someone writing a paper on Plato might reference a blog post as well as Plato's Republic.
They therefore may want to pass the option alldates=iso8601 to biblatex in order to output 2016-02-07 03:30:20 +10:00, as in ...
Plato (-0279). Republic. Trans. by C. D. C. Reeve. 3rd edition. Indianapolis: Hackett Publishing
Company, Inc. 392 pp. isbn: 0-87220-737-4.
Priest, John (2016). 2016-02-07 03:30:20 +10:00. Philosopher of the month: Plato. url: http : / /
If they chose alldates=edtf, by contrast, then they might want ...
Plato (-0279). Republic. Trans. by C. D. C. Reeve. 3rd edition. Indianapolis: Hackett Publishing
Company, Inc. 392 pp. isbn: 0-87220-737-4.
Priest, John (2016). 2016-02-07T03:30:20+10:00. Philosopher of the month: Plato. url: http : / /
blog.oup.com/2016/02/philosopher- of- the- month- plato/ (visited on 2016-06-13).
I mean I haven't thought much about how and where a datetime ought fit into the bibliographic entries' output. I'm not sure what the style guides say, if anything. But if we are supporting datetimes then iso8601 and edtf output choices can express something different: whether space delimiters are used. Alternatively, or in addition, you may want a space delimiter option for output datetime strings.
In terms of negative years, when outputting my own documents I'd be going for 0280 BCE
, but I realize I might be freakish here. That's also why I recommend the ISO8601 default.
All that's not very thought through. It's offered as something for you to push against.
Edit: "though" to "thought". Grammar.
Is there a way besides DeclareBibliographyOption
to test whether a dateera
option was set?
Nick, I read the two passages you quoted as identifying a contradiction in the EDTF spec.
No, I don’t think it’s a contradiction. 4. provides the definition of “unspecified”; 5.2.2 merely contains a definition of the precision of strings with unspecified elements. I’d paraphrase the passage from 5.2.2 as “in order not to leave the precision of any EDTF date/time string undefined, we treat strings containing ‘u’s as if the ‘u’s had been replaced by actual digits”.
Nick, any thoughts on my post containing “In summary it would be great if you all, Philip, Nick, Simon (if interested), or anyone else would address the question …”?
No, nothing new for now, I’m afraid.
@simifilm - yes - there is a \ifdateera
test and similar tests for uncertain and circa.
@plk I think there is a misunderstanding. ifdateera tests whether a certain date has an era set. But I am asking about the biblatex option. Can I test whether the dateera option of the whole document is set to secular, christian or not at all?
@simifilm Ah, there is no particular test at the moment but if we default dateera to say, secular
anyway, there would be no point as it would always be true.
@JohnLukeBentley - yes, that would be the idea with 5.2.2 and ranges. They would also set internal field markers to differentiate from ranges which are the same but were explicitly set. This would enable a style to use, say 19uu
as "20th century" or to just use the resulting range if preferred.
I know that the implementation of this whole feature is still by no means finalised, but I noted that the changes introduced quite some machinery into the .cbx
files and more specifically the cite
bibmacros (with all the \let\ifdateera\iflabeldateera
and friends).
I'd find it conceptually neater if the date printing thingy could be dealt with further upstream in the date printing macros (maybe even in \printfield{labelyear}
and friends). That would allow for the date format to be specified at one place and would then not require changes to many basic macros. (I'm thinking about the usability of all this for custom styles as well that would have to take over the machinery as well.)
It's not finalised yet. What you see there is only for citations and only for authoryear styles. I'm not clear yet about the final organisation for citations.
@moewew - this has been redone and there are no longer any changes in the .cbx files necessary.
Dear all - I would like some feedback on the time formats to support by default. I think perhaps:
- am/pm format
- 24h format
only? We also have to think about time format localisation and to this end, the separator and "am", "pm" strings will be localisation strings. There will also be an option to determine timezone output format.
@plk Have you pushed the latest changes to GitHub already? ATM, I see strange things happening (for example, the era is always printed, no matter whether dateera
is set or not).
@simifilm - yes isn't that what you mentioned? So that negative dates are never made positive? Currently, dateera
defaults to secular
and therefore is always "on" but only prints something for negative dates (unless dateeraauto
is used to force it for AD/CE dates below a certain threshold).
Ah, ok. I think it would make much more sense to print -1123
as -1123
by default – without any addition. Just print the content of the field. Also something seems to have changed about the four digits rule: Years still need to have four digits, but something like 0123
is now printed as 0123
– including the zero. This did not happen before and is a mistake IMO.
While looking at this, I realised that the datezeros
package option, the default value of which is true, was not really working properly - it suggested that it enforced leading zeros but it didn't. Now it does, for all date parts which need them. Of course, this means that we might change the default to false but then most months would be single digits which is less expectable than four digit years ...
Currently, "-" before years only happens in "edtf" output format (ex-iso8601). I suppose it could be a default but negative dates never worked at all before so there isn't much expectation at this point.
As you said, since negative dates never worked, there's nothing to break here. But I guess my expectation would be that without any option specifically activated, fields get printed more or less unaltered.
As for the datezeros
option – that explains things.
So now that we have a strict input format settled upon, EDTF, I'll revisit the argument for an additional colloquial input format.
This is only an argument about input formats, not output formats.
Firstly, indulge me specifying an example colloquial input format again, now that we have EDTF to build upon. My example will modify slightly the existing suggestions (and prior implementation).
Biblatex inputs datetime fields according to a strict format with optional colloquial alternatives.
The strict format is an EDTF string, conforming to level 1. [Explanation and examples] ....
The colloquial format offers some alternatives. There is no need to learn the colloquial format, for everything you can express in a colloquial format can be expressed as an EDTF string. However, you might find working in a colloquial format easier for some purposes.
The colloquial format alternatives are only for: negative, BCE/BC years; and approximate (circa) dates.
To express an EDTF negative year in a colloquial format: subtract 1; take the absolute value; optionally add a space; then add a "BCE" or "BC" suffix. Keep the years as four digits.
-0379 => 0380 BCE
-0025 => 0026 BC
-1234-10-11 => 1235-10-11BCE
Using a negative sign with a "BC" or "BCE" suffix is illegal. E.g. -0379 BCE will throw a fatal error.
Using "CE" or "AD" is illegal and will result in a fatal error.
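The BCE/BC rules above can be sketched in Python. This is only an illustration of the proposal, not anything biblatex or biber implements: the function name `bce_to_edtf` and the regular expression are assumptions made for the example, and input without a BCE/BC suffix is passed through without being validated as EDTF.

```python
import re

# Hypothetical pattern for the proposed colloquial form: a four-digit year,
# optional -MM and -DD parts, an optional space, then a BCE or BC suffix.
_BCE_DATE = re.compile(r"^(\d{4})((?:-\d{2}){0,2}) ?(BCE|BC)$")


def bce_to_edtf(value: str) -> str:
    """Convert a colloquial BCE/BC date into an EDTF-style negative-year date."""
    text = value.strip()
    # "CE" and "AD" are illegal in the proposal (note that "BCE" ends in "CE").
    if text.endswith("AD") or (text.endswith("CE") and not text.endswith("BCE")):
        raise ValueError(f"CE/AD suffix is not allowed: {value!r}")
    # Combining a negative sign with a BCE/BC suffix is illegal.
    if text.startswith("-") and text.endswith(("BCE", "BC")):
        raise ValueError(f"negative sign with BCE/BC suffix: {value!r}")
    match = _BCE_DATE.match(text)
    if match is None:
        return text  # assumed to be a plain EDTF string already
    bce_year = int(match.group(1))
    if bce_year == 0:
        raise ValueError("there is no year 0 BCE")
    # Reverse of "minus 1; take the absolute": 380 BCE is year -379.
    astronomical = bce_year - 1
    sign = "-" if astronomical > 0 else ""
    return f"{sign}{astronomical:04d}{match.group(2)}"


print(bce_to_edtf("0380 BCE"))
print(bce_to_edtf("1235-10-11BCE"))
```

Because the result is an ordinary negative-year string, nothing downstream would need to know which input form was used.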
In EDTF approximate dates are expressed with a tilde ~. The colloquial alternative is to prefix the date(time) with a "c", optionally with a space delimiter. E.g.
c -0379
c0380 BCE
c 1487/c1490
Using a tilde "~" with a circa "c" prefix is illegal. E.g. c 1230~ will throw a fatal error.
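The circa rule can be sketched the same way. Again this is a hedged illustration rather than a description of biblatex: `circa_to_edtf` is a hypothetical name, each side of a "/" range is treated independently, and any BCE/BC suffix is assumed to have been converted separately beforehand.

```python
import re

# Hypothetical pattern: a leading "c", an optional space, then the date.
_CIRCA = re.compile(r"^c ?(.+)$")


def circa_to_edtf(value: str) -> str:
    """Rewrite 'c'-prefixed dates as tilde-suffixed approximate dates."""

    def convert(part: str) -> str:
        part = part.strip()
        match = _CIRCA.match(part)
        if match is None:
            return part  # no circa prefix: leave this endpoint unchanged
        date = match.group(1)
        # Mixing the "c" prefix with a tilde is illegal in the proposal.
        if date.endswith("~"):
            raise ValueError(f"circa prefix combined with tilde: {part!r}")
        return date + "~"

    # Handle each endpoint of a range such as "c 1487/c1490" on its own.
    return "/".join(convert(part) for part in value.split("/"))


print(circa_to_edtf("c -0379"))
print(circa_to_edtf("c 1487/c1490"))
```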
You'll have observed that some of the modifications to the previous suggestions are for the sake of simplifying and tightening up the colloquial format. Specifically in:
For @nickbart1980 is right to set store by "clarity and elegance" ...
@nickbart1980 gave a couple of arguments against a colloquial input format.
Firstly,
Allowing “colloquial” input formats would only water down biblatex’s clarity and elegance.
Not if we carefully distinguish the strict (now EDTF) format from the colloquial. Moreover, ensure that there is nothing in the colloquial format that can't also be expressed by the strict format. That allows anyone who doesn't care for the colloquial format, to ignore it. This would require being clear about the distinction in the documentation. "Here is the strict EDTF format ... that's all you need, but here are some colloquial alternatives if you find them handy ...".
Secondly,
Also, if we allow “colloquial” formats here, we’d also have to accept other “colloquial” formats like “23 Apr 2016”, “23/04/2016” and many others. I’d be strongly opposed to any of this.
That doesn't follow. You can allow some colloquial formats without allowing all colloquial formats. That is, if the basis for allowing a colloquial format is to judge each proposal on its own merits. I too would be strongly opposed to input formats like “23 Apr 2016”, “23/04/2016”.
@plk gave a slightly different argument ...
after we throw open the doors to colloquial formats, it can never be closed
It is true that if you draw the line at a strict format (like EDTF) it is easier to point to a principle like "We don't allow colloquial input formats" as a clear rule that might dissuade others from trying. However, I'd suggest there's no special difficulty in holding fast to a rule like "We don't allow colloquial input formats that have no good reason for being".
Again, I'm thinking of biblatex's potential use as a front end format, as in a single markdown + biblatex document. A context, that is, where human readability and understandability would be required. But even when biblatex entries are kept in their own separate file, as they usually are, I'd suggest human readability and understandability are worth a great deal.
I'm less wedded to the "c" prefix approximate alternative. But the BCE/BC alternative seems important as that's how we traditionally date writing and objects in the ancient world (below year 0000/1 BCE).
An argument could be made that we ought to promote the better date format, a format like EDTF that uses negative years (with a calendar that uses the year zero), and try to usurp the traditional BCE/BC scheme altogether. That is, in order that one day professors giving lectures will describe Plato's Republic to their students as being "... written about minus three seventy nine".
But even if one was committed to such a view then a transition period would be necessary as people move from their traditional way of dating, to the modern.
It is the minus-one-take-the-absolute conversion, as from origdate={-0379} to "380 BCE", which one has to do when reading a biblatex file and correlating it with what one sees on the copyright page of one's copy of Plato's Republic, that makes the EDTF/ISO negative year not readily understandable - in the sense that there's an additional cognitive burden one has to take on every time one reads the biblatex file.
Allowing origdate={0380 BCE}, by contrast, entails that one can forgo the cognitive burden of the minus-one-take-the-absolute conversion when reading the biblatex file. And, again, this is not an argument for having the colloquial format instead of the EDTF format.
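To make the reader's mental step concrete, here is a small Python sketch of the conversion from an EDTF/ISO year to the traditional label. The function name `edtf_year_to_label` is hypothetical, and the sketch says nothing about how biblatex itself formats output; it only shows the arithmetic.

```python
def edtf_year_to_label(edtf_year: str) -> str:
    """Turn an EDTF/ISO year such as '-0379' into a traditional label."""
    year = int(edtf_year)
    if year <= 0:
        # Year 0000 is 1 BCE, so subtract 1 and take the absolute value.
        return f"{1 - year} BCE"
    # Positive years need no conversion.
    return str(year)


print(edtf_year_to_label("-0379"))
print(edtf_year_to_label("0000"))
```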
I highly recommend having the colloquial input format in addition to the EDTF input format. For negative, BCE/BC, years at least.
With the latest changes something is severely broken. No matter what I do, I get the following fatal error:
(/Users/simi/Library/texmf/tex/latex/biblatex/blx-dm.def)
! TeX capacity exceeded, sorry [input stack size=5000].
\def
l.12643 bibwarn=false}
I know, it was a push just for insurance due to travel. Looking at it now.
All updated now and this draft has complete EDTF level 1 implemented. See the PDF doc for how 5.2.2 unspecified date parts work. Time format output is not done yet.
The only argument for colloquial input formats in core is one of readability (since all colloquial input can be handled by regexp sourcemaps if necessary) but I'm not sure this is enough because so many people are moving to using .bib GUI front ends which would be able to do this presentation independently of the source data. I am not sure which ones do/can as I don't use them myself but I am keen to promote the data/presentation separation which latex/biblatex encourages at all levels, including bib data/presentation.
There currently is a problem with dateera=christian, it produces strange errors. dateera=secular seems to work though.
Should be fixed now.
Yeah, this is fixed now, thank you. Sorry for bringing these things up again, but now negative dates are always printed with a minus sign. If dateera=christian is set, the result is something like -1234 BC which AFAIU is not valid in any case.
It's fine - useful to have some feedback. This should be fixed now too.
Now I get another error. This is the MWE
\documentclass[a4paper]{article}
\usepackage{fontspec}
\usepackage[american]{babel}
\usepackage{csquotes}
\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@book{buch,
  author = {Wurm, Tom},
  title = {Das Buch},
  date = {-2988},
  origdate = {-1988},
  location = {Die Stadt},
  publisher = {Der Verlag}}
\end{filecontents}
\usepackage[style=authoryear,%
backend=biber]{biblatex}
\addbibresource{\jobname.bib}
\begin{document}
\cite{buch}

\printbibliography
\end{document}
And it gives me:
! Undefined control sequence.
\edef \blx@tempa {\blx@dateera@bce }
l.23 \cite{buch}
?
Package biblatex Warning: Bibliography string '' undefined
(biblatex) at entry 'buch' on input line 23.
(compiling luc: /usr/local/texlive/2016/texmf-var/luatex-cache/generic/fonts/otl/lmroman12-regular.luc)
(load luc: /Users/simi/Library/texlive/2016/texmf-var/luatex-cache/generic/fonts/otl/lmroman12-regular.luc)
(compiling luc: /usr/local/texlive/2016/texmf-var/luatex-cache/generic/fonts/otl/lmroman12-bold.luc)
(load luc: /Users/simi/Library/texlive/2016/texmf-var/luatex-cache/generic/fonts/otl/lmroman12-bold.luc)
! Undefined control sequence.
\edef \blx@tempa {\blx@dateera@bce }
l.26 \end {document}
?
@plk Well the chief argument for the colloquial input format was understandability, rather than (human) readability, as distinguished, but I trust we mean to reference the same thing.
(And it wasn't the only argument. There were also the arguments that: it doesn't prevent someone from using EDTF exclusively; and it doesn't entail that all colloquial inputs, that all individuals might suggest, must be permitted).
It is good of you to mention the data/presentation separation issue. I agree this is worth preserving.
Observe that my latest colloquial input format suggestion, unlike my initial suggestion, affords such a complete data/presentation separation. I've gotten rid of "CE"/"AD" altogether (although you may prefer to allow it, if a colloquial input format is allowed), and once you've parsed something like 0380 BCE into the core it can be indistinguishable from a date parsed from -0379. That is, I agree "BCE/BC" (or "CE"/"AD") in an input date ought not have any significance for the output date (beyond signifying the ordinal date). All decisions about the output format can, and ought, be made at the option stage, or style stage.
I'd also agree that most folk will continue both: to use GUI front ends to generate .bib files; and to avoid reading .bib files directly. However, it is the potentially important and popular future use of biblatex embedded in a lightweight markup (e.g. markdown) document (the particular reason why I'm looking at biblatex at all) that would require understandability.
To be clear about the use case: when writing your markdown + biblatex document you'd still want to generate the biblatex from some GUI front end. But when you send that document as is, without transformation into pdf/html/etc, to someone who has no idea about document formats and biblatex (or even someone who does) they'll be better able to understand the document if it contains 0380 BCE dates. (But as document author you'd be still free to use EDTF, e.g. -0379, if you wanted to promote that format).
Recall that even if markdown + (embedded) biblatex does not quite meet Gruber's standard of "without looking like it’s been marked up", it nevertheless would meet his standard of being as "readable [understandable] as possible" and "publishable as-is, as plain text".
There are lots of reasons for working in this plain text format without, or before, transforming into a polished output format. For example you could upload your plain text document to git and invite others to download it and edit it, before you pull those edits into the master and review. Or you could just email it and receive back edits in between email quoting.
On the particulars of the colloquial input format: I am not particularly wedded to the optional spaces. I'd prefer them but if that was part of your opposition then I'd be happy for you to get rid of them. So too for the "c" approximate prefix. The main thing is the "BCE" or "BC" suffix and the minus-one-absolute calculation that would entail.
In the end it's your prerogative and entirely within the realm of reasonableness for you to decide against what I suggest. However, I hope these latest points persuade you to add the colloquial input format (modified as you see fit).
@simifilm - should be fixed.
At the moment, I won't add any colloquial input formats in the core and if there is a call for this, they'll be added via a sourcemap, potentially only for certain styles which want it (and therefore potentially not in core).
@plk Works now, and I think the current behavior makes sense. Thanks a lot.
OK Philip.
Note that the potential intermediate solution you mention would offer no advantage for the use case I had in mind (markdown + biblatex), above EDTF only. For my idea is about promoting biblatex as a universal bibliographic format, in the context of that use case. That would require (output) style independence.
I mean with (markdown + biblatex (EDTF only)) I can still promote biblatex as a universal bibliographic format, for it is (output) style independent. But I predict the lack of a BCE/BC input year in the core will dissuade some potential users.
I will console myself with what you have so far achieved: the ability to handle negative years, approximates, and uncertains. And EDTF does have elegance (thanks to Nick for bringing it to our attention). So all that is marvelous.
I'll download the latest and start testing. Thanks for all your work on it so far.
Edit: added "in the core".
I request support for:
origdate={c 0125};
origdate={c 1487/1490}; and
origdate={1750 ?}.
It would be desirable for these kinds of values to be permissible in all date fields. I use origdate as the more likely example. There might be reasons for choosing different symbols for these kinds of dates, to make parsing easier. There might be reasons for disallowing spaces.
Edit: This issue thread also combines issue "Before the common era (BCE/BC) and common era (CE/AD) date support" #422.
Kinds
When writing a scholarly piece there are several kinds of date ambiguities and uncertainties, these are listed below.
(Part of the power of Biblatex is in providing support for many and any type of style guide. But in the cases below I sometimes borrow from the ...
University of Chicago. 2010. The Chicago Manual of Style. 16th ed. Chicago: University of Chicago. http://www.chicagomanualofstyle.org/16/contents.html.
... because it sheds some light on these issues. I could have chosen another style guide.)
Circa dates. Where the scholarship is only able to fix a date approximately. Generally this is beyond the precision of a year, as in "c. 125 CE" (rather than at the precision of a day: we rarely see, in publishing, something like "c. 0125-02-20").
(University of Chicago 2010, under "10.43 Scholarly abbreviations", http://www.chicagomanualofstyle.org/16/ch10/ch10_sec043.html)
Circa date ranges.
No dates.
(University of Chicago 2010, under "14.152 'No date'", http://www.chicagomanualofstyle.org/16/ch14/ch14_sec152.html)
Question marked dates.
(University of Chicago 2010, under "14.152 'No date'", http://www.chicagomanualofstyle.org/16/ch14/ch14_sec152.html)
Ambiguities over which known date is relevant. For example we might have the following reference entry:
... but there is ambiguity around whether 1751 is the relevant original date, given ...
We could have chosen 1777 as the original date. But, in the case, we have reasons for choosing one date (1751) over another (1777). So in the reference entry we can add an annotation that explains all this. The annotation thereby handles the ambiguity ...
Biblatex already handles:
Biblatex also provides for date ranges.
So I request the additional functionality for:
origdate={c 0125};
origdate={c 1487/1490}; and
origdate={1750 ?}.
Other considerations.
Often enough there might be no semantic difference between a circa date and a question marked date. Both can be used to express an uncertainty about a date. It's rare to come across a question marked date, relative to a circa date. So it might be tempting to ignore question marked dates with the rule: "If you are uncertain about a date just designate it as a circa date".
However, there probably are going to be contexts, albeit rare, in which an author does want to maintain a semantic difference. E.g. For dates they are personally uncertain about, the author might tag with a question mark. For a date that the author knows a community of scholars has established as having a lack of precision, the author might tag with a circa.
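If the two markers are to keep distinct meanings, a parser would need to record them as separate flags. The following Python sketch shows one way to do that for the requested syntax. It is purely illustrative: `QualifiedDate` and `parse_qualified` are hypothetical names, the date text itself is not validated, and the actual symbols might differ in any real implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedDate:
    date: str                 # the date or range text, without markers
    circa: bool = False       # approximate, as in "c 0125"
    questioned: bool = False  # uncertain, as in "1750 ?"


def parse_qualified(value: str) -> QualifiedDate:
    """Split the circa prefix and question mark suffix off a date value."""
    text = value.strip()
    circa = text.startswith("c ")
    if circa:
        text = text[2:].strip()
    questioned = text.endswith("?")
    if questioned:
        text = text[:-1].strip()
    return QualifiedDate(text, circa, questioned)


print(parse_qualified("c 1487/1490"))
print(parse_qualified("1750 ?"))
```

Keeping two booleans rather than one "uncertain" flag lets a style decide later whether to print the two kinds differently or to merge them.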
On the issue of whether to support output formats "ca." or "c." as the abbreviation for circa, I'm undecided. Personally I mostly recall seeing c. 1815 and probably prefer this look, but the Chicago Manual of Style promotes 'ca.'. Perhaps both possibilities need to be supported for style authors who, in turn, provide options for users (with relevant defaults for a chosen style).