citation-style-language / schema

Citation Style Language schema
https://citationstyles.org/
MIT License

New delimiters? #338

Open denismaier opened 4 years ago

denismaier commented 4 years ago

After @jgm reported weird delimiter behavior, @bwiernik posted this suggestion:

| Delimiter | Example | Current name | New name |
| --- | --- | --- | --- |
| Delimiter between cites | Jones 1992, p. 123; Smith 1994 | delimiter on cs:citation | delimiter on cs:citation |
| Delimiter between cite and locator | Jones 1992, p. 123 | delimiter on cs:group | delimiter on cs:group |
| Delimiter between collapsed cites | Jones 1992, 1994; Smith 1994 | Not explicit; will pick up group delimiter if citations are sorted | collapse-delimiter |
| Delimiter between collapsed cites, if there is a locator | Jones 1992, p. 123; 1994, 1996; Smith 1994 | hack of after-collapse-delimiter | collapse-with-locator-delimiter |
| Delimiter between grouped cites, even if not collapsed (e.g., citations by the same author) | Jones 1992, p. 123, Jones 1994; Smith 1994 | cite-group-delimiter | cite-group-delimiter |
| Delimiter used after a collapsed cite before the next cite (typically used if "," is the regular delimiter and the collapse delimiter, but ";" should be used after collapse) | Jones 1992, p. 123, 1994; Smith 1994, Wilson | after-collapse-delimiter | after-collapse-delimiter |
| Delimiter between year suffixes | Jones 1999a, b | year-suffix-delimiter | year-suffix-delimiter |

I think adding collapse-delimiter and collapse-with-locator-delimiter explicitly makes the most sense and avoids trying to be too clever.

Inherited default values for the delimiters

Originally posted by @bwiernik in https://github.com/citation-style-language/test-suite/issues/36#issuecomment-667126525

bdarcus commented 4 years ago

> I think adding collapse-delimiter and collapse-with-locator-delimiter explicitly makes the most sense

Ideally, we'd get more feedback. I still find it confusing (the "collapse" language).

I don't, for example, understand the practical difference between cite-group-delimiter and the proposed collapse-delimiter. The examples in the table are the same, and they appear to mean the same thing in the end.

Perhaps a different citation example would demonstrate more clearly?

Using the term "group" seems another way to describe what you all use the word "collapse" to mean. Which is maybe part of the problem; mixed logics?

But as I say, would help to get input from the developers implementing this, and maybe some more style authors.

denismaier commented 4 years ago

What do we think about adding cite-group-with-locator-delimiter?

bdarcus commented 4 years ago

Is that more clear to you, in larger relation to the other delimiters?

I'm just asking questions at this point; am not clear on the answers.

denismaier commented 4 years ago

To clarify: My question was not about your comment. I was just wondering whether we should also add cite-group-with-locator-delimiter given we already have collapse-with-locator-delimiter.

bdarcus commented 4 years ago

@fbennett - does this fit with what you were thinking with your suggested rethink of citation delimiters?

fbennett commented 4 years ago

Are there styles that vary the delimiter on a locator depending on whether it is part of a group or not? That seems a very odd design choice. Apart from that, locators are not normally set off with an implicit delimiter in CSL. Is the aim to automate punctuation preceding locators in some way?

bwiernik commented 4 years ago

It’s not the delimiter preceding the locator, it’s the delimiter between collapsed cites if there is a locator.

For example, Chicago calls for the years in collapsed author-date citations to be delimited by commas, unless there is a locator, in which case they are delimited by semicolons.

So: Jones, 1990, 1992; Smith, 2000

But: Jones, 1990, p. 10; 1992, p. 15; Smith, 2000
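A minimal Python sketch of this rule, for illustration only. The `Cite` class, the function names, and the `with_locator_delimiter` parameter are hypothetical and are not part of CSL or any citeproc implementation. The sketch assumes the cites are already grouped by author, and it switches the delimiter for the whole group as soon as any cite in it carries a locator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cite:
    author: str
    year: str
    locator: Optional[str] = None


def render_collapsed_group(
    cites: List[Cite],
    collapse_delimiter: str = ", ",
    with_locator_delimiter: str = "; ",  # hypothetical name for the proposed delimiter
) -> str:
    """Render same-author cites with the author name printed only once."""
    # If any cite in the group has a locator, use the locator-aware delimiter.
    has_locator = any(c.locator is not None for c in cites)
    delimiter = with_locator_delimiter if has_locator else collapse_delimiter
    parts = [c.year if c.locator is None else f"{c.year}, {c.locator}" for c in cites]
    return f"{cites[0].author}, " + delimiter.join(parts)


def render_citation(groups: List[List[Cite]], cite_delimiter: str = "; ") -> str:
    """Join the rendered author groups with the regular cite delimiter."""
    return cite_delimiter.join(render_collapsed_group(g) for g in groups)


plain = render_citation(
    [[Cite("Jones", "1990"), Cite("Jones", "1992")], [Cite("Smith", "2000")]]
)
located = render_citation(
    [
        [Cite("Jones", "1990", "p. 10"), Cite("Jones", "1992", "p. 15")],
        [Cite("Smith", "2000")],
    ]
)
print(plain)    # Jones, 1990, 1992; Smith, 2000
print(located)  # Jones, 1990, p. 10; 1992, p. 15; Smith, 2000
```

Whether the delimiter should switch for the whole group or only next to the cites that actually have a locator is left open here; the sketch picks the former.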

bwiernik commented 4 years ago

@fbennett I don't think the current citeproc-js behavior is working as intended with respect to cite grouping.

The spec says:

Cite grouping can be activated by setting the cite-group-delimiter attribute or the collapse attributes on cs:citation (see also Cite Collapsing).

From the spec (and from my intuition), cite-grouping shouldn't be dependent on sorting. It should only depend on (1) whether cite-group-delimiter or collapse are set, and (2) whether the same-author citations are adjacent.

So, for Chicago (author-date), which doesn't sort, grouping is only activated when the same-author citations are adjacent (which is also when collapsing activates). I think the grouping logic can parallel the collapse logic exactly. If cite-group-delimiter is set but collapse is not, then cites group under the same conditions under which collapse activates, but without the collapsing.

So, if this version of the Chicago style has collapsing citations, then cite-group-delimiter should be used between them, even though no sorting occurs.

If that's the case, we only need one new delimiter, now named: cite-group-with-locator-delimiter, to handle Chicago style:

Jones, 1990, p. 10; 1992, p. 15; Smith, 2000
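The adjacency-based reading of grouping described above can be sketched in Python as follows. This is only an illustration under stated assumptions: cites are modelled as `(author, year)` tuples, the function names are made up, and nothing here reflects how citeproc-js actually implements grouping. No sorting is applied, so only same-author cites that are already adjacent end up in one group.

```python
from itertools import groupby
from typing import List, Tuple

CiteTuple = Tuple[str, str]  # (author, year); a simplification for this sketch


def group_adjacent(cites: List[CiteTuple]) -> List[List[CiteTuple]]:
    """Split cites into runs of adjacent same-author cites, keeping input order."""
    return [list(run) for _, run in groupby(cites, key=lambda c: c[0])]


def render_grouped(
    cites: List[CiteTuple],
    delimiter: str = "; ",
    cite_group_delimiter: str = ", ",
) -> str:
    """Use cite_group_delimiter inside a run and the regular delimiter between runs."""
    runs = group_adjacent(cites)
    return delimiter.join(
        cite_group_delimiter.join(f"{author} {year}" for author, year in run)
        for run in runs
    )


unsorted_cites = [
    ("Jones", "1990"),
    ("Jones", "1992"),
    ("Smith", "2000"),
    ("Jones", "1994"),
]
print(render_grouped(unsorted_cites))
# Jones 1990, Jones 1992; Smith 2000; Jones 1994
```

The trailing "Jones 1994" stays separate because it is not adjacent to the other Jones cites and the input order is left untouched.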

fbennett commented 4 years ago

I dread digging into this area again, but if we're not conforming to style requirements, it will have to happen in citeproc-js or its successors. I guess the main ask would be for the possible combinations of special-purpose delimiters, collapsing rules, disambiguation parameters, and sorting behaviour to be clearly circumscribed and backed up by systematic tests. The code of citeproc-js around inter-cite delimiters grew by increments and has gotten pretty hard to follow, and (based on past experience) I would be fearful that further incremental changes, if not constrained by validation and exercised by a very thorough set of tests, might give rise to a flurry of bug reports from the field.

bdarcus commented 4 years ago

I decided to take a look at Chicago 17, to see how it describes all this. Some selected excerpts, that also touch on related issues:

15.23

Locator delimiters.

When a specific page, section, equation, or other division of the work is cited, it follows the date, preceded by a comma.

15.30

Two or more references in a single parenthetical citation are separated by semicolons.

... but the example in fact shows three different authors in the citation, so the semicolons delineate author groups, where each group has a single cite. I wish they had included more than one cite from the same author in that example.

Additional works by the same author(s) are cited by date only, separated by commas except where page numbers are required.

This is what we reinterpret as "collapsing," but note this is a within-author-group by-date delimiter.

And while I see it, here are "see also" suffixes; so suffix delimiter.

Additional references prefaced by “see also” follow any other references

Also, an interesting point on cite order, while I'm at it:

The order in which they are given may depend on what is being cited, and in what order, or it may reflect the relative importance of the items cited. If neither criterion applies, alphabetical or chronological order may be appropriate. Unless the order is prescribed by a particular journal style, the decision is the author’s.

So Chicago isn't actually unsorted; it's that author (of the manuscript) order takes precedence on a citation-by-citation basis.

The only way to support that fully (and I don't think we should unless users request it) is to allow a user to tag an individual citation to preserve order.

denismaier commented 4 years ago

Side issue: I don't currently have the Chicago Manual at the office, so I decided to check Turabian on this... Interestingly, this seems to be one of the places where the two manuals actually diverge. Turabian advocates using semicolons consistently between different citations.

So, where Chicago wants:

Jones 1990, 1992; Smith 2000

Turabian requires:

Jones 1990; 1992; Smith 2000

denismaier commented 4 years ago

> And while I see it, here are "see also" suffixes; so suffix delimiter.
>
> Additional references prefaced by “see also” follow any other references

Also, these have to be excluded from automatic sorting. Are they?

bwiernik commented 4 years ago

@bdarcus I really don’t understand your objection to the term “collapsing”. It is literally what styles like APA and Chicago are doing, and I feel it’s interfering with our ability to discuss the substantive issue. It seems like this has bothered you for a decade. What exactly is your objection to describing adjacent same-author citations being rendered in date-only or suffix-only form as “collapsing”?

> So Chicago isn't actually unsorted; it's that author (of the manuscript) order takes precedence on a citation-by-citation basis.

From the perspective of CSL, this is unsorted. “It’s sorted in the manner the writer finds meaningful” means no automatic sorting is applied. From a CSL perspective, that’s functionally the same as unsorted. By analogy, you could say the order of the bibliography in a numeric style without sort keys is not unsorted, but rather “in the order of appearance in the document as chosen by the document writer”. That’s functionally the same as the processor not sorting the bibliography at all.

denismaier commented 4 years ago

I don't have any problems with collapse. But just for the sake of it: Are there suggestions for other terms?

bdarcus commented 4 years ago

> @bdarcus I really don’t understand your objection to the term “collapsing”. It is literally what styles like APA and Chicago are doing, and I feel it’s interfering with our ability to discuss the substantive issue.

It's just not clear to me. It doesn't match how I think about this, or what I see in style guides.

But I'm not saying not to use it; I'm suggesting:

  1. that we cross-check to make sure, per Frank's concern, that we aren't making this more complex than we need to.
  2. most important, that if we do keep it, we stay aware that I'm probably not the only one confused by it, and that we make sure to explain it in docstrings and the spec, with good examples that clearly demonstrate the differences. As I say, even Chicago is unclear in at least one place.

> But just for the sake of it: Are there suggestions for other terms?

No; as I say, clear documentation (well, and per Frank's suggestion, tests) can probably address my concern.

zepinglee commented 2 years ago

I've just implemented the grouping and collapsing features of citeproc-lua and I generally agree with @denismaier's idea. If I understand correctly, grouping is about changing the order of cites and collapsing is about suppressing part of the contents. They are completely different procedures if no sorting is applied to cites. The possible cases are listed in the following table.

| Grouping | Collapsing | Output |
| --- | --- | --- |
| - | - | Jones, 1992; Jones 1996; Smith 2010; Jones 1994 |
| - | Activated | Jones, 1992, 1996; Smith 2010; Jones, 1994 |
| Activated | - | Jones, 1992, Jones 1996, Jones, 1994; Smith 2010 |
| Activated | Activated | Jones, 1992, 1996, 1994; Smith 2010 |

In the case of APA, the cites are first sorted by author names, which effectively implies name grouping. Thus the distinction between grouping and collapsing is not that obvious.
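The four combinations above can be sketched in Python as two independent steps. This is a simplified illustration, not the citeproc-lua code: cites are plain `(author, year)` tuples, the function names are invented, every cite is rendered uniformly as "Author year", and the delimiters are hard-coded to ", " within an author run and "; " elsewhere.

```python
from typing import Dict, List, Tuple

CiteTuple = Tuple[str, str]  # (author, year); a simplification for this sketch


def group_cites(cites: List[CiteTuple]) -> List[CiteTuple]:
    """Grouping: move each cite next to the first cite by the same author."""
    by_author: Dict[str, List[CiteTuple]] = {}
    for cite in cites:
        by_author.setdefault(cite[0], []).append(cite)
    # Dicts keep insertion order, so authors appear in order of first occurrence.
    return [cite for run in by_author.values() for cite in run]


def render_cites(
    cites: List[CiteTuple], grouping: bool = False, collapsing: bool = False
) -> str:
    """Reorder if grouping is on, then suppress repeated names if collapsing is on."""
    if grouping:
        cites = group_cites(cites)
    out = ""
    prev_author = None
    for author, year in cites:
        same_author = author == prev_author
        if not out:
            out = f"{author} {year}"
        elif same_author and collapsing:
            out += f", {year}"  # collapsing: drop the repeated author name
        elif same_author and grouping:
            out += f", {author} {year}"  # grouping only: keep the name
        else:
            out += f"; {author} {year}"
        prev_author = author
    return out


cites = [("Jones", "1992"), ("Jones", "1996"), ("Smith", "2010"), ("Jones", "1994")]
for grouping in (False, True):
    for collapsing in (False, True):
        print(grouping, collapsing, render_cites(cites, grouping, collapsing))
```

With sorted input, as in APA, `group_cites` leaves the order unchanged, which is why the two steps are hard to tell apart there.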

Circeus commented 1 year ago

There's a final use case that this proposal overlooks: the delimiter between years when collapse="year-suffix" is used, that is, the one between "b" and "2000a" in "Jones, 1999a, b, 2000a, b".
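A short Python sketch of where that extra delimiter would sit. The function and the `year_delimiter` parameter are hypothetical names for illustration; only `year_suffix_delimiter` corresponds to an existing CSL attribute (year-suffix-delimiter). Cites are assumed to be `(year, suffix)` pairs by one author, already ordered by year.

```python
from itertools import groupby
from typing import List, Tuple


def collapse_year_suffix(
    author: str,
    cites: List[Tuple[str, str]],  # (year, suffix) pairs, ordered by year
    year_suffix_delimiter: str = ", ",
    year_delimiter: str = ", ",  # hypothetical: the delimiter between year groups
) -> str:
    """Print each year once, followed by its suffixes, then join the year groups."""
    year_groups = []
    for year, run in groupby(cites, key=lambda c: c[0]):
        suffixes = [suffix for _, suffix in run]
        year_groups.append(year + year_suffix_delimiter.join(suffixes))
    return f"{author}, " + year_delimiter.join(year_groups)


jones = [("1999", "a"), ("1999", "b"), ("2000", "a"), ("2000", "b")]
print(collapse_year_suffix("Jones", jones))
# Jones, 1999a, b, 2000a, b
print(collapse_year_suffix("Jones", jones, year_delimiter="; "))
# Jones, 1999a, b; 2000a, b
```

The second call shows why a separate setting might matter: with the same delimiter in both positions, the boundary between the 1999 and 2000 groups is not visually distinct.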