bdarcus / csl-next

An experimental reimagining of CSL
Mozilla Public License 2.0

Add sorting, grouping, disambiguation #121

Closed bdarcus closed 1 year ago

bdarcus commented 1 year ago

This is absolutely crucial functionality to get right, and is the next important step.

In doing so, also assess, and if necessary refine, the Style model for these.

See also #44, #22

Sorting

Requirements

The current Style model assumes separate configurable keys.

  sort:
    - key: author # remember, substitution applies here
      order: ascending
    - key: year
      order: ascending

This is the most general and flexible approach from this angle, and should be transparent to users in some GUI representation of it.

So I'm confident this is the right approach.

Strategies

If faithfully translated into code, it would suggest a parameter that is exactly that: an array of such objects, and so a multi-key sort.

Since I'm a mediocre JavaScript programmer, I couldn't figure out how to elegantly do this, so that's really my only question.

This is as far as I got initially, which is a fixed author-date sorting function:

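function sortReferences (references: Reference[]): Reference[] {
  // TODO generalize; this is a fixed sorting function for author-date
  // Copy first so the caller's array is not mutated by sort().
  // Year is compared ascending, matching the config above; Number() also
  // covers the case where `issued` is a year string such as "2020".
  return [...references].sort((a, b) =>
    a.authorSortString().localeCompare(b.authorSortString()) ||
    Number(a.issued) - Number(b.issued));
}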

See:

https://gist.github.com/bdarcus/59d6d90783f29511a6551a19b7fca7bb

So a simple, and more limited, approach would just be:

sort: author-date

But I'd prefer not to give up on the more flexible and general approach just yet.
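
For what it's worth, the multi-key version may not need much more code than the fixed one. Below is a minimal sketch, assuming one comparator per key and a simplified reference shape; `SortConfig`, `Ref`, `comparators`, and `sortByConfig` are hypothetical names for illustration, not part of the current codebase.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
type SortKey = "author" | "year";
interface SortConfig { key: SortKey; order: "ascending" | "descending"; }
interface Ref { author: string; year: number; }

// One comparator per key; each returns <0, 0, or >0 for ascending order.
const comparators: Record<SortKey, (a: Ref, b: Ref) => number> = {
  author: (a, b) => a.author.localeCompare(b.author),
  year: (a, b) => a.year - b.year,
};

// Walk the configured keys in order; the first non-zero comparison decides.
function sortByConfig(refs: Ref[], config: SortConfig[]): Ref[] {
  return [...refs].sort((a, b) => {
    for (const { key, order } of config) {
      const result = comparators[key](a, b);
      if (result !== 0) return order === "ascending" ? result : -result;
    }
    return 0; // equal on every configured key
  });
}
```

Adding a new sort key would then only mean adding one entry to `comparators`, which seems to keep the Style model and the code in step with each other.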

Grouping and disambiguation

Requirements, etc.

Here's the current model config options for the group keys:

https://github.com/bdarcus/csl-next/blob/e0965d82bcccbc8babb5157cd5d6c4d8987663f8/src/style/options.ts#L17

Example:

  group:
    # be explicit about grouping, which is core logic
    - author
    - year

... which should produce something like:

Doe (2022a, b) said X (see also Jones 2021c).

We need to:

  1. get the year suffix from the sorted list of all cited references, which corresponds to an index within an author-year group.
  2. based on the Style config, when a group has more than one item, print the author name only once for the group rather than repeating it for each item.
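
The first step could be sketched roughly as follows. This assumes the references are already sorted, that the group key is simply author plus year, and that the group index is 1-based; `Ref`, `addGroupHints`, and `yearSuffix` are hypothetical names, and the hint fields mirror the `procHints` shown in the closing comment below.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
interface Ref { citekey: string; author: string; year: string; }
interface ProcHints { groupKey: string; groupIndex: number; groupLength: number; }

// Assumes `sortedRefs` is already in its final sort order.
function addGroupHints(sortedRefs: Ref[]): Record<string, ProcHints> {
  // Bucket references by an "author:year" key, preserving sort order.
  const groups: Record<string, Ref[]> = {};
  for (const ref of sortedRefs) {
    const key = `${ref.author}:${ref.year}`;
    (groups[key] = groups[key] || []).push(ref);
  }
  // Record each reference's 1-based position within its group.
  const hints: Record<string, ProcHints> = {};
  for (const groupKey of Object.keys(groups)) {
    const members = groups[groupKey];
    members.forEach((ref, i) => {
      hints[ref.citekey] = { groupKey, groupIndex: i + 1, groupLength: members.length };
    });
  }
  return hints;
}

// A suffix is only needed when the group is ambiguous: 1 -> "a", 2 -> "b", ...
// (a sketch only: indexes above 26 are not handled).
function yearSuffix(h: ProcHints): string {
  return h.groupLength > 1 ? String.fromCharCode(96 + h.groupIndex) : "";
}
```

Keeping the hints separate from the reference data means the suffix is derived at render time rather than stored, which seems closer to the description above.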

It's also worth noting that there are different grouping logics. For example, when "as-cited" is used as the key for a bibliography, my thought is that it should generate something like:

https://tex.stackexchange.com/questions/171175/biblatex-mcite-add-arbitrary-text-in-references-with-subentries

image

EDIT: I guess that may be more a (end)note-cite than bibliography, particularly when considering this other example they include?

image

Either way, 1.0 doesn't support this sort of style, but I think we can support it fairly easily (though the configuration needs more thought).

Strategies

My hunch is that it's better for the code to faithfully match my description above, as that is likely to be clearer, and perhaps easier to manage.

But I think existing 1.0 processors do disambiguation differently?

If grouping can also have an array of arbitrary keys, however, that raises some implementation and Style design questions.

bdarcus commented 1 year ago

Closed via #129, #132, #133

The grouping logic is added to the procHints property.

  ProcReference {
    data: {
      type: "article",
      title: "The Title",
      author: [ { name: "United Nations", parse: false } ],
      issued: "2020",
      citekey: "un"
    },
    procHints: { groupIndex: 1, groupKey: "United Nations:2019", groupLength: 1 }
  }

... and then incorporated in rendering:

  [
    [ { contributors: "author", procValue: "Doe, Jane" } ],
    {
      date: "issued",
      format: "year",
      wrap: "parentheses",
      procValue: "2022b"
    },
    [ { title: "title", procValue: "The Title" } ],
    undefined,
    undefined
  ],
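
As a rough illustration of the grouped output, adjacent cites that share an author and year could be collapsed so that later items print only their suffix. This is a sketch under assumptions, not the code from the PRs above; `Cite` and `renderGrouped` are hypothetical names, and the separator between groups is a guess.

```typescript
// Hypothetical shape for illustration; not the project's actual type.
interface Cite { author: string; year: string; suffix: string; }

// Collapse runs of adjacent cites that share author and year,
// e.g. 2022a + 2022b by the same author -> "Doe (2022a, b)".
function renderGrouped(cites: Cite[]): string {
  const groups: { author: string; year: string; suffixes: string[] }[] = [];
  for (const cite of cites) {
    const last = groups[groups.length - 1];
    if (last && last.author === cite.author && last.year === cite.year) {
      last.suffixes.push(cite.suffix); // same group: keep only the suffix
    } else {
      groups.push({ author: cite.author, year: cite.year, suffixes: [cite.suffix] });
    }
  }
  return groups
    .map((g) => `${g.author} (${g.year}${g.suffixes.join(", ")})`)
    .join("; ");
}
```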

EDIT: just noticed a bug.