bdarcus / csln

Reimagining CSL
Mozilla Public License 2.0

Refactor sort config #61

Closed: bdarcus closed this issue 1 year ago

bdarcus commented 1 year ago

Per the discussion on a CSL 1.0 test, some styles require author sort keys to be shortened the same way they are for display.

https://github.com/citation-style-language/test-suite/issues/60

So a vector/list of sort structs isn't enough, or the sort struct needs another parameter.

Looking through the style repo with the blunt instrument of ripgrep, here are some conclusions:

  1. by far the most common primary sorting key is author, which is subject to shortening for display, and to substitution when not present
  2. per the linked test, some styles want you to sort on the shortened list
  3. almost without exception, substitutions for author are editor, title, translator.
  4. two of those substitutions are contributor/names lists, so also subject to shortening

I've already extracted substitution and name list shortening to top-level config options.

So I think a small change like the following should work?

sort:
  shorten_author: true
  specs:
    - key: author
    - key: issuedYear
      order: descending

... or even:

sort:
  shorten_author: true
  keys:
    - author
    - issued.year-descending
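To make the first proposal concrete, here is a minimal Rust sketch of what such a config could deserialize into, and how a comparator might consume it. All the type and function names (`SortConfig`, `SortSpec`, `sort_references`, `author_key`) are hypothetical and not taken from csln, and `display_limit` is only a stand-in for the top-level name-list shortening option mentioned above.

```rust
use std::cmp::Ordering;

/// Sort direction for a single key.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Order {
    Ascending,
    Descending,
}

/// Keys a style may sort on (hypothetical subset, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SortKey {
    Author,
    IssuedYear,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SortSpec {
    pub key: SortKey,
    pub order: Order,
}

#[derive(Debug, Clone, PartialEq)]
pub struct SortConfig {
    /// When true, author keys are built from the shortened name list.
    pub shorten_author: bool,
    pub specs: Vec<SortSpec>,
}

/// Minimal stand-in for a bibliography entry.
#[derive(Debug, Clone, PartialEq)]
pub struct Reference {
    pub authors: Vec<String>,
    pub issued_year: i32,
}

/// Build the author sort key, optionally from only the first
/// `display_limit` names (assumed shortening behavior).
fn author_key(r: &Reference, shorten: bool, display_limit: usize) -> String {
    let n = if shorten {
        display_limit.min(r.authors.len())
    } else {
        r.authors.len()
    };
    r.authors[..n].join(" ").to_lowercase()
}

/// Sort by each spec in turn; later specs only break ties left by earlier ones.
pub fn sort_references(refs: &mut [Reference], config: &SortConfig, display_limit: usize) {
    refs.sort_by(|a, b| {
        for spec in &config.specs {
            let ord = match spec.key {
                SortKey::Author => author_key(a, config.shorten_author, display_limit)
                    .cmp(&author_key(b, config.shorten_author, display_limit)),
                SortKey::IssuedYear => a.issued_year.cmp(&b.issued_year),
            };
            let ord = if spec.order == Order::Descending {
                ord.reverse()
            } else {
                ord
            };
            if ord != Ordering::Equal {
                return ord;
            }
        }
        Ordering::Equal
    });
}
```

With a display limit of one name, "Doe, Adams (2019)" and "Doe, Smith (2021)" tie on the author key when `shorten_author` is true, so the descending year decides and the 2021 entry sorts first. With the flag off, the full name list decides and the 2019 entry sorts first.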

Here's how biblatex does it, which is similar to my last option: simpler, yet with more options (see section 3.1.2 of the manual in general):

image

But note that it has minsortnames and variants, which means it's not just a boolean controlling the linked case. From the manual:

The first item considered in the sorting process is always the presort field of
the entry. If this field is undefined, biblatex will use the default value ‘mm’ as
a presort string. The next item considered is the sortkey field. If this field is
defined, it serves as the master sort key. Apart from the presort field, no further
data is considered in this case. If the sortkey field is undefined, sorting continues
with the name. The package will try using the sortname, author, editor, and
translator fields, in this order. Which fields are considered also depends on the
setting of the use<name> options. If all such options are disabled, the sortname
field is ignored as well. Note that all name fields are responsive to maxnames and
minnames. If no name field is available, either because all of them are undefined
or because all use<name> options are disabled, biblatex will fall back to the
sorttitle and title fields as a last resort. The remaining items are, in various
order: the sortyear field, if defined, or the first four digits of the year field
otherwise; the sorttitle field, if defined, or the title field otherwise; the
volume field.
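The name fallback described in that passage can be sketched roughly as follows. This is only an illustration of the quoted prose, not biblatex's actual implementation: the `Entry` and `UseNames` types are hypothetical, the `use<name>` options are assumed to be plain booleans, and `maxnames`/`minnames` truncation is left out.

```rust
/// Hypothetical entry holding the biblatex fields named in the quoted passage.
#[derive(Debug, Default, Clone)]
pub struct Entry {
    pub sortname: Option<String>,
    pub author: Option<String>,
    pub editor: Option<String>,
    pub translator: Option<String>,
    pub sorttitle: Option<String>,
    pub title: Option<String>,
}

/// Stand-in for the use<name> options (assumed to be simple booleans).
#[derive(Debug, Clone, Copy)]
pub struct UseNames {
    pub author: bool,
    pub editor: bool,
    pub translator: bool,
}

/// Pick the value used for the "name" step of sorting:
/// sortname, author, editor, translator, then sorttitle and title as a last resort.
pub fn name_sort_key(e: &Entry, use_names: UseNames) -> Option<&str> {
    // Per the manual, sortname is ignored when every use<name> option is disabled.
    let any_enabled = use_names.author || use_names.editor || use_names.translator;
    let candidates = [
        (any_enabled, &e.sortname),
        (use_names.author, &e.author),
        (use_names.editor, &e.editor),
        (use_names.translator, &e.translator),
        (true, &e.sorttitle),
        (true, &e.title),
    ];
    // The first enabled and defined field wins.
    for (enabled, field) in candidates {
        if enabled {
            if let Some(value) = field.as_deref() {
                return Some(value);
            }
        }
    }
    None
}
```

An entry with only an editor and a title sorts on the editor while the name options are enabled, and falls back to the title once they are all disabled.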

So to do something like what they did, it might look like this:

sort:
  rules: nty
  shorten:
    min: 4
    take: 2

I.e., we would need to allow shorten in multiple places.
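As a rough sketch of how a `shorten` option with `min` and `take` could feed into a sort key, assuming `min` is the list length at which shortening kicks in and `take` is how many names are kept (the thread does not pin these semantics down, and the names `Shorten`, `shorten_names`, and `names_sort_key` are hypothetical):

```rust
/// Hypothetical shortening rule mirroring the `shorten` block in the YAML above.
#[derive(Debug, Clone, Copy)]
pub struct Shorten {
    /// Assumed: shorten only lists with at least this many names.
    pub min: usize,
    /// Assumed: number of leading names to keep when shortening.
    pub take: usize,
}

/// Return the names that remain after applying the rule.
pub fn shorten_names<'a>(names: &'a [String], rule: Shorten) -> &'a [String] {
    if names.len() >= rule.min {
        &names[..rule.take.min(names.len())]
    } else {
        names
    }
}

/// Build a sort key from a name list, optionally shortened first.
/// The same rule type could be reused for display, which is why
/// `shorten` may need to be allowed in more than one place.
pub fn names_sort_key(names: &[String], rule: Option<Shorten>) -> String {
    let used = match rule {
        Some(r) => shorten_names(names, r),
        None => names,
    };
    used.join(" ").to_lowercase()
}
```

With `min: 4` and `take: 2`, a four-name list contributes only its first two names to the key, while a three-name list is left intact.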

bdarcus commented 1 year ago

On biblatex, here's how they define sorting in the code:

https://github.com/plk/biblatex/blob/87e29698dd55876aad48e46e3b5f630a307e899e/tex/latex/biblatex/biblatex.def#L1458-L1495

This is the default template:

\DeclareSortingTemplate{nty}{
  \sort{
    \field{presort}
  }
  \sort[final]{
    \field{sortkey}
  }
  \sort{
    \field{sortname}
    \field{author}
    \field{editor}
    \field{translator}
    \field{sorttitle}
    \field{title}
  }
  \sort{
    \field{sorttitle}
    \field{title}
  }
  \sort{
    \field{sortyear}
    \field{year}
  }
  \sort{
    \field{volume}
    \literal{0}
  }
}

bdarcus commented 1 year ago

Closed via #69