jjmccollum / context-sbl

Society of Biblical Literature (SBL) style files for ConTeXt

Add support for pagination entry field and general-purpose CSL parsing for righttext #7

Closed jjmccollum closed 2 years ago

jjmccollum commented 2 years ago

In the early development of the SBL publication support module, I tried to reserve the lefttext citation option for alternate citations (to be used, for instance, in passage citations from ancient/classical works) and the righttext option for page, section, paragraph, etc. numbers. This was a bad decision from a portability perspective, as it prevents users from switching from some other specification to SBL with minimal changes to their TeX code. My main reason was to avoid ambiguity between cases where things like page numbers were actually being cited and cases where the lefttext and righttext fields simply had plain textual content.

The process of cleaning up my code (not to mention the insightful comments of @denismaier!) has made it clear that some degree of parsing must be applied to the lefttext and righttext fields. Probably the best convention to implement at this stage of development is that of CSL locators (https://docs.citationstyles.org/en/stable/specification.html#locators), and in particular, the short forms defined in https://github.com/citation-style-language/locales. In line with the ConTeXt philosophy, these locators and their abbreviated forms are defined for many languages and can be implemented in such a way as to facilitate multilingual usage.

As an example, if we wanted to cite a portion of the Clementine Homilies 1.3 that is found in book 8, page 223 of The Ante-Nicene Fathers, we would use something like

\cite[lefttext={See}, righttext={\altloc{bk1ch3}\loc{bk8p223} for further details}][clementinehomilies]

The parser would look for patterns beginning with the locator abbreviations (here bk for book, ch for chapter, and p for page) and contextually define the \altloc and \loc macros to typeset these references in a specification-dependent way. (For SBL, the \altloc output would be "1.3", while the \loc output would be "8:223".) It's also important to note that for some entry categories (like @ancienttext and @classictext), these references may not be typeset together with the rest of the righttext (i.e., the "for further details" part); indeed, for the example above, they would be split into separate parts of the citation:

See The Clementine Homilies 1.3 (ANF 8:223) for further details
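To make the parsing step concrete, here is a minimal sketch in Python (chosen only so that it can be run outside ConTeXt). The abbreviation list is a small, hypothetical subset of the CSL short forms, and the names parse_compact and join_values are invented for illustration; joining the values with "." or ":" simply reproduces the SBL output of the example above and is not a general rule.

```python
import re

# Hypothetical subset of locator abbreviations. A longer abbreviation must be
# listed before any shorter one that is its prefix (e.g. "pt" before "p").
ABBREVIATIONS = ["chap", "para", "sec", "vol", "bk", "ch", "pt", "p", "v"]

# One abbreviation followed by a number or a hyphenated range of numbers.
LOCATOR = re.compile(r"(%s)(\d+(?:-+\d+)?)" % "|".join(ABBREVIATIONS))


def parse_compact(text):
    """Split a string such as 'bk8p223' into (abbreviation, value) pairs."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = LOCATOR.match(text, pos)
        if match is None:
            raise ValueError("unrecognized locator at %r" % text[pos:])
        pairs.append((match.group(1), match.group(2)))
        pos = match.end()
    return pairs


def join_values(pairs, separator):
    """Join the parsed values with a specification-dependent separator."""
    return separator.join(value for _, value in pairs)


print(join_values(parse_compact("bk1ch3"), "."))   # alternate locator
print(join_values(parse_compact("bk8p223"), ":"))  # primary locator
```

Note that the order of the alternatives matters: without it, "pt1" would be read as the page locator "p" followed by unparseable text.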

If no \loc or \altloc commands can be found in the righttext, then the parser should check if any prefix of the righttext is a number or range. If there is one, then it should be rendered using the appropriate default locator. Typically, this will be "page", but which locator we consider "appropriate" may be decided through a series of fallbacks, in the following order:

Any part of the righttext that is not matched by the parser in either of the previous checks should be printed as-is in the usual place for the righttext (i.e., at the end of the citation for the corresponding entry).
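The prefix check could look roughly like the following Python sketch. The function name split_leading_locator is hypothetical, only Arabic numerals are recognized (Roman numerals such as xvii would need extra handling), and the choice of the default locator is left to the caller.

```python
import re

# A number, optionally followed by a hyphen or en-dash and a second number.
# The lookahead rejects prefixes such as "2nd", where letters follow the digits.
NUMBER_OR_RANGE = re.compile(r"\s*(\d+(?:\s*(?:-{1,2}|\u2013)\s*\d+)?)(?!\w)")


def split_leading_locator(righttext):
    """Return (locator, remainder); locator is None if there is no numeric prefix."""
    match = NUMBER_OR_RANGE.match(righttext)
    if match is None:
        return None, righttext
    return match.group(1), righttext[match.end():]


print(split_leading_locator("223 for further details"))
print(split_leading_locator("for further details"))
```

The remainder returned by the function is the part that would be printed as-is in the usual place for the righttext.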

This is a pretty extensive proposal, and since it should be broad enough to be applicable to all bibliographic specifications (not just SBL), it would likely involve changes to code in the ConTeXt core (particularly publ-ini.lua, publ-ini.mkiv, publ-imp-cite.mkvi, and publ-imp-default.mkvi). For this reason, it may be better to create a separate repo for implementing and testing these changes. If we decide to do this, I can resolve this issue and copy its contents to the new repo.

jjmccollum commented 2 years ago

Full support for the pagination entry field will depend on whether or not the ConTeXt team is able (and willing) to implement support for specifying default field values for a given category. That said, the primary focus of this issue will be the implementation of a CSL parser.

If we assume that a CSL locator will be specified as the argument of the \loc or \altloc macro in the righttext option, then a reasonable way to proceed would be to temporarily redefine the \loc and \altloc macros according to the category of the current entry. For example, an inreference or inbook entry would serialize the \loc argument after the shorthand of the parent reference, if there is one, while an ancienttext entry would serialize the \altloc argument after the title of the ancient text and the \loc argument in parentheses after a short citation of the cross-referenced source. These macros would parse their arguments (in Lua, most likely, using a regex search for the locator abbreviations), store the parsed locator values in temporary variables accessible on the TeX side, and serialize these values appropriately according to certain checks (does the current entry have a shorthand, does the cross-referenced entry have a shorthand, does the cross-referenced entry have volume and part numbers that need to be prefixed to any cited page numbers, etc.).

Of course, it's possible that a righttext value might not include a \loc or \altloc call, so we also have to detect that up front and intelligently determine what we need to serialize in the citation. All in all, it's more complicated than I'd like it to be, and it makes me wish that ConTeXt publication support were more programmatic. But is this basically the idea you had in mind, @denismaier?

jjmccollum commented 2 years ago

Actually, a simpler and more ConTeXt-appropriate way to implement the \loc and \altloc macros would be as functions with key-value arguments:

\cite[lefttext={See}, righttext={\altloc[bk=1,ch=3]\loc[bk=8,p=223] for further details}][clementinehomilies]

This would make parsing significantly easier and less error-prone, as no complex regular expression matching would be needed, and arguments with non-numeric characters or TeX macros (e.g., p=xvii or fol={107\high{r}}) would be supported. This would also facilitate making the locator names language-specific based on the various locales at https://github.com/citation-style-language/locales, as the keys could be specified as \c!bk, \c!ch, \c!col, etc. and read in a language-dependent way.
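As a rough illustration of why key-value input is easier to handle, here is a small Python sketch that splits an assignment list on top-level commas only, so that braced values such as {107\high{r}} or {4, 7} survive intact. It only approximates what ConTeXt's own key-value machinery does, and all function names are hypothetical.

```python
def split_top_level(text, separator=","):
    """Split on the separator, ignoring separators nested inside braces."""
    parts, depth, current = [], 0, []
    for char in text:
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
        if char == separator and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(char)
    parts.append("".join(current))
    return parts


def strip_braces(value):
    """Remove one enclosing brace pair if it wraps the whole value."""
    value = value.strip()
    if not (value.startswith("{") and value.endswith("}")):
        return value
    depth = 0
    for index, char in enumerate(value):
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0 and index < len(value) - 1:
                return value  # the first group closes early, e.g. "{a}{b}"
    return value[1:-1]


def parse_assignments(text):
    """Turn 'bk=8,p=223' into a dict; return None if the input is a raw value."""
    result = {}
    for part in split_top_level(text):
        if "=" not in part:
            return None  # not an assignment list, so treat the input as raw text
        key, value = part.split("=", 1)
        result[key.strip()] = strip_braces(value)
    return result


print(parse_assignments("bk=8,p=223"))
print(parse_assignments(r"fol={107\high{r}}"))
```

No pattern matching against locator abbreviations is involved: the keys are read directly, and the values can contain arbitrary text or macros.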

If that sounds good, then I'll attempt to proceed this way. I'm pretty sure that the starttexdefinition ... stoptexdefinition way of defining functions supports key-value arguments, and if so, it would be my preferred way of doing this, but I'll have to do some digging and maybe send an e-mail about this to the mailing list. The ConTeXt Garden wiki entry at https://wiki.contextgarden.net/starttexdefinition includes an example that looks like

%% \def\command#name#options{…}
\starttexdefinition command #name #options
    …
\stoptexdefinition

But it's not clear to me what #name and #options are supposed to be. The namespace and the key-value pairs?

jjmccollum commented 2 years ago

@denismaier, do you think it would be good to support an optional unnamed argument for \loc and \altloc, as well, so that something like

\cite[lefttext={See}, righttext={\altloc{1.3}\loc{8.223} for further details}][clementinehomilies]

would still be acceptable?

jjmccollum commented 2 years ago

Okay, I can now get the following MWE working with minor modifications:

% macros=mkvi

\starttexdefinition unexpanded loc [#1]
    \doifassignmentelse{#1} {
        % TODO: the parameters should be prefixed with \c! for multilingual support; 
        % their equivalents in other languages should be assigned according to the mappings at https://github.com/citation-style-language/locales
        \getemptyparameters[btxloc][
            bk=,
            chap=,
            col=,
            fig=,
            fol=,
            no=,
            l=,
            n=,
            op=,
            p=,
            para=,
            pt=,
            sec=,
            sv=,
            v=,
            vol=
        ]
        \getparameters[btxloc][
            #1% this will overwrite the empty defaults for all locators that are specified
        ]
        <Citation>\btxcomma% placeholder for the citation
        \doifnot{\btxlocbk}{} {
            book \btxlocbk\btxcomma
        }
        \doifnot{\btxlocchap}{} {
            chapter \btxlocchap\btxcomma
        }
        \doifnot{\btxloccol}{} {
            column \btxloccol\btxcomma
        }
        \doifnot{\btxlocfig}{} {
            figure \btxlocfig\btxcomma
        }
        \doifnot{\btxlocfol}{} {
            folio \btxlocfol\btxcomma
        }
        \doifnot{\btxlocno}{} {
            number \btxlocno\btxcomma
        }
        \doifnot{\btxlocl}{} {
            line \btxlocl\btxcomma
        }
        \doifnot{\btxlocn}{} {
            note \btxlocn\btxcomma
        }
        \doifnot{\btxlocop}{} {
            opus \btxlocop\btxcomma
        }
        \doifnot{\btxlocp}{} {
            page \btxlocp\btxcomma
        }
        \doifnot{\btxlocpara}{} {
            ¶ \btxlocpara\btxcomma
        }
        \doifnot{\btxlocpt}{} {
            part \btxlocpt\btxcomma
        }
        \doifnot{\btxlocsec}{} {
            § \btxlocsec\btxcomma
        }
        \doifnot{\btxlocsv}{} {
            sub verbo \btxlocsv\btxcomma
        }
        \doifnot{\btxlocv}{} {
            verse \btxlocv\btxcomma
        }
        \doifnot{\btxlocvol}{} {
            volume \btxlocvol\btxcomma
        }
        \removeunwantedspaces
        \removepunctuation
        \btxperiod
    } {
        % If the input is just a value and not an assignment list,
        % then print it after the citation without reformatting anything.
        % Default pagination settings would be enforced here.
        <Citation>\btxcomma #1\btxperiod
    }
\stoptexdefinition

\starttext
    \loc[bk=1,chap=3,para=4,sec={1.3}]\blank
    \loc[1.3]\blank
\stoptext

The only difference from the examples described above is that the input to \loc or \altloc (whether a proper locator assignment list or a raw input) is always specified in square brackets. I've also changed ch=3 from the above examples to chap=3, as that is the proper CSL abbreviation for "chapter."

denismaier commented 2 years ago

do you think it would be good to support an optional unnamed argument for \loc and \altloc, as well, so that something like

\cite[lefttext={See}, righttext={\altloc{1.3}\loc{8.223} for further details}][clementinehomilies]

would still be acceptable?

Yes, I think unnamed arguments would be a good idea. Maybe we need some sort of a fallback cascade. Like so:

righttext = {5}
righttext = {\loc[5]}
righttext = {\loc[p.5]}

jjmccollum commented 2 years ago

I had the same thing in mind. In the MWE above, the \doifassignmentelse block will parse \loc[p=5] (the preferred syntax) one way and will use default pagination settings for \loc[5]. For the first scenario, where \loc is not invoked at all, I'll need some sort of \doifinstringelse check (without expanding the \loc command in the righttext argument) so that the cases with and without a \loc call can be handled separately.
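The cascade can be summarized with the following Python sketch. It handles only a single locator per \loc call, the function name classify_righttext is hypothetical, and the default locator "p" stands in for whatever the category-dependent pagination default turns out to be.

```python
import re

# Matches a \loc[...] call and captures the bracketed argument.
LOC_CALL = re.compile(r"\\loc\[([^\]]*)\]")


def classify_righttext(righttext, default_locator="p"):
    """Return (locator, value) for the three forms of the fallback cascade."""
    call = LOC_CALL.search(righttext)
    if call is None:
        # No \loc call: a bare value gets the default locator.
        return default_locator, righttext.strip()
    argument = call.group(1)
    if "=" in argument:
        # Key-value form, e.g. \loc[p=5]: the key names the locator.
        key, value = argument.split("=", 1)
        return key.strip(), value.strip()
    # Unnamed form, e.g. \loc[5]: fall back to the default locator.
    return default_locator, argument.strip()


for text in ["5", r"\loc[5]", r"\loc[p=5]"]:
    print(classify_righttext(text))
```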

jjmccollum commented 2 years ago

@denismaier I've discussed the issue of parsing the locators on the mailing list, and Hans has suggested

just redefine \loc on the fly depending on where it's used and/or use keys

\cite[lefttext={See},volume=8,page=223]

or so .. imo parsing content is not really a good solution and probably also not reliable

I agree with him that parsing is neither easy nor reliable for this purpose, so I think a preferable approach would be to give the \loc macro a single definition (structured like the definition a few comments ago) and then allow the user to specify it in the setup for the citation as follows:

\cite[lefttext={See},altloctext={\loc[1.3]},loctext={\loc[vol=8,p=223]},righttext={for further details}][clementinehomilies]

In the above example, both uses of the \loc macro (with a single argument or with a key-value assignment) are supported, but raw values would also be supported:

\cite[lefttext={See},altloctext={1.3},loctext={8:223},righttext={for further details}][clementinehomilies]

To implement this feature in the more succinct \autocite, \footcite, \inlinecite, and \parencite macros, we would just have to allow for two extra optional arguments corresponding to the altloctext and loctext options above:

\autocite[lefttext][altloc][loc][righttext][tag]

Or, for the example above, either

\autocite[See][1.3][vol=8,p=223][for further details][clementinehomilies]

or

\autocite[See][1.3][8:223][for further details][clementinehomilies]

denismaier commented 2 years ago

Hmm, but what if you want output like this:

"See Doe, Title, p. 8; but there are also contradictory statements, e.g. p. 12."

I.e., how would you mix narrative content and locator information?

jjmccollum commented 2 years ago

In order to support the usage with ConTeXt's native \cite macro, I have to expose the \loc macro to users anyway, so you could achieve this in one of two ways:

\cite[lefttext={See},loctext={\loc[p=8]},righttext={; but there are also contradictory statements, e.g. \loc[p=12].}][Doe:Title]

with the native \cite macro, or

\autocite[See][][p=8][; but there are also contradictory statements, e.g. \loc[p=12].][Doe:Title]

with the \autocite variant. (The second set of brackets is empty because there is no alternate locator.)

In fact, you should be able to use the \loc macro anywhere in your text, not just in citations.

jjmccollum commented 2 years ago

All right, I have a working standalone script for the \loc macro with SBL settings!

% macros=mkvi

\setupbodyfont[ebgaramond, 12pt]

\setupbtxlabeltext
  [en]
  [
  sbl:ibid={ibid.},
  sbl:idem={idem},
  sbl:edition={ed.},
  sbl:editionfull={edition},
  sbl:editor={ed.},
  sbl:editors={eds.},
  sbl:Page={Page},
  sbl:Pages={Pages},
  sbl:here={here},
  sbl:edited={edited},
  sbl:Edited={Edited},
  sbl:translator={trans.},
  sbl:translated={translated},
  sbl:Translated={Translated},
  sbl:volume={vol.},
  sbl:Volume={Vol.},
  sbl:volumes={vols.},
  sbl:Volumes={Vols.},
  sbl:part={part},
  sbl:parts={parts},
  sbl:masters={Master's},
  sbl:thesis={thesis},
  sbl:phd={PhD},
  sbl:diss={diss.},
  sbl:reprint={repr.},
  sbl:Reprint={Repr.},
  sbl:paperpresentedat={paper presented at},
  sbl:Paperpresentedat={Paper presented at},
  sbl:with={with},
  sbl:by={by},
  sbl:in={in},
  sbl:In={In},
  sbl:of={of},
  sbl:to={to},
  sbl:released={released},
  sbl:Released={Released},
  sbl:reviewof={review of},
  sbl:Reviewof={Review of},
  sbl:translationof={trans. of},
  sbl:Translationof={Translation of},
  sbl:reprintof={repr. of},
  sbl:Reprintof={Reprint of},
  % locator abbreviations (§8.1.3)
  sbl:loc:book={bk.},
  sbl:loc:books={bks.},
  sbl:loc:chapter={ch.},
  sbl:loc:chapters={chs.},
  sbl:loc:column={col.},
  sbl:loc:columns={cols.},
  sbl:loc:episode={ep.},
  sbl:loc:episodes={eps.},
  sbl:loc:figure={fig.},
  sbl:loc:figures={figs.},
  sbl:loc:folio={fol.},
  sbl:loc:folios={fols.},
  sbl:loc:fragment={frag.},
  sbl:loc:fragments={frags.},
  sbl:loc:line={line},
  sbl:loc:lines={lines},
  sbl:loc:note={n.},
  sbl:loc:notes={nn.},
  sbl:loc:number={no.},
  sbl:loc:numbers={nos.},
  sbl:loc:opus={op.},
  sbl:loc:opera={opp.},
  sbl:loc:page={p.},% this is not used in practice, as page numbers are printed without a locator prefix
  sbl:loc:pages={pp.},% this is not used in practice, as page numbers are printed without a locator prefix
  sbl:loc:paragraph={¶},
  sbl:loc:paragraphs={¶¶},
  sbl:loc:part={pt.},
  sbl:loc:parts={pts.},
  sbl:loc:plate={pl.},
  sbl:loc:plates={pls.},
  sbl:loc:recension={rec.},
  sbl:loc:recensions={recs.},
  sbl:loc:section={§},
  sbl:loc:sections={§§},
  sbl:loc:subverbo={s.v.},
  sbl:loc:subverbi={s.vv.},
  sbl:loc:verse={v.},
  sbl:loc:verses={vv.},
  sbl:loc:volume={vol.},
  sbl:loc:volumes={vols.}
  ]

% Various Lua macros for parsing and conversion
\startluacode
  publications.btx = publications.btx or {}

  local sbl = {}
  publications.btx.sbl = sbl

  function sbl.isrange(str)
    if string.find(str, "-") then
      return true
    end
    -- TODO: this may be language-dependent, as some languages may use commas instead of periods as section/subsection separators
    if string.find(str, ",") then
      return true
    end
    return false
  end

  function sbl.doifrange(str, ifyes)
    -- reuse isrange so that ifyes is emitted only once,
    -- even if str contains both a hyphen and a comma
    if sbl.isrange(str) then
      context(ifyes)
    end
  end

  function sbl.doifrangeelse(str, ifyes, ifno)
    if sbl.isrange(str) then
      context(ifyes)
    else
      context(ifno)
    end
    return
  end

  function sbl.regexsplit(str, regex)
    -- by default, split on whitespace
    if regex == nil then
      regex = "%s+"
    end
    local splits = {}
    local i = 1
    local next_start = 0
    local next_end = 0
    while true do
      -- get the next occurrence of the separator pattern
      next_start, next_end = string.find(str, regex, i) -- find next occurrence of separator pattern
      if next_start == nil then
        -- if there isn't one, then add the suffix of the input string starting at the current index and exit the loop
        table.insert(splits, string.sub(str, i, string.len(str)))
        break
      else
        -- if there is, then add the substring between the current index and the start of the match,
        -- then update the current index
        table.insert(splits, string.sub(str, i, next_start - 1))
        i = next_end + 1
      end
    end
    return splits
  end

  function sbl.abbreviatepagerange(pages)
    -- if this is not a page range, then print it as-is
    if string.count(pages, "%-%-") == 0 then
      context(pages)
      return
    end
    -- if the page range is prefixed with volume and part numbers, followed by a colon 
    -- then recursively process the actual page range 
    -- and add the prefix back when we're done
    if string.count(pages, ":") > 0 then
      local volume_prefix = string.split(pages, ":")[1]
      local pages_in_volume = string.split(pages, ":")[2]
      context("%s:", volume_prefix)
      sbl.abbreviatepagerange(pages_in_volume)
      return
    end
    -- if this is a disjoint set of page ranges separated by commas (and possibly whitespace),
    -- then recursively convert each one, separating them by a comma and a single space
    local page_ranges = sbl.regexsplit(pages, ",%s*")
    if #page_ranges > 1 then
      for i=1,#page_ranges do
        sbl.abbreviatepagerange(page_ranges[i])
        if i < #page_ranges then
          context(", ")
        end
      end
      return
    end
    -- if this is a single page range, then truncate according to Chicago/SBL's horrendous rules
    -- (§4.2.3)
    local start_page = string.split(pages, "--")[1]
    local end_page = string.split(pages, "--")[2]
    -- if either page number is not an Arabic numeral (e.g., Roman numerals),
    -- then print the range as-is
    if not tonumber(start_page) or not tonumber(end_page) then
      context(pages)
      return
    end
    -- otherwise, if the second page number is smaller than the first page number, then print the range as-is 
    -- (this shouldn't happen unless users are already applying this rule)
    if tonumber(end_page) < tonumber(start_page) then
      context(pages)
      return
    end
    -- otherwise, if the second page number is equal to the first page number, then the "range" is just a single page;
    -- print just the first page number
    if tonumber(end_page) == tonumber(start_page) then
      context(start_page)
      return
    end
    -- otherwise, if the second page number has more digits than the first page number, then print the range as-is
    if string.len(end_page) > string.len(start_page) then
      context(pages)
      return
    end
    -- otherwise, if both numbers are fewer than three digits, then print the range as-is
    if string.len(start_page) < 3 then
      context(pages)
      return
    end
    -- otherwise, if the first page number ends in 00, then print the range as-is
    if string.sub(start_page, -2, -1) == "00" then
      context(pages)
      return
    end
    -- otherwise, if both page numbers end in 01-09 and agree in every digit but the last,
    -- then only use the last digit of the end page in the range
    -- (without the agreement check, a range like 101--200 would be printed as 101--0)
    if string.sub(start_page, -2, -2) == "0" and string.sub(start_page, 1, -2) == string.sub(end_page, 1, -2) then
      context("%s--%s", start_page, string.sub(end_page, -1, -1))
      return
    end
    -- otherwise, find the first digit at which the two numbers differ 
    -- and only use the suffix starting from that position of the end page in the range
    -- unless it is the last position, in which case use the last two digits
    local index = 1
    while string.sub(start_page, 1, index) == string.sub(end_page, 1, index) do
      index = index + 1
    end
    if index == string.len(start_page) then
      index = index - 1
    end
    context("%s--%s", start_page, string.sub(end_page, index, -1))
    return
  end
\stopluacode

\def\doifrange#1#2{\ctxlua{publications.btx.sbl.doifrange([==[#1]==],[==[#2]==])}}
\def\doifrangeelse#1#2#3{\ctxlua{publications.btx.sbl.doifrangeelse([==[#1]==],[==[#2]==],[==[#3]==])}}
\def\abbreviatepagerange#1{\ctxlua{publications.btx.sbl.abbreviatepagerange([==[#1]==])}}

\starttexdefinition loc [#1]
  \doifassignmentelse{#1} {
    \begingroup
    \def\previouspunct{no}% if any locator before a section or paragraph locator is included, set to yes
    % TODO: all parameters should be prefixed with \c! for multilingual support; 
    % their equivalents in other languages should be assigned according to the mappings at https://github.com/citation-style-language/locales
    \getparameters[btxsblloc][
      #1% this defines \btxsblloc<key> for every locator that is specified
    ]
    % Was a volume number specified?
    \doifdefined{btxsbllocvol} {
      % Then typeset the number without a CSL abbreviation, followed by a comma
      \btxsbllocvol
      \btxcomma
      % If a page number is also specified, then replace the comma with a colon without surrounding space
      \doifdefined{btxsbllocp} {
        \removeunwantedspaces
        \removepunctuation
        \btxcolon
        \removeunwantedspaces
      }
      % If a part is also specified, then replace the comma with a period without surrounding space
      \doifdefined{btxsbllocpt} {
        \removeunwantedspaces
        \removepunctuation
        \btxperiod
        \removeunwantedspaces
      }
      % If a number is also specified, then replace the comma with a period without surrounding space
      \doifdefined{btxsbllocno} {
        \removeunwantedspaces
        \removepunctuation
        \btxperiod
        \removeunwantedspaces
      }
      \def\previouspunct{yes}
    }
    % Was a number specified?
    \doifdefined{btxsbllocno} {
      % Was a volume number also specified?
      \doifdefinedelse{btxsbllocvol} {
        % If so, then typeset the number without a CSL abbreviation, followed by a comma
        \btxsbllocno
        \btxcomma
        % If a page number is also specified, then replace the comma with a colon without surrounding space
        \doifdefined{btxsbllocp} {
          \removeunwantedspaces
          \removepunctuation
          \btxcolon
          \removeunwantedspaces
        }
        % If a part is also specified, then replace the comma with a period without surrounding space
        \doifdefined{btxsbllocpt} {
          \removeunwantedspaces
          \removepunctuation
          \btxperiod
          \removeunwantedspaces
        }
      } {
        % Otherwise, typeset the number with its CSL abbreviation, followed by a comma
        \doifrangeelse{\btxsbllocno} {
          \btxlabeltext{sbl:loc:numbers}
        } {
          \btxlabeltext{sbl:loc:number}
        }
        \btxsbllocno
        \btxcomma
      }
      \def\previouspunct{yes}
    }
    % Was a part number specified?
    \doifdefined{btxsbllocpt} {
      % Was a volume number also specified?
      \doifdefinedelse{btxsbllocvol} {
        % If so, typeset the number without a CSL abbreviation, followed by a comma
        \btxsbllocpt
        \btxcomma
        % If a page number is also specified, then replace the comma with a colon without surrounding space
        \doifdefined{btxsbllocp} {
          \removeunwantedspaces
          \removepunctuation
          \btxcolon
          \removeunwantedspaces
        }
      } {
        % Otherwise, typeset the number with its CSL abbreviation, followed by a comma
        \doifrangeelse{\btxsbllocpt} {
          \btxlabeltext{sbl:loc:parts}
        } {
          \btxlabeltext{sbl:loc:part}
        }
        \btxsbllocpt
        \btxcomma
      }
      \def\previouspunct{yes}
    }
    % Was a page number specified?
    \doifdefined{btxsbllocp} {
      % Typeset the number without a CSL abbreviation, followed by a comma
      \abbreviatepagerange{\btxsbllocp} % abbreviate the page range according to SBL/Chicago rules
      \btxcomma
      % If a note number is also specified, then replace the comma with a space
      \doifdefined{btxsbllocn} {
        \removeunwantedspaces
        \removepunctuation
        \btxspace
      }
    }
    % Was a footnote number specified?
    \doifdefined{btxsbllocn} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocn} {
        \btxlabeltext{sbl:loc:notes}
      } {
        \btxlabeltext{sbl:loc:note}
      }
      \btxspace
      \btxsbllocn
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a figure number specified?
    \doifdefined{btxsbllocfig} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocfig} {
        \btxlabeltext{sbl:loc:figures}
      } {
        \btxlabeltext{sbl:loc:figure}
      }
      \btxspace
      \btxsbllocfig
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was an opus number specified?
    \doifdefined{btxsbllocop} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocop} {
        \btxlabeltext{sbl:loc:opera}
      } {
        \btxlabeltext{sbl:loc:opus}
      }
      \btxspace
      \btxsbllocop
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a book number specified?
    \doifdefined{btxsbllocbk} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocbk} {
        \btxlabeltext{sbl:loc:books}
      } {
        \btxlabeltext{sbl:loc:book}
      }
      \btxspace
      \btxsbllocbk
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was an episode number specified?
    \doifdefined{btxsbllocep} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocep} {
        \btxlabeltext{sbl:loc:episodes}
      } {
        \btxlabeltext{sbl:loc:episode}
      }
      \btxspace
      \btxsbllocep
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a chapter number specified?
    \doifdefined{btxsbllocchap} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocchap} {
        \btxlabeltext{sbl:loc:chapters}
      } {
        \btxlabeltext{sbl:loc:chapter}
      }
      \btxspace
      \btxsbllocchap
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a verse number specified?
    \doifdefined{btxsbllocv} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocv} {
        \btxlabeltext{sbl:loc:verses}
      } {
        \btxlabeltext{sbl:loc:verse}
      }
      \btxspace
      \btxsbllocv
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a folio number specified?
    \doifdefined{btxsbllocfol} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocfol} {
        \btxlabeltext{sbl:loc:folios}
      } {
        \btxlabeltext{sbl:loc:folio}
      }
      \btxspace
      \btxsbllocfol
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a fragment number specified?
    \doifdefined{btxsbllocfrag} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocfrag} {
        \btxlabeltext{sbl:loc:fragments}
      } {
        \btxlabeltext{sbl:loc:fragment}
      }
      \btxspace
      \btxsbllocfrag
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a plate number specified?
    \doifdefined{btxsbllocpl} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocpl} {
        \btxlabeltext{sbl:loc:plates}
      } {
        \btxlabeltext{sbl:loc:plate}
      }
      \btxspace
      \btxsbllocpl
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a column number specified?
    \doifdefined{btxsblloccol} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsblloccol} {
        \btxlabeltext{sbl:loc:columns}
      } {
        \btxlabeltext{sbl:loc:column}
      }
      \btxspace
      \btxsblloccol
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a line number specified?
    \doifdefined{btxsbllocl} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocl} {
        \btxlabeltext{sbl:loc:lines}
      } {
        \btxlabeltext{sbl:loc:line}
      }
      \btxspace
      \btxsbllocl
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a sub verbo entry specified?
    \doifdefined{btxsbllocsv} {
      % Then typeset the number with its CSL abbreviation, followed by a comma
      \doifrangeelse{\btxsbllocsv} {
        \btxlabeltext{sbl:loc:subverbi}
      } {
        \btxlabeltext{sbl:loc:subverbo}
      }
      \btxspace
      \btxsbllocsv
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a section number specified?
    \doifdefined{btxsbllocsec} {
      % If any of the preceding keys was specified, then replace any preceding punctuation with a space
      % and typeset the number with its CSL abbreviation, followed by a comma
      \doif{\previouspunct}{yes} {
        \removeunwantedspaces
        \removepunctuation
        \btxspace
      }
      \doifrangeelse{\btxsbllocsec} {
        \btxlabeltext{sbl:loc:sections}
      } {
        \btxlabeltext{sbl:loc:section}
      }
      \btxspace
      \btxsbllocsec
      \btxcomma
      \def\previouspunct{yes}
    }
    % Was a paragraph number specified?
    \doifdefined{btxsbllocpara} {
      % If any of the preceding keys was specified, then replace any preceding punctuation with a space
      % and typeset the number with its CSL abbreviation, followed by a comma
      \doif{\previouspunct}{yes} {
        \removeunwantedspaces
        \removepunctuation
        \btxspace
      }
      \doifrangeelse{\btxsbllocpara} {
        \btxlabeltext{sbl:loc:paragraphs}
      } {
        \btxlabeltext{sbl:loc:paragraph}
      }
      \btxspace
      \btxsbllocpara
      \btxcomma
      \def\previouspunct{yes}
    }
    \endgroup
  } {
    % If the input is just a value and not an assignment list,
    % then print it with minimal reformatting.
    % TODO: Category-dependent default pagination settings would be enforced here.
    #1
  }
  \removeunwantedspaces
  \removepunctuation
\stoptexdefinition

\starttext
    \loc[p={210--220}]\blank
    \loc[vol=8,p=223]\blank
    \loc[p=xii,n={4, 7}]\blank
    \loc[vol=2,pt=1,p=12,fig=3]\blank
    \loc[fol={212\high{r}},col=2,l={18--20}]\blank
    \emph{CAD} \loc[vol=20, sv={\quotation{ubšukkinakku}}]\blank
    LEH, \loc[sv={\quotation{ἐνθύημα,} \quotation{λεαίνω.}}]\blank
    \loc[pl=5,col=2,l=20]--\loc[col=3,l=2]\blank
    \loc[1.3]\blank
\stoptext

jjmccollum commented 2 years ago

Okay, CSL locator support should be fully implemented in the latest push, so I'm closing this issue. Some minor fixes will probably be needed before all of the above examples work as expected (e.g., in the "Doe:Title" example, the righttext must be checked for a leading punctuation mark so that no space is inserted before it), but these are easy to resolve and fall outside the scope of this issue.
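For anyone who wants to check the page-range abbreviation rules without running ConTeXt, the rules stated in the comments of sbl.abbreviatepagerange can be sketched in Python as follows. This is an independent approximation, not the code in the repository: the function name is hypothetical, comma-separated lists are split before any volume prefix is handled, and ranges whose endpoints differ in digit count are left untouched.

```python
def abbreviate_page_range(pages):
    """Abbreviate 'start--end' ranges following the rules in the script's comments."""
    if "--" not in pages:
        return pages  # not a range
    parts = [part.strip() for part in pages.split(",")]
    if len(parts) > 1:
        # disjoint ranges: abbreviate each one separately
        return ", ".join(abbreviate_page_range(part) for part in parts)
    if ":" in pages:
        # volume (and part) prefix: abbreviate only what follows the colon
        prefix, rest = pages.split(":", 1)
        return prefix + ":" + abbreviate_page_range(rest)
    start, end = pages.split("--", 1)
    if not (start.isascii() and start.isdigit() and end.isascii() and end.isdigit()):
        return pages  # e.g. Roman numerals
    if int(end) < int(start):
        return pages  # probably abbreviated already
    if int(end) == int(start):
        return start  # a one-page "range"
    if len(end) != len(start) or len(start) < 3 or start.endswith("00"):
        return pages  # these ranges are always written in full
    if start[-2] == "0" and start[:-1] == end[:-1]:
        return start + "--" + end[-1]  # 101--108 becomes 101--8
    # keep the digits from the first difference onward, but at least two
    index = 0
    while start[index] == end[index]:
        index += 1
    index = min(index, len(start) - 2)
    return start + "--" + end[index:]


for example in ["321--328", "101--108", "1496--1500", "8:210--220", "xii--xiv"]:
    print(example, "->", abbreviate_page_range(example))
```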