Open dfussner opened 4 years ago
Thank you very much for reporting this issue and also for the thorough investigation.
This catcode business reminds me of the terrible things I did in https://github.com/plk/biblatex/commit/9ff2cd0eed4591b449043c426f9d4e77e81321f8 to try and fix option processing under a different catcode regime.
Here I'm actually inclined to include the catcode test somewhere in \blx@range@chunk@semcol and just give up if ; does not have the expected catcode. Retokenizing arbitrary text always feels risky to me.
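A minimal sketch of what such a catcode test could look like. The names \demo@ifsemcolother and \demosplitornot are hypothetical and not part of biblatex, and the test looks at the catcode that ; has when the macro runs, which is only a stand-in for the catcode of a ; token already stored in a postnote.

```latex
\documentclass{article}
\makeatletter
% \demo@ifsemcolother is a hypothetical helper, not a biblatex macro.
% It executes #1 if ; currently has category code 12 ("other")
% and #2 otherwise (for example when a shorthand has made it active).
\newcommand*{\demo@ifsemcolother}{%
  \ifnum\catcode`\;=12
    \expandafter\@firstoftwo
  \else
    \expandafter\@secondoftwo
  \fi}
% Hypothetical wrapper: only attempt the split when it looks safe.
\newcommand*{\demosplitornot}[1]{%
  \demo@ifsemcolother
    {split: #1}%
    {left unsplit: #1}}
\makeatother
\begin{document}
\demosplitornot{1-3; 5-7}
\end{document}
```

In the real code the two branches would be the existing split at ; and a fallback that leaves the postnote untouched.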
I agree -- almost anything can and does appear in a postnote, and working around different catcode regimes is, as your code proves, unpleasant at best. Your solution would at least keep the data intact, and there are several workarounds for users writing in French (or Breton):
\bibrangessep to whatever value you need.
\bibrangessep in the postnote field, and set it to whatever value you need.

Light testing suggests all of these work fine.
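As an illustration of the \bibrangessep route, here is a sketch only. It assumes that a plain \renewcommand of \bibrangessep is sufficient, that a semicolon is the separator wanted, and that the entry sigfridsson from biblatex-examples.bib is available; none of these details come from the original comment.

```latex
\documentclass[french]{article}
\usepackage[T1]{fontenc}
\usepackage{babel}
\usepackage{csquotes}
\usepackage{biblatex}
\addbibresource{biblatex-examples.bib}
% Assumed choice of separator; any other definition would do.
\renewcommand*{\bibrangessep}{\addsemicolon\space}
\begin{document}
% The separator macro stands in for a literal ; in the postnote,
% so no active semicolon ever reaches the range-splitting code.
\cite[12--15\bibrangessep 20--25]{sigfridsson}
\printbibliography
\end{document}
```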
Turns out this is trickier than I thought. If we can't split at ; then the code that comes later to normalise the range will not work as expected and just drop stuff. It would be a bit much to completely disable the whole feature when ; has the wrong catcode, because then even people who never use it at all will not get the desired macro behaviour.
etoolbox's \DeclareListParser does not manage to split at the ; if it has the wrong catcode. But expl3 can do it, apparently:
If only a single character <token> is used for the split, any category code 13 (active) character matching the <token> will be replaced before the split takes place. Spaces are trimmed at each end of each item parsed.
\documentclass[french]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{xparse}
\NewDocumentCommand \foo { >{\SplitList{;}} m } { \ProcessList {#1} { \foobar } }
\begin{document}
\newcommand*{\foobar}[1]{A#1B}
\foo{1;2;3 ; 4 ; 5; 6}
\foo{1}
\end{document}
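For contrast with the xparse example above, the etoolbox side can be sketched as follows. \demosplit and \demoitem are hypothetical names, and the active ; is simulated by hand rather than by loading babel, so this only approximates the French setup.

```latex
\documentclass{article}
\usepackage{etoolbox}
% \demosplit is a hypothetical parser name. The separator ; is stored
% with category code 12 because that is its catcode at this point.
\DeclareListParser*{\demosplit}{;}
% Hypothetical handler that brackets each item it receives.
\newcommand*{\demoitem}[1]{[#1]}
\begin{document}
% ; has catcode 12 here, so the list is split into three items.
\demosplit{\demoitem}{a;b;c}

\begingroup
  % Simulate a shorthand regime: make ; active and let it print itself.
  \catcode`\;=\active
  \def;{\string;}%
  % The ; tokens below are active and no longer match the catcode-12
  % separator, so the parser is expected to see one single item.
  \demosplit{\demoitem}{a;b;c}
\endgroup
\end{document}
```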
I have the impression, perhaps mistaken, that you have tried to avoid requiring xparse when developing biblatex (?) It's a big hammer for a small nut, but I've been exploring Stack Exchange and CTAN and haven't found anything that you hadn't already considered and rejected. It looks like \SplitList would actually do the trick, but I'm sorry to admit that I haven't tried coding it up to prove it ...
It's inevitable - I have heavily used xparse in the multiscript branch and there is no way around this I can see without horribly complicating things.
@plk The @latex3 team will be loading xparse (or essentially all of it) as part of the LaTeX2e format from the autumn: I really would not worry overly.
The new case changing code (#1005) will use expl3 and xparse (and as PLK mentioned, the multiscript proof-of-concept also makes heavy use of xparse), so I think at some point we are going to move more stuff to LaTeX3. At the moment I'm trying to avoid expl3 if possible and try to separate expl3 code out, but we may find that this is not an option any more and will move more and more stuff from LaTeX2 to expl3.
What I wouldn't find too great is if we end up with a weird mixture of all sorts of languages (LaTeX2e, expl3) and coding styles in the biblatex core...
When you use babel and the standard LaTeX engine (TeXLive 2020), if the "french" option is given, either as main or as a secondary language, then biblatex truncates a postnote field that contains a semicolon dividing two ranges. Biber seems to do the right thing with the same data in a pages field.
Here's a MWE:
And here's the output on my system:
The start of the problem seems to be \initiate@active@char{;} in french.ldf. The Breton ldf file does the same thing with the same results. If you comment out that line and run LaTeX again the problem disappears; \bbl@deactivate{;} doesn't make a difference, or perhaps I'm not using it correctly.

With \tracingmacros=2, in my log file I get this notice when ; is activated:
When ; has never been activated:
The delimited parameter list in \blx@range@chunk@semcol is thrown by the activated {;}, I guess, and further along it results in the loss of some of the field data. What works is to use \edef instead of \def when defining \abx@field@postnote in biblatex.sty, but then something as simple as \textbf in a postnote field needs to have \noexpand. So, in \long\def\blx@defcitecmd@v I tried a test like:
in place of:
\def\abx@field@postnote{##2}
This seems to work fine, but I haven't tested it to destruction or anything. It's probable that something using \scantokens elsewhere in the code would be safer and more elegant, but I couldn't get it right. Any solution in this location should also be included in \citename, \citelist, and \citefield, as well as (possibly?) in \blx@defvolcitepostnote.

Using polyglossia and XeLaTeX works fine, and I would guess babel with the same engine (or LuaTeX) would also work fine, as active characters are unnecessary there, if I'm reading the code correctly.

I hope my attempts at diagnosis and cure are some help, and many thanks.
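One way to check the guess about engines is to print the category code of ; in the document body. This is a minimal sketch; the expectation that pdfLaTeX reports 13 (active) while XeLaTeX and LuaLaTeX report 12 is an assumption based on the diagnosis above, not something verified here.

```latex
\documentclass[french]{article}
\usepackage{babel}
\begin{document}
% 13 means active, 12 means "other".
% The control symbol \; is used only to name the character;
% its own meaning is never executed here.
Category code of the semicolon is \the\catcode`\;.
\end{document}
```

Compiling the same file with each engine in turn shows whether the semicolon is active in that setup.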