Closed gvdgdo closed 9 years ago
@josephwright - perhaps you could look at \rangelen? I can't work out why the argument parsing is ok with \rangelen{10\bibrangedash 15} but breaks with \rangelen{\thefield{pages}} when the pages field contains 10\bibrangedash 15.
I can easily set up to get \thefield{pages} to expand, but there is an issue I need to follow further. As noted in the code, \rangelen has to be expandable, but the current definition uses \blx@imc@ifinteger, which is not. I'll check the logs to see whether this has always been true or, if not, when and why it was introduced.
To be fully expandable, everything used by \rangelen must be expandable. That means using an expandable integer test and avoiding any other non-expandable code. At the same time, you need to ensure that the argument to \rangelen is fully expanded. That leads to an implementation something like:
\newcommand*{\rangelen}[1]{%
\expandafter\blx@rangelen@range\romannumeral-`\q%
#1\bibrangedash\bibrangedash&}
\def\blx@rangelen@range#1\bibrangedash#2\bibrangedash#3&{%
\ifblank{#3}
{\blx@rangelen@hyphen#1--&}
{\blx@rangelen@check{#1}{#2}}%
}
\def\blx@rangelen@hyphen#1-#2-#3&{%
\ifblank{#3}
{1}% No range at all: assume one page
{\blx@rangelen@check{#1}{#2}}%
}
\def\blx@rangelen@check#1#2{%
\expandafter\blx@rangelen@check@aux
\number\numexpr
\blx@rangelen@check@int{#2}
-
\blx@rangelen@check@int{#1}
\relax
&\stop
}
% #1: computed value; #2: leftover tokens, non-blank if an integer test failed
\def\blx@rangelen@check@aux#1&#2\stop{%
\ifblank{#2}
{#1}
{0}%
}
\def\blx@rangelen@check@int#1{%
\ifblank{#1}
{0&}
{%
\if\number\numexpr0#1-0#1\relax0
#1
\else
0&
\fi
}%
}
The docs don't seem to define what \rangelen{a} or \rangelen{a-b} should give: I've gone with the easiest approach that anything without a range is one page, anything else must have two integer page numbers.
(I've left the basic integer test alone: without some proper unit tests I'm not sure if the behaviour differs. Perhaps one I'll look at in an expl3 context where I can easily do proper unit testing.)
@plk The reason that with the current definition \rangelen{10\bibrangedash 15} works but \rangelen{\thefield{pages}} fails is that TeX is not a functional language. Thus \rangelen sees exactly the input as written here: there is no \bibrangedash in \thefield{pages}. Thus the solution is to expand the input (f-type expansion in expl3 parlance), which is what I've done in the above. The fact that the internals are then also defective in an expansion context is a separate issue!
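A minimal sketch of that difference, using a plain hyphen as the delimiter. All \demo... names are hypothetical and only for illustration; none of them are part of biblatex.

```latex
\documentclass{article}
% All \demo... names are hypothetical, for illustration only.
% \demosplit grabs what precedes the first "-" as #1 and the rest as #2.
\def\demosplit#1-#2\demostop{[#1][#2]}
\def\demopages{10-15}
% No expansion: the "-" inside \demopages is invisible to the argument
% scanner, so #1 is \demopages and only the appended "-" is matched.
\newcommand*\demoraw[1]{\demosplit#1-\demostop}
% f-type expansion: \romannumeral-`\q expands the tokens that follow until
% the first unexpandable one, so the scanner sees 10-15-\demostop.
\newcommand*\demoexp[1]{\expandafter\demosplit\romannumeral-`\q#1-\demostop}
\begin{document}
\demoraw{\demopages} % #1 = \demopages, #2 is empty

\demoexp{\demopages} % #1 = 10, #2 = 15-
\end{document}
```

In the unexpanded call the whole field ends up in #1, which matches the behaviour described for \rangelen{\thefield{pages}}.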
@plk I've taken advantage of the fact that \numexpr keeps going until it finds something that is not an expandable macro/primitive, number or relation (+, -, etc.): that lets me terminate the parsing cleanly and then use a bit of 'clean-up' code to find the appropriate result.
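A small sketch of the \numexpr behaviour relied on here, assuming an e-TeX-enabled engine (any current LaTeX). The stray letter x is just a stand-in for the marker token used in the real code.

```latex
\documentclass{article}
\begin{document}
% Well-formed expression: \relax ends it and is absorbed by \numexpr.
\number\numexpr 15-10+1\relax % typesets 6

% A token that cannot continue the expression (here the letter x) ends
% it early: \number yields 15 and the remaining tokens are left in the
% input stream, where clean-up code could inspect them.
\number\numexpr 15x-10+1\relax % typesets 15x-10+1
\end{document}
```

The leftover material after the early stop is what the clean-up step can test for to decide whether the computed value is usable.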
I realise about the functional thing but I was confused because when tracing, those two examples ended up being the same thing going in to the second level of macros. That is, \thefield{pages} was expanded but the result wasn't matched by the macro argument pattern and everything was going into #1.
Do you think the code you posted above is useable? Would be nice to fix this problem definitively ...
The computation in the given code is off by one (e.g., \rangelen{10-15} gives 5 but it should give 6). Maybe this is a matter of how it is going to be used: if it is used with pages to count the number of pages, it is definitely off by 1.
@gvdgdo This can easily be altered: the only question is what is the expected behaviour.
Sorry to intrude on this issue.
I noticed that with the new definitions \rangelen{\bibrangedash10} and \rangelen{-10} differ: the latter gives 0 (as one would expect after reading the manual), but the former returns 1.
Maybe it would not be a bad idea to be able to distinguish an open-ended range (10-) from a range without a start element (-10), because the former might give rise to adding sequentes marker after the starting page number.
Anyway, the problem with how pages are counted right now is that \rangelen{10} and \rangelen{10-11} both yield 1, somewhat defeating the purpose of the test given in the manual as example.
Seeing that this test could in theory also be used to decide whether to use plural or singular p. or pp. for the page ranges, it seems natural to output 1 for a lone, single page and more for real ranges. So \rangelen{10-15} should give 6, just as \rangelen{10-11} should give 2.
@josephwright - do you think we can now fix this? It looks like we just have to agree on what the correct counts are for various cases?
@plk I think this is a case where unit testing would really help :-) In the absence of that, we need at least a tight spec on what the result of different input should be. As @moewew points out, it's odd that both \rangelen{10} and \rangelen{10-11} give 1. I guess I'd favour logic:
0 but could also be -1 (commonly used as a flag)
\rangelen{10-11} is 2
I'm not sure about trying to distinguish the two open ended cases as different lengths, although I could I guess do 0 and -1 or something like that.
I think 0 and -1 are fine as markers for the two open range types. We need unit testing for biblatex in general. Ideally there would be a non-PDF output, such as UTF-8 text only, so that we can "diff" test results. At the moment, kerning and spacing change regularly with the engine, which makes accurate PDF comparison of output really hard to maintain.
Unit testing: http://www.texdev.net/2014/05/27/testing-tex-lua-and-tex-and-not-just-for-luatex/ and upcoming TUGboat by Frank (source at https://github.com/latex3/svn-mirror/blob/master/articles/lua-test-suite.tex). I'll look to write myself a proper set of inputs/outputs and alter \rangelen over the coming days.
Any update Mr W? I'd like to close this if I can and get it into the dev branch.
Updated version with output
0 if the input is entirely blank
1 if there is input but no range token (- / \bibrangedash)
(n - m + 1) for a range m-n
-1 for an open range m- / -n
\newcommand*\rangelen[1]{%
\ifblank{#1}
{0}%
{%
\expandafter\blx@rangelen@range\romannumeral-`\q%
#1\bibrangedash\bibrangedash&%
}%
}
\def\blx@rangelen@range#1\bibrangedash#2\bibrangedash#3&{%
\ifblank{#3}
{\blx@rangelen@hyphen#1--&}
{\blx@rangelen@check{#1}{#2}}%
}
\def\blx@rangelen@hyphen#1-#2-#3&{%
\ifblank{#3}
{1}% No range at all: assume one page
{\blx@rangelen@check{#1}{#2}}%
}
\def\blx@rangelen@check#1#2{%
\expandafter\blx@rangelen@check@aux
\number\numexpr
\blx@rangelen@check@int{#2}
-
\blx@rangelen@check@int{#1}
+ 1
\relax
&\stop
}
% #1: computed value; #2: leftover tokens, non-blank if an integer test failed
\def\blx@rangelen@check@aux#1&#2\stop{%
\ifblank{#2}
{#1}
{-1}%
}
\def\blx@rangelen@check@int#1{%
\ifblank{#1}
{0&}
{%
\if\number\numexpr0#1-0#1\relax0
#1
\else
0&
\fi
}%
}
The only thing this leaves awkward is the input \rangelen{-}, which gives -1 but I'm not really sure what to do with!
Note that non-numerical pages also give the -1 value if there is an apparent range, so \rangelen{i-ii} gives -1. Again, I'm not sure what is wanted here (converting different representations of page numbers is doable but the auto-detection will not be much fun!).
We could make either 0 or -1 a more general value 'A page range cannot be determined: this includes the case of open ranges, non-numeric page numbers and so on.'
It's mainly a case of deciding what is wanted.
@josephwright - You know I just realised that this is probably much easier for biber to do. Perhaps biber could return a value giving the length for all page ranges. Also non-numerical stuff is much easier using Unicode equivalence classes there. Something in the bbl like:
\field{pages}{10\bibrangedash 15}
\range{pages}{6}
for all fields of datatype "range"?
In fact it's harder than we thought because a range field can be multiple ranges: 10-12, 20-30 etc. and the length should probably be the sum of all of them.
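As a worked illustration of that sum, assuming each closed range m_i-n_i counts n_i - m_i + 1 pages as in the updated \rangelen (the symbol L for the total length is introduced here only for the example):

```latex
\[
  L = \sum_{i} (n_i - m_i + 1)
\]
% For the field value "10-12, 20-30":
\[
  L = (12 - 10 + 1) + (30 - 20 + 1) = 3 + 11 = 14
\]
```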
@josephwright - I did a quick test with perl and it's quite nice - I can generate rangelen for roman numerals too, irrespective of whether they are using the special U+216x and U+217x ranges or ASCII representations etc. This will save a lot of trouble in biblatex.
Well you make the calls, but a TeX-based solution works for 'everyone' not just for Biber people :-) I'd also be very cautious about taking on anything non-numeric, otherwise you get into the tricky cases such as 'i-x,1-10' or even worse 'i-10'!
Actually, they are already in my test cases and work fine ... any combination of roman and decimals will work, even in really strange Unicode cases. I would not remove the current implementation for bibtex users ...
@gvdgdo - please try biblatex 3.0 dev and biber 2.0 dev from Sourceforge. Joseph Wright has created a better implementation of \rangelen and there is a new macro \frangelen which, when used with biber, takes the name of a range field like 'pages' and returns the length of the range in the field. This macro can handle multiple ranges in the same field, roman numerals, Unicode roman numerals, implicit ranges etc. and is generally more robust than \rangelen. See the PDF manual.
@JosephWright There is a problem with the new definition of \rangelen in biblatex. Calls with open ranges, e.g. \rangelen{1-}, fail with Missing $ inserted. As far as I can see, the reason is that & has catcode 3 (math shift) at the moment \rangelen and its internal commands are defined in biblatex2.sty (\catcode`\&=3 is set in line 50).
\documentclass{article}
\usepackage[]{biblatex}
\begin{document}
\rangelen{-1}
\end{document}
The issue is with \ifblank: fix in hand.
Joseph Wright has created a better implementation of \rangelen and there is a new macro \frangelen
Is the documentation of \rangelen correct? Contrary to what is written here, it says it takes a field as parameter. Also, \frangelen does not seem to be documented at all.
It seems that \rangelen does not work as advertised in the example on page 194 of the manual. It seems it works if the input is an explicit range (e.g., "10-15") but not if passed implicitly (e.g., \thefield{pages}).
Here is an MWE: