Closed Omikhleia closed 1 week ago
Slightly relates to #1860 (as minimal TeX-like stuff people might expect in a bibTeX file, the \& and ~ are maybe both common enough to be properly handled).
Along the same line of thinking, - vs. -- in page ranges might need to be checked for consistency (also as argument to \cite).
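To make the page-range point concrete, here is a minimal sketch in Python (SILE itself is written in Lua, so this is only an illustration, not SILE code). The function name `normalize_page_range` is hypothetical, and the choice to map every hyphen run between two digits to an en dash is an assumption about what "consistent" should mean here.

```python
import re

EN_DASH = "\u2013"

def normalize_page_range(pages: str) -> str:
    """Turn '-', '--', '---' or an en dash between two digits into one en dash.

    Hypothetical helper: hyphens that are not between digits (e.g. 'A-1')
    are left alone, since they may be part of a page label.
    """
    return re.sub(r"(?<=\d)\s*(?:-{1,3}|\u2013)\s*(?=\d)", EN_DASH, pages.strip())
```

With this sketch, `12-34`, `12--34` and `12 -- 34` all end up in the same form, whichever way the bibliography author typed them.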
Having to manually edit bibliographies to replace & by &amp; is cumbersome, we'd need to avoid it, or at least have some way to bypass it.
This is definitely not something we should expect to be in the input, we need to apply XML character escaping ourselves.
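As a rough illustration of "escaping ourselves", the sketch below uses Python's standard `xml.sax.saxutils.escape`. The wrapper name `xml_escape_field` is hypothetical, and it assumes the field value is plain text with no intentional markup.

```python
from xml.sax.saxutils import escape

def xml_escape_field(value: str) -> str:
    """Escape &, < and > so a plain-text field can be embedded in XML.

    Hypothetical helper: it assumes the input is plain text, so input that
    is already XML-escaped would be escaped a second time.
    """
    return escape(value)

print(xml_escape_field("On some stuff & other things"))
# prints: On some stuff &amp; other things
```

The double-escaping caveat is the flip side of treating the input as plain by default: a file that already contains `&amp;` would need a way to opt out.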
I'm not familiar with other issues with inputs, but if TeX-isms like \& and ~ are standard we need to decode those too, or if they are common but not standard maybe we need an optional setting for handling them or not on loading bibliographies. Perhaps an argument to the loader or a setting for whether the input is expected to be plain, XML, SIL, TeX, Markdown, or whatever is in order. Defaulting to plain or whatever is standard or most common.
Perhaps an argument to the loader or a setting for whether the input is expected to be plain, ... or whatever
Food for thought: \loadbibliography could work, but it is not very user-friendly (one would have to check the input files carefully and knowingly...).
The crux of the matter is that the bibtex format was designed with TeX in mind, hence it cannot be made completely portable. (The original need to escape \&, I'd guess, came from & being an active character in TeX for arrays...).
I think that the safest approach (to start with) is to consider by default that the input does not contain any markup. (We are not going to be able to support TeX/LaTeX, or @preamble blocks with TeX-like instructions, anyway).
IMHO, the best course of action is to assume the bib file is self-defined, written in a minimal "portable" subset, i.e. not containing any TeX, XML, SIL or whatever constructs, with the exception of the really common ones (--, \& and ~).
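A minimal sketch of decoding that "portable" subset, in Python for illustration only. The name `decode_portable_subset` is hypothetical. The mappings follow the usual TeX meanings (`\&` is a literal ampersand, `~` a non-breaking space, `--` an en dash); everything else, including `---` and `\~`, is deliberately left untouched.

```python
import re

NBSP = "\u00a0"      # what TeX's ~ (tie) usually stands for
EN_DASH = "\u2013"   # what TeX's -- ligature produces

# One pattern, one pass: \& | a ~ not preceded by a backslash | exactly two hyphens.
_TOKEN = re.compile(r"\\&|(?<!\\)~|(?<!-)--(?!-)")

def decode_portable_subset(value: str) -> str:
    """Decode only \\&, ~ and -- in a field value (hypothetical helper)."""
    def repl(match: re.Match) -> str:
        token = match.group(0)
        if token == "\\&":
            return "&"
        if token == "~":
            return NBSP
        return EN_DASH
    return _TOKEN.sub(repl, value)
```

Doing the substitution in a single pass avoids re-examining characters that were just produced. Any XML escaping would then be applied afterwards, to the decoded text.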
FWIW, the other most "common" input issues (I might comment on them separately at some point) are likely:
author = {{\relax Ch}ristopher Doe}
(we might want BibLaTeX's §3.4 extended name syntax before, though it doesn't fix it all)
We are not going to be able to support TeX/LaTeX
BTW, for the record, Typst does support some minimal interpretation of TeX-like input.
The problem I see there is that we'll never know what's really minimal...
For strings such as On some stuff & other things, we currently have to format our bibtex files as follows for use with SILE:
If we don't XML-escape the &, we get an error...
This is due to SILE supporting XML entries in bibliography, which is non-standard... albeit interesting, e.g. if one wants to mark up parts of entries in SIL XML. (The true use case, however, is not the user inserting markup, it's for the internal logic of rendering titles in italic, etc.)
It's not completely obvious, as:
{On some stuff \& other things}
(and the big picture here is that it allows any (La)TeX constructs in field values, but also suffers from its own rules, hence the ampersand escaping I guess...)
{On some stuff & other things}
Having to manually edit bibliographies to replace & by &amp; is cumbersome, we'd need to avoid it, or at least have some way to bypass it. (I'd also be interested in supporting Djot/Markdown in bibTeX files, but that's another hornet's nest :p )