plk / biblatex

biblatex is a sophisticated bibliography system for LaTeX users. It has considerably more features than traditional bibtex and supports UTF-8

Use of the .bbl file extension is confusing, since it's already in use by BibTeX #1034

Open bbeeton15 opened 4 years ago

bbeeton15 commented 4 years ago

Since the .bbl file extension is already in use by BibTeX, with very different file content, when a publisher receives a .bbl file the expectation is that this file contains the already formatted bibliography. The biblatex form, however, is more like a database entry, and an attempt to include it in the established workflow will result in errors and consternation. It's no surprise then that publishers are reluctant to accept biblatex input. If a different file extension were used (one that is not already in use by some other package), publishers might be more willing to consider establishing a parallel workflow, as long as it can be easily and reliably (and automatically) determined what is appropriate.

sieversMartin commented 4 years ago

Isn't the real problem that you simply cannot get a ready-to-print bibliography like BibTeX's .bbl file out of biblatex? A new extension may no longer confuse people who are used to the established .bbl format, but the initial problem persists: some publishers just want a formatted bibliography.

josephwright commented 4 years ago

@sieversMartin That's come up many times, and the bottom line is that it's not doable with biblatex as it stands using a 'classical' approach.

sieversMartin commented 4 years ago

I know, and I fully understand that, looking at biblatex's and Biber's abilities. @plk just explained it again at the TUG conference.

sieversMartin commented 4 years ago

My point is that changing extensions does not solve the underlying problem.

plk commented 4 years ago

I know it doesn't solve the underlying problem, but I asked Barbara to open the issue as it was mentioned in a mail exchange and I realised that I'd never considered the whole .bbl issue, as biblatex was using .bbl from the very first, when I inherited it. I suppose that using a different extension for biblatex would at least circumvent any confusion regarding file content expectations. We'd have to consider, however, compiling utilities and environments which detect file changes (latexmk, Emacs, etc.).

hvoss49 commented 4 years ago

On 13.08.2020 at 18:40, Barbara Beeton notifications@github.com wrote:

Since the .bbl file extension is already in use by BibTeX, with very different file content, when a publisher receives a .bbl file the expectation is that this file contains the already formatted bibliography. The biblatex form, however, is more like a database entry, and an attempt to include it in the established workflow will result in errors and consternation. It's no surprise then that publishers are reluctant to accept biblatex input. If a different file extension were used (one that is not already in use by some other package), publishers might be more willing to consider establishing a parallel workflow, as long as it can be easily and reliably (and automatically) determined what is appropriate.

It is no problem to create the .bbl file with Biber and then insert the resulting .bbl file into the document's source. The publisher then does not need a Biber run. Only the biblatex package itself must be present.

\documentclass[]{article}
\usepackage[style=authoryear,maxbibnames=99]{biblatex}
\makeatletter
\def\blx@bblfile{%
  \blx@secinit
  \begingroup
  \blx@bblstart
%%% insert here the created bbl file starting with \refsection
  \blx@bblend
  \endgroup
  % global sorting as this is called at BeginDocument
  \csnumgdef{blx@labelnumber@\the\c@refsection}{0}}
\makeatother
\begin{document}
A reference~\parencite{westfahl:space,aksin}

\printbibliography
\end{document}

moewew commented 4 years ago

It is no problem to create the .bbl file with Biber and then insert the resulting .bbl file into the document's source. The publisher then does not need a Biber run. Only the biblatex package itself must be present.

That is true, but the issue with that is that the exact syntax of the .bbl file (as tracked by the biblatex bbl format version, which is currently at 3.1) changes occasionally when new features are implemented and internals restructured. That means that a .bbl file is usually coupled to certain biblatex versions (the bbl format version is not equal to the biblatex version and it is not increased with every biblatex release). This means that even a fully self-contained file with 'inline .bbl' essentially requires a certain biblatex version. This is a major issue for arXiv submissions: https://github.com/plk/biblatex/wiki/biblatex-and-the-arXiv

I have no experience in the publishing world, but my hunch would be that this is one of the major headaches for publishers when it comes to biblatex adoption. The .bbl format is a bit of a moving target. The publisher simply can't guarantee that they'll be able to run the .bbl file uploaded by an author on their systems. (With normal .bbl files that is easy: They normally only contain very light markup with standard commands like \emph, \textbf, \url.) We (biblatex) could do better here by promising not to change the syntax of the .bbl file in backwards incompatible ways. (Whether that is feasible, I'm not quite sure. In any case, that would only change the issue in the medium or long term.)

Another issue with .bbl files is that they don't contain typesettable material, so I assume they are a pain to work with for publishers who convert the .tex source into a different format for production. (As mentioned by @sieversMartin.)

In my head biblatex is heavily inspired by jurabib, whose .bbl output is already quite database-y and not that close to the ready-to-typeset output plain.bst produces. Since biblatex originally used BibTeX the extension .bbl was a given. (I don't think BibTeX can be instructed to produce files with different file extensions.)

I'm not outright opposed to a change in the file extension here, but I think there is potential for knock-on effects on documentation that is available all over the net, on third-party tools, and also on the maintainability of both the BibTeX and Biber backends. Furthermore, I seriously doubt that the file extension is the thing holding publishers back from adopting biblatex.

PS: The .bbl files produced for biblatex already have a quite unique signature; they start off like

% $ biblatex auxiliary file $
% $ biblatex bbl format version 3.1 $
% Do not modify the above lines!
%
% This is an auxiliary file used by the 'biblatex' package.
% This file may safely be deleted. It will be recreated by
% biber as required.
%
\begingroup
\makeatletter
\@ifundefined{ver@biblatex.sty}
  {\@latex@error
     {Missing 'biblatex' package}
     {The bibliography requires the 'biblatex' package.}
      \aftergroup\endinput}
  {}
\endgroup

So a publisher can probably tell quite reliably if they have a biblatex .bbl file on their hands by checking if the first line is % $ biblatex auxiliary file $. (True, they'd have to look at the contents of the file and not the extension, which is more work, but it seems not too complex.)
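As a rough sketch of such a content check, the two header lines can be read with standard tools. The function names `is_biblatex_bbl` and `bbl_format_version` are made up for illustration and are not part of biblatex or Biber, and the sketch assumes the header looks exactly like the one quoted above (Unix line endings, marker on line 1, format version on line 2).

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Succeeds if the first line of the file is the biblatex marker line.
is_biblatex_bbl() {
    [ "$(head -n 1 "$1")" = '% $ biblatex auxiliary file $' ]
}

# Prints the bbl format version found on the second header line.
# Prints nothing if that line does not have the expected shape.
bbl_format_version() {
    sed -n '2s/^% \$ biblatex bbl format version \([0-9][0-9.]*\) \$$/\1/p' "$1"
}
```

A production script could call `is_biblatex_bbl paper.bbl` first and, if it succeeds, compare the output of `bbl_format_version paper.bbl` with the format version its installed biblatex expects. This does not remove the version coupling described above; it only makes a mismatch visible before TeX runs.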

bbeeton15 commented 4 years ago

A bit of history. BibTeX was first released in the late 1980s. Version .99c is dated 1988. No significant changes were made until 2010, when a minor bug fix brought it to version .99d. In the interim, many improvements were proposed (many by me) to render it capable of recording information as required by publications of the American Mathematical Society (AMS). But these never happened, and BibTeX never achieved its promise.

In frustration, in the early 1990s, the amsrefs package was created, designed principally by Michael Downes. (It too uses the .bbl file extension, originally as a mechanism to provide forward and backward translation to BibTeX, but, after Michael's death in 2003, the developer who took over the project rejected that connection; I'm not sure whether .bbl retains any useful meaning in the amsrefs context.) In any event, AMS now converts all bibliographic information to amsrefs, and uses that in the production stream.

Instructions to authors request the use of BibTeX or amsrefs. If BibTeX is used, recommended styles are amsplain, amsalpha, or (if author-year style is wanted), natbib. amsrefs data is entered directly into the main source file; otherwise, it is requested that the .bbl be inserted into the source at the location where the bibliography should appear. A .bbl file from biblatex does wreak havoc, and it is not possible to identify the exact problem automatically; manual inspection is required, and in a production environment, it's people time that costs the most money.

The AMS workflow is the only one I am personally familiar with. Although I would expect serious objections to any suggestion to accept biblatex, biblatex has many features that I believe would solve some thorny issues that remain with present practices. Another avenue that might be pursued is to suggest to Math Reviews, which now offers up bibliographic information for the items that make up its contents in both BibTeX and amsrefs form, that they also offer biblatex. (I will follow up on this suggestion.) But that would be impractical and/or inadvisable given the ambiguous .bbl naming.

moewew commented 4 years ago

Thank you very much for the historical context. It is always nice to see how things developed.

I'm trying to understand the exact problem you want to solve by changing the file extension. If I understand your last two paragraphs correctly, the problem is not so much that publishers don't know how to deal with .bbl files (or couldn't themselves check whether it is for biblatex or classic thebibliography); rather, the issue appears to be that people who don't know that much about the difference between biblatex, classical BibTeX, natbib etc. and/or haven't read the author guidelines telling them to use BibTeX or AMSRefs see the 'integrate your .bbl file into the document' request and do that even with biblatex's .bbls, which then break for the publisher (because of version issues as discussed above and/or other issues). So the main advantage of renaming biblatex's .bbl files would be that those people simply won't get a .bbl file from biblatex they could try to integrate, telling them immediately that something about their setup is wrong. We'd then hope they poke around the documentation and online help forums to find that they can't use biblatex. Is that a fair summary of your goals or is there more? I can see how this might work in a few cases. (Of course only until someone figures out that with backend=bibtex, biblatex still has .bbl files, because BibTeX can't produce anything else, and then tells people to use backend=bibtex to get their .bbl. ...)

I'm not quite sure I understand your last point about Math Reviews. I currently don't have access to MathSciNet because I'm at home, but from https://mathscinet.ams.org/mathscinet-mref?ref=Andrew+Comech+%0D%0AOptimal+regularity+of+Fourier+integral+operators+with+one-sided+folds.%0D%0AComm.+Partial+Differential+Equations&dataType=bibtex it appears that MathSciNet serves .bib output, AMSRefs and vanilla TeX for use in thebibliography. Since biblatex still works with .bib input, the .bib file served by that tool will do fine with biblatex (there are some small differences between best practices for .bib files for biblatex and BibTeX styles, but BibTeX styles are also not uniform in their support for fields and entry types, so some manual intervention is expected, automatic tools can't get everything right for every style). There is no point in providing something that could be copied-and-pasted into a biblatex .bbl file: As discussed above the .bbl file format/syntax is a moving target. But more importantly, the .bbl file written for biblatex can contain quite a lot of context dependent data (uniqueness calculations, label calculations, ...) that MathSciNet simply could not guess/provide sensibly. Plus, users are not supposed to manipulate the .bbl file manually. (While BibTeX's .bbl files lend themselves to manipulation because they contain typesettable material and can and often should be included directly into the .tex file for submission.)

bbeeton15 commented 4 years ago

(Apologies for the delay in responding. I've been editing items from TUG 2020 for publication.)

The problem I'm trying to address is the problem of modifying a production workflow to accommodate something with the same name as an existing component, but very different, and incompatible, content. That is very difficult and disruptive. It is not trivial, but much less difficult, to construct a parallel workflow, and if the results are found to be superior, the effort might be deemed worthwhile.

Regarding Math Reviews/MathSciNet, the database holding that data includes more information than is handled by most BibTeX .bst files. It's my understanding that the .bib entries distributed from a MathSciNet query are constructed so that they will be handled properly by the most common .bst files in common distribution (including, of course, those from the AMS). Something "richer" may be possible; I can inquire.

moewew commented 4 years ago

Mhh, I guess my question was where exactly in the workflow the problem with the double meaning of .bbl file occurs.

If it's when the publisher gets given a .bbl file directly, they can easily check if the first line is % $ biblatex auxiliary file $ to see if they are dealing with biblatex (or some of the other markers exhibited above). That's not as comfortable as checking file extensions, but strikes me as doable. If the problem is with users pasting the contents of .bbl files into documents, then the publisher has no file extension to go by anyway. It appears to me that in that case the main focus would have to be on educating the user to tell biblatex from BibTeX. (In which case a different file extension might indeed help a bit.)
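For the second case, a script would have to look inside the submitted .tex file itself. The following is only a heuristic sketch: the function name `classify_inline_bib` is hypothetical, and it assumes that pasted biblatex data can be recognised by the marker line or by a `\refsection` command (as in the inline example earlier in this thread), while classic BibTeX output is recognised by `\begin{thebibliography}`.

```shell
#!/bin/sh
# Hypothetical helper: prints "biblatex", "thebibliography" or "unknown"
# depending on which kind of bibliography data appears in the given file.
# Purely heuristic; it only searches for fixed strings.
classify_inline_bib() {
    if grep -q -F -e '% $ biblatex auxiliary file $' -e '\refsection' "$1"; then
        echo biblatex
    elif grep -q -F '\begin{thebibliography}' "$1"; then
        echo thebibliography
    else
        echo unknown
    fi
}
```

A check like this can flag a submission for manual inspection, but it cannot replace telling authors which bibliography setup the publisher accepts.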

I have probably overlooked many other issues. That's why I'm asking: I'm trying to understand when exactly the fact that we use the .bbl file extension becomes a problem, to see if there is something we can do short of taking the radical step of changing our file extension.


I really don't think the MathSciNet output could be much better. In any case the issues biblatex faces with the example output for https://mathscinet.ams.org/mathscinet-mref?ref=Andrew+Comech+%0D%0AOptimal+regularity+of+Fourier+integral+operators+with+one-sided+folds.%0D%0AComm.+Partial+Differential+Equations&dataType=bibtex

@article {MR1697488,
    AUTHOR = {Comech, Andrew},
     TITLE = {Optimal regularity of {F}ourier integral operators with
              one-sided folds},
   JOURNAL = {Comm. Partial Differential Equations},
  FJOURNAL = {Communications in Partial Differential Equations},
    VOLUME = {24},
      YEAR = {1999},
    NUMBER = {7-8},
     PAGES = {1263--1281},
      ISSN = {0360-5302},
   MRCLASS = {35S30 (47G30 58J40)},
  MRNUMBER = {1697488},
MRREVIEWER = {Luigi Rodino},
       DOI = {10.1080/03605309908821465},
       URL = {https://doi.org/10.1080/03605309908821465},
}

are not different from the issues some 'standard' .bst files would face (in this case doi and url duplicate information, {F}ourier may have kerning implications, so I would prefer {Fourier}, but of course that would be an issue if you want an ALL CAPS title).
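To make the doi/url point concrete, here is a small sketch of how a tool might notice that a URL field only repeats the DOI. The helper name `url_duplicates_doi` and the list of resolver prefixes are my own assumptions, and the comparison is a plain string match (no case folding or URL decoding).

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the URL is just the DOI behind a
# common resolver prefix, so printing both would repeat the same link.
# Runs in a subshell so the variables do not leak into the caller.
url_duplicates_doi() (
    doi=$1
    url=$2
    case "$url" in
        "https://doi.org/$doi" | "http://doi.org/$doi" | \
        "https://dx.doi.org/$doi" | "http://dx.doi.org/$doi") exit 0 ;;
        *) exit 1 ;;
    esac
)
```

For the entry above, `url_duplicates_doi '10.1080/03605309908821465' 'https://doi.org/10.1080/03605309908821465'` succeeds, so a style or preprocessing step could drop one of the two fields.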

I'd guess that the small improvements a biblatex-specific .bib file could bring would be outweighed by the confusion about the fact that MathSciNet now serves two different .bib formats.

sieversMartin commented 4 years ago

I just found the biblatex2bibitem package, but haven't tested it yet. Does anyone have experience with it?

bbeeton15 commented 4 years ago

The contention is made that a publisher can tell, trivially, if a .bbl file is produced by BibTeX or biblatex. I guess I did not make clear the fact that the publisher may have only a limited editorial staff, whose members may be essentially unfamiliar even with LaTeX. The actual formatting and production may be contracted out to some other organization.

At the AMS, everything is scripted. When a submission for a journal is received, the essential procedure is to place the files into a dedicated directory and "push a button". What is expected to come out the other end is a pdf file that is ready to be copyedited. If anything fails, the job must be turned over to someone knowledgeable to determine what is wrong and fix it. If what has been submitted diverges from what is expected, there is no way to avoid problems. Under the procedures in place at the time I retired, most problems were associated with graphics or with authors redefining basic commands for their own convenience. Although these are a nuisance, it is well understood how to fix them. biblatex is not understood at all.

Let's go back to the script. One of the steps in the script is to convert .bbl material to amsrefs form, to ensure uniformity in the end product. This mechanism was in place since before biblatex became popular. As has been mentioned many times (elsewhere, if not here), publishers are slow to replace a functioning workflow, and AMS is no exception. An influx of .bbl files in biblatex format would indeed fail (possibly corrupting something else in the process), and the staff faced with trying to recover would be untrained in dealing with the problem.

A biblatex .bbl file by another name would be obvious, and processing would halt before any damage is done, indeed, before the file is read. There would still be the problem of how to handle it, but, as I stated earlier, a different workflow could be devised (if found to be advantageous) to handle the new component.

I'm not questioning the ability of biblatex to produce superior output. I'm merely trying to provide a possible way of managing the different mechanism that would be required to insert it into an operational production workflow.

alex-ball commented 3 years ago

I think the contention is that a script can tell, trivially, if a .bbl file is produced by BibTeX or biblatex. That will be dependent on the script and the environment, but just as an example, in a UNIX-like environment you could do something like this in a Bash script to halt the process or spawn a different one:

if [[ $(head -n 1 "$BBLFILE") == '% $ biblatex auxiliary file $' ]]
then
    echo "$BBLFILE was produced by biblatex. Aborting."
    exit 1
# You could have an 'else' here to handle a regular BibTeX file
fi

(where $BBLFILE is the .bbl file).

plk commented 1 year ago

It's been a while since we looked at this. How about a concrete suggestion that we output two files for a while - a .bbl and something else, same contents. Change biblatex to use the new extension and publicise the change for external tools for a year or so, easing adoption for such chains?

perstar commented 1 year ago

Yes, I think it is a good idea to take action to change the extension. It could continue to accept both output files (with priority to the new one) for a long time after it has stopped ever outputting files with the old name.
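An external tool could apply the same lookup order during such a transition. This is a minimal sketch only: no replacement extension has been agreed on in this thread, so `bblx` is a placeholder, and `pick_bib_aux` is a hypothetical name.

```shell
#!/bin/sh
# NEW_EXT is a placeholder; the thread has not settled on a new extension.
NEW_EXT=${NEW_EXT:-bblx}

# Prints the auxiliary file to read for a given job name, preferring the
# new extension and falling back to the legacy .bbl.
# Returns a non-zero status if neither file exists.
pick_bib_aux() {
    if [ -f "$1.$NEW_EXT" ]; then
        printf '%s\n' "$1.$NEW_EXT"
    elif [ -f "$1.bbl" ]; then
        printf '%s\n' "$1.bbl"
    else
        return 1
    fi
}
```

Called as `pick_bib_aux paper`, this prints `paper.bblx` when that file exists and `paper.bbl` otherwise, which matches the "priority to the new one" behaviour suggested here.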

moewew commented 1 year ago

I think we shouldn't lose sight of the fact that we still support BibTeX as a backend. AFAICS BibTeX can only produce .bbl files. The .bbl files produced by BibTeX for biblatex have the exact same issue for publishers as Biber-produced .bbl files. So having Biber switch file extensions will just create a bigger mess on the biblatex side, because we have to support .bbl and the new extension, but would not necessarily make things better (especially if people get taught "BibTeX -> .bbl").