plk / biber

Backend processor for BibLaTeX
Artistic License 2.0

biber tool (2.13) removes unknown crossrefs. #285

Closed: tnull closed this issue 5 years ago

tnull commented 5 years ago

I use the biber tool to keep a number of bib files (multiple files with article entries, one with crossref'ed proceedings entries) sorted by regularly running biber --tool --configfile sort.conf --outfile "in.bib" "out.bib" with the following simple sort.conf:

<config>
    <sortingtemplate name="tool">
        <sort order="1">
            <sortitem order="1">author</sortitem>
        </sort>
        <sort order="2">
            <sortitem order="1">year</sortitem>
        </sort>
        <sort order="3">
            <sortitem order="1">title</sortitem>
        </sort>
    </sortingtemplate>
</config>

So far, this always resulted in sorted, tidied-up entries. However, suddenly (presumably after upgrading to biber 2.13), the tool checks the crossrefs and removes every one it cannot resolve, which in my case appears to be all of them:

WARN - I didn't find a database entry for crossref 'aaa' in entry 'bbb' - ignoring (section 99999)
WARN - I didn't find a database entry for crossref 'ccc' in entry 'ddd' - ignoring (section 99999)
WARN - I didn't find a database entry for crossref 'eee' in entry 'fff' - ignoring (section 99999)
...
...
...

I looked through the changelogs of recent version updates, but could not find any changes that might cause this change in behavior. So, what changed? Is this now the desired behavior? And how do I keep my crossref entries?

Thanks!

plk commented 5 years ago

Hmm, can you put together a simple example of this problem so I can reproduce it? I suspect something happened with the --output-resolve-crossrefs option.

tnull commented 5 years ago

Yes, here is a simple example:

/tmp/bib_test> cat test.bib
@INPROCEEDINGS{rohrer19kadcast,
  AUTHOR = {Rohrer, Elias and Tschorsch, Florian},
  TITLE = {Kadcast: {A} Structured Approach to Broadcast in Blockchain Networks},
crossref = {aft19},
}

/tmp/bib_test> cat crossref.bib
@PROCEEDINGS{aft19,
  LOCATION = {Zurich, Switzerland},
  BOOKTITLE = {AFT '19: Proceedings of the first ACM conference on Advances in Financial Technologies},
  DATE = {2019-10},
}

/tmp/bib_test> cat sort.conf
<config>
    <sortingtemplate name="tool">
        <sort order="1">
            <sortitem order="1">author</sortitem>
        </sort>
        <sort order="2">
            <sortitem order="1">year</sortitem>
        </sort>
        <sort order="3">
            <sortitem order="1">title</sortitem>
        </sort>
    </sortingtemplate>
</config>

/tmp/bib_test> biber --tool --configfile sort.conf --outfile test_out.bib test.bib
INFO - This is Biber 2.13 running in TOOL mode
INFO - Config file is 'sort.conf'
INFO - Logfile is 'test.bib.blg'
INFO - Globbing data source 'test.bib'
INFO - Globbed data source 'test.bib' to test.bib
INFO - Looking for bibtex format file 'test.bib'
INFO - LaTeX decoding ...
INFO - Found BibTeX data source 'test.bib'
WARN - I didn't find a database entry for crossref 'aft19' in entry 'rohrer19kadcast' - ignoring (section 99999)
INFO - Overriding locale 'en_US' defaults 'variable = shifted' with 'variable = non-ignorable'
INFO - Overriding locale 'en_US' defaults 'normalization = NFD' with 'normalization = prenormalized'
INFO - Sorting list 'tool/global//global/global' of type 'entry' with template 'tool' and locale 'en_US'
INFO - No sort tailoring available for locale 'en_US'
INFO - Writing 'test_out.bib' with encoding 'UTF-8'
INFO - Output to test_out.bib
INFO - WARNINGS: 1

/tmp/bib_test> cat test_out.bib
@INPROCEEDINGS{rohrer19kadcast,
  AUTHOR = {Rohrer, Elias and Tschorsch, Florian},
  TITLE = {Kadcast: {A} Structured Approach to Broadcast in Blockchain Networks},
}

/tmp/bib_test>

plk commented 5 years ago

Did this ever work? You are not mentioning crossref.bib anywhere, so biber can't know about its contents. I could perhaps make tool mode take any number of .bib files on the command line, but first I would be interested in how this ever worked.

plk commented 5 years ago

I have implemented this in the 2.14 DEV version (on SourceForge in the development folder), as it looks generally useful. Your example now works for me with the 2.14 DEV version by running:

biber --tool --configfile sort.conf --outfile test_out.bib test.bib crossref.bib

tnull commented 5 years ago

Yes, so far this worked without a problem. I have a folder with multiple bib files, one being the crossrefs.bib. I used to sort all files in place by running the following script:

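#!/bin/sh
# Sort every .bib file in the current directory in place with biber's tool mode.

# Locate biber once; stop if it is not installed.
BIBERBIN=$(command -v biber) || exit 1

for i in *.bib; do
    # Skip the unexpanded pattern when no .bib files match.
    [ -e "$i" ] || continue
    "$BIBERBIN" --tool --configfile sort.conf --outfile "$i" "$i"
done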

Thanks for the great support and the adjustment! I'll update to 2.14 DEV then and adjust my script accordingly. ;)

tnull commented 5 years ago

Ah, I just tried your solution: now the biber tool adds all relevant entries of crossref.bib to test.bib, right? How can I keep them separate? Is there a way to either simply ignore the crossref entries or do the checking without adding them to the bib file?

plk commented 5 years ago

I need to check this later. So before the change, when you called this on test.bib alone, there were no warnings and only the test.bib entries were sorted and written to the output?

tnull commented 5 years ago

Yes, before the change, test.bib would just be sorted and unified, but its entries would still have their crossref fields.

Now, with your 2.14-DEV solution I can decide whether to call the biber tool with the addition of the crossref.bib (which then adds all relevant @proceedings-entries from crossref.bib to test.bib), or to have all crossrefs removed from test.bib.

plk commented 5 years ago

My mistake - this was a change in behaviour as there were other requests to process and remove non-existent crossrefs in tool mode. I have added a new biber option --tool-noremove-missing-dependants which is in 2.14 DEV and which disables this new behaviour for cases like yours where crossrefs etc. are not available by design.
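
A minimal sketch of how the in-place sorting script from earlier in the thread might use this option, assuming biber 2.14 or later and a sort.conf in the current directory. The function name `sort_bib_in_place` and the `BIBER` override variable are hypothetical and not part of biber; setting `BIBER=echo` gives a dry run that only prints the arguments.

```shell
#!/bin/sh
# Sketch: sort each .bib file in place while leaving crossref fields that
# point to entries kept in other files untouched.
# Assumption: biber >= 2.14, which provides --tool-noremove-missing-dependants.
# BIBER is a hypothetical override variable (e.g. BIBER=echo for a dry run).

sort_bib_in_place() {
    for f in "$@"; do
        # Skip an unexpanded glob pattern or a missing file.
        [ -e "$f" ] || continue
        "${BIBER:-biber}" --tool --tool-noremove-missing-dependants \
            --configfile sort.conf --outfile "$f" "$f"
    done
}
```

Called as `sort_bib_in_place *.bib`, this processes each file on its own, so nothing from crossref.bib ends up in the other files.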

tnull commented 5 years ago

Okay, thank you very much for the time and effort!

One last question: is there a way for me to have my crossrefs checked (and get warnings if some are missing/misspelled), without having them removed or merged? If not, I'll be more than happy to just use --tool-noremove-missing-dependants and keep doing what I've been doing.

plk commented 5 years ago

I have changed it so that the warning always happens but the option decides whether or not the missing ref is removed. There is no real "merge" per se: tool mode only writes out one file, so if you read in multiple files, there is only one place for the data to go ...
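
Since the warning is always emitted, the missing crossrefs can be collected from biber's output. The following is a rough sketch, not a biber feature: the function name `list_missing_crossrefs` is hypothetical, and it assumes the warning wording shown in the 2.13 log earlier in this thread, which may differ in other versions.

```shell
#!/bin/sh
# Sketch: read biber log lines on stdin and print "entry: crossref" for each
# "didn't find a database entry for crossref" warning.
# Assumption: the warning text matches the 2.13 output quoted in this thread.

list_missing_crossrefs() {
    sed -n "s/.*WARN - I didn't find a database entry for crossref '\([^']*\)' in entry '\([^']*\)'.*/\2: \1/p"
}
```

One way to use it would be `list_missing_crossrefs < test.bib.blg`, taking the log file name from the run above and assuming the warnings are also written to that log file.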

tnull commented 5 years ago

Okay, thank you, seems to work as expected now!