plk / biber

Backend processor for BibLaTeX
Artistic License 2.0

support an extending config for tool mode #275

Closed 927589452 closed 5 years ago

927589452 commented 5 years ago

As worked out here, it would be useful to be able to supply biber in --tool (or other) mode with an --extends-config option, to support adding application-specific fields even in tool mode.

This is currently possible by copying a full configuration and editing it, as described here.

The line that makes it hard to keep up with the original biblatex configuration is:

Copy that (yes, all that) in your clean-bibfiles.conf, just below your sourcemap specifications, as you had them originally defined.

To work around this, I would like to propose the following solution. With this being the system-provided file

<?xml version="1.0" encoding="UTF-8"?>
<config>
    <fields>
      <field fieldtype="field" datatype="literal">timestamp</field>
      <field fieldtype="field" datatype="literal">someother</field>
    </fields>
    <entryfields>
      <field>timestamp</field>
      <field>someother</field>
    </entryfields>
</config>

and this being the user-provided file

<?xml version="1.0" encoding="UTF-8"?>
<config>
    <fields>
      <field fieldtype="field" datatype="literal">source</field>
      <field fieldtype="field" datatype="othertype">someother</field>
    </fields>
    <entryfields>
      <field>source</field>
      <field>someother</field>
    </entryfields>
</config>

the file used internally would be

<?xml version="1.0" encoding="UTF-8"?>
<config>
    <fields>
      <field fieldtype="field" datatype="literal">timestamp</field>
      <field fieldtype="field" datatype="literal">source</field>
      <field fieldtype="field" datatype="othertype">someother</field>
    </fields>
    <entryfields>
      <field>timestamp</field>
      <field>someother</field>
      <field>source</field>
    </entryfields>
</config>

That is, there should be a defined merge behaviour: a user-provided field should probably supersede a default field of the same name, and fields that do not exist in the defaults should be added, etc.

I think this would make using tool mode with custom data models much easier.
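
A minimal sketch of the merge proposed above, written in Python purely for illustration (biber itself is written in Perl, and this is not how biber implements anything). The helper names merge_section and merge_configs are hypothetical; the two inputs are the system and user files from this comment, without the XML declaration.

```python
import xml.etree.ElementTree as ET

SYSTEM_CONF = """<config>
    <fields>
      <field fieldtype="field" datatype="literal">timestamp</field>
      <field fieldtype="field" datatype="literal">someother</field>
    </fields>
    <entryfields>
      <field>timestamp</field>
      <field>someother</field>
    </entryfields>
</config>"""

USER_CONF = """<config>
    <fields>
      <field fieldtype="field" datatype="literal">source</field>
      <field fieldtype="field" datatype="othertype">someother</field>
    </fields>
    <entryfields>
      <field>source</field>
      <field>someother</field>
    </entryfields>
</config>"""


def merge_section(base, extra, tag):
    """Collect the <field> children of section `tag`; `extra` wins on name clashes."""
    merged = {}  # field name -> attributes, kept in first-seen order
    for root in (base, extra):
        section = root.find(tag)
        if section is None:
            continue
        for field in section.findall("field"):
            merged[field.text.strip()] = dict(field.attrib)
    return merged


def merge_configs(system_xml, user_xml):
    """Hypothetical helper: build a merged <config> element from two config strings."""
    system, user = ET.fromstring(system_xml), ET.fromstring(user_xml)
    result = ET.Element("config")
    for tag in ("fields", "entryfields"):
        section = ET.SubElement(result, tag)
        for name, attrib in merge_section(system, user, tag).items():
            ET.SubElement(section, "field", attrib).text = name
    return result


merged = merge_configs(SYSTEM_CONF, USER_CONF)
print(ET.tostring(merged, encoding="unicode"))
```

In the merged result, someother carries the user's datatype and source is added. The sketch keeps fields in first-seen order, which is an arbitrary choice.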

plk commented 5 years ago

I agree that this would be a bit nicer but it is very complex to merge like this - it needs checks for duplicates and conflicts etc. and is a relatively niche requirement. Generally, once you have a merged file, it doesn't need much maintenance. I could add an option to dump a skeleton .conf file containing the default options, including the datamodel?
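
To make the "checks for duplicates and conflicts" concrete, here is a rough Python sketch of one such check, under the assumption that a conflict means the same field name declared with different attributes in the two files. The function names are hypothetical and this is not biber's code.

```python
import xml.etree.ElementTree as ET

BASE = """<config><fields>
  <field fieldtype="field" datatype="literal">timestamp</field>
  <field fieldtype="field" datatype="literal">someother</field>
</fields></config>"""

USER = """<config><fields>
  <field fieldtype="field" datatype="literal">source</field>
  <field fieldtype="field" datatype="othertype">someother</field>
</fields></config>"""


def field_definitions(conf_xml):
    """Map each field name found under a <fields> element to its attributes."""
    root = ET.fromstring(conf_xml)
    return {f.text.strip(): dict(f.attrib) for f in root.findall(".//fields/field")}


def find_overlaps(base_xml, user_xml):
    """Return (duplicates, conflicts) among fields declared in both files.

    duplicates: same name, identical attributes (harmless repetition)
    conflicts:  same name, different attributes (a merge rule is needed)
    """
    base, user = field_definitions(base_xml), field_definitions(user_xml)
    shared = sorted(set(base) & set(user))
    duplicates = [n for n in shared if base[n] == user[n]]
    conflicts = [n for n in shared if base[n] != user[n]]
    return duplicates, conflicts


duplicates, conflicts = find_overlaps(BASE, USER)
print("duplicates:", duplicates)
print("conflicts:", conflicts)
```

With these two inputs only someother is reported as a conflict. Settings that depend on other settings would need checks beyond this.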

moewew commented 5 years ago

Obviously there is a theoretical risk of duplicates and conflicts, but I'm wondering how relevant that is.

It appears to me that if it is possible to read the additional .conf after the main .conf with the rule that the last definition wins, we'd get quite far already. (There might be issues with settings that depend on other settings, but this is probably not that big an issue for the data model.)

plk commented 5 years ago

Please try the DEV version. Note that you must make sure your config file is in the correct format and follows the structure of the generic .conf:

<?xml version="1.0" encoding="UTF-8"?>
<config>
  <mincrossrefs>5</mincrossrefs>
  <sortingtemplate name="tool">
    <sort order="1">
      <sortitem order="1">citeorderX</sortitem>
    </sort>
  </sortingtemplate>
  <datamodel>
    <fields>
      <field fieldtype="field" datatype="literal">newliteralfield</field>
    </fields>
  </datamodel>
</config>

This should merge on top of the default tool mode .conf.

927589452 commented 5 years ago

On 19-07-26 14:22:26, plk wrote:

> I could add an option to dump a skeleton .conf file containing the default options, including the datamodel?

Currently I would get this in two steps, from biber --tool-conf and copying it.

> Generally, once you have a merged file, it doesn't need much maintenance.

The idea is to provide project-specific configuration, for example for JabRef to supply to its users. As the BibLaTeX and biber combination should be version-locked, such a project would have to either provide a configuration for each version, have users edit the configuration themselves, or have users not use --tool mode.

> Obviously, there is a theoretical risk of duplicates and conflicts, but I'm wondering how relevant that is.

> I agree that this would be a bit nicer but it is very complex to merge like this - it needs checks for duplicates and conflicts etc. and is a relatively niche requirement.

Though is this really a risk if there is a defined order of overriding, like this?

> It appears to me that if it is possible to read the additional .conf after the main .conf with the rule that the last definition wins, we'd get quite far already. (There might be issues with settings that depend on other settings, but this is probably not that big an issue for the data model.)

If this is the behaviour, one could probably even chain multiple extending config files.
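
As a rough illustration of chaining under a "last definition wins" rule, the following Python sketch layers any number of configs in order. The configs are simplified to plain name-to-datatype mappings, and the layer names (default, project, personal) and the function names are made up for the example; none of this reflects biber internals.

```python
from functools import reduce


def apply_layer(merged, layer):
    """Return a new mapping in which definitions from `layer` replace earlier ones."""
    return {**merged, **layer}


def chain_configs(*layers):
    """Apply config layers left to right; the rightmost definition of a field wins."""
    return reduce(apply_layer, layers, {})


# Hypothetical layers: system default, project-wide extension, personal override.
default = {"timestamp": "literal", "someother": "literal"}
project = {"source": "literal", "someother": "othertype"}
personal = {"someother": "verbatim"}

effective = chain_configs(default, project, personal)
print(effective)
```

Because the result depends only on the order of the layers, swapping project and personal changes which definition of someother survives.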

plk commented 5 years ago

Please try the DEV version on Sourceforge. It is now possible to include only changed/additional datamodel elements in the user-supplied config file rather than replicating the entire datamodel.

927589452 commented 5 years ago

> Please try the DEV version on Sourceforge. It is now possible to include only changed/additional datamodel elements in the user-supplied config file rather than replicating the entire datamodel.

Sorry for taking so long; I have to build for FreeBSD (uname -a reports: FreeBSD hostname 12.0-RELEASE FreeBSD 12.0-RELEASE r341666 GENERIC amd64 1200086) and am struggling a bit.

927589452 commented 5 years ago

I am pretty sure this build is broken (on my machine):

 % biber --tool literature.bib
INFO - This is Biber 2.13 (beta) running in TOOL mode
INFO - Logfile is 'literature.bib.blg'
INFO - Globbing data source 'literature.bib'
INFO - Globbed data source 'literature.bib' to literature.bib
INFO - Looking for bibtex format file 'literature.bib'
INFO - LaTeX decoding ...
INFO - Found BibTeX data source 'literature.bib'
[1]    45144 bus error (core dumped)  biber --tool literature.bib

If you could provide me with a FreeBSD build, I would love to test it.

927589452 commented 5 years ago

Ok, it seems to be running

$ ./bin/biber -v
biber version: 2.13 (beta)

but it seems to be working incorrectly. These two give the timestamp error:

./bin/biber --conf extending.cfg2 --tool --validate-datamodel literature_min.bib --debug > null_extended.log 2>&1
./bin/biber --tool --validate-datamodel literature_min.bib --debug > no_extended.log 2>&1

(Attachments: no_extended.log, null_extended.log.) And this one doesn't like the replacement:

./bin/biber --conf extending.cfg --tool --validate-datamodel literature_min.bib --debug > extended.log 2>&1

(Attachment: extended.log.) All were run against:

% Encoding: UTF-8
@BOOK{Eis2,
AUTHOR = {Eisenbud, David},
YEAR = {2006},
TITLE = {The Geometry of Syzygies - A Second Course in Algebraic Geometry and Commutative Algebra},
EDITION = {},
ISBN = {978-0-387-26456-1},
PUBLISHER = {Springer Science \& Business Media},
TIMESTAMP={20190731},
ADDRESS = {Berlin Heidelberg},
}

But I am not sure whether my build is broken.

plk commented 5 years ago

No, I think your build is correct, this was a bug. Please try the latest code now. Note that you must give both the type and valid entryfields information for a new field. For example, this declares the type of the field and also that it is valid in all entrytypes:

<?xml version="1.0" encoding="UTF-8"?>
<config>
  <datamodel>
    <fields>
      <field fieldtype="field" datatype="literal">timestamp</field>
    </fields>
    <entryfields>
      <field>timestamp</field>
    </entryfields>
  </datamodel>
</config>
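
The rule above (a new field needs both a type declaration and an entryfields entry) can be checked mechanically. Below is a small Python sketch that lists fields declared under fields but not mentioned under entryfields. The function name is hypothetical, the sample config deliberately omits source from entryfields, and real configs may be richer than this handles.

```python
import xml.etree.ElementTree as ET

CONF = """<config>
  <datamodel>
    <fields>
      <field fieldtype="field" datatype="literal">timestamp</field>
      <field fieldtype="field" datatype="literal">source</field>
    </fields>
    <entryfields>
      <field>timestamp</field>
    </entryfields>
  </datamodel>
</config>"""


def fields_missing_entryfields(conf_xml):
    """Return declared field names that never appear under an <entryfields> element."""
    root = ET.fromstring(conf_xml)
    declared = {f.text.strip() for f in root.findall(".//fields/field")}
    allowed = {f.text.strip() for f in root.findall(".//entryfields/field")}
    return sorted(declared - allowed)


missing = fields_missing_entryfields(CONF)
print("declared but not valid in any entry type:", missing)
```

Here source is flagged, since it has a type but no entryfields entry.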

927589452 commented 5 years ago

OK, sorry for taking so long. I didn't see a difference in the outputs at first, but there was one: I had always run with --validate-datamodel, and there seems to be a bug in --validate-datamodel:

./bin/biber --configfile=extending.cfg --tool literature_min.bib --output-file literature_min_bibertool_novalidate.bib.txt --debug > novalidate.log.txt 2>&1
./bin/biber --configfile=extending.cfg --tool --validate-datamodel literature_min.bib --output-file literature_min_bibertool_validate.bib.txt --debug > validate.log.txt 2>&1

with literature_min.bib as before and extending.cfg being the one you provided.

Attachments: literature_min_bibertool_novalidate.bib.txt, novalidate.log.txt, literature_min_bibertool_validate.bib.txt, validate.log.txt

plk commented 5 years ago

I need to see the contents of extended.cfg and also a comparison with the same .bib - the two examples use different entrytypes (book and misc).

927589452 commented 5 years ago

> I need to see the contents of extended.cfg and also a comparison with the same .bib - the two examples use different entrytypes (book and misc).

With literature_min.bib as before and extending.cfg being the one you provided.

Attachments: literature_min.bib.txt, extending.cfg.txt

The validating log shows this:

WARN - Datamodel: Entry 'Eis2' (literature_min.bib): Invalid entry type 'book' - defaulting to 'misc'

plk commented 5 years ago

Are you sure you are using the latest DEV code? This is what I get with your .bib and .conf which means it all seems to be working fine.

> biber --tool -g t.conf --validate-datamodel t.bib
INFO - This is Biber 2.13 (beta) running in TOOL mode
INFO - Config file is 't.conf'
INFO - Logfile is 't.bib.blg'
INFO - Globbing data source 't.bib'
INFO - Globbed data source 't.bib' to t.bib
INFO - Looking for bibtex format file 't.bib'
INFO - LaTeX decoding ...
INFO - Found BibTeX data source 't.bib'
INFO - Overriding locale 'en_US' defaults 'normalization = NFD' with 'normalization = prenormalized'
INFO - Overriding locale 'en_US' defaults 'variable = shifted' with 'variable = non-ignorable'
INFO - Sorting list 'tool/global//global/global' of type 'entry' with template 'tool' and locale 'en_US'
INFO - No sort tailoring available for locale 'en_US'
INFO - Writing 't_bibertool.bib' with encoding 'UTF-8'
INFO - Output to t_bibertool.bib

927589452 commented 5 years ago

> Are you sure you are using the latest DEV code? This is what I get with your .bib and .conf which means it all seems to be working fine.

> biber --tool -g t.conf --validate-datamodel t.bib

Yes, I get the same, but if I use --validate-datamodel it turns the output bib into misc and throws the errors. See here:

./bin/biber --configfile=extending.cfg --tool literature_min.bib --output-file literature_min_bibertool_novalidate.bib.txt --debug > novalidate.log.txt 2>&1
./bin/biber --configfile=extending.cfg --tool --validate-datamodel literature_min.bib --output-file literature_min_bibertool_validate.bib.txt --debug > validate.log.txt 2>&1

927589452 commented 5 years ago

Though maybe I just didn't do the build correctly.

plk commented 5 years ago

I think it's likely as I just re-ran your example and it validates fine with no errors and outputs a @BOOK entry with the --validate-datamodel option.

gusbrs commented 10 months ago

For the record, I was just now made aware of this change through an answer at TeX.SX by @plk, and I've updated my related answers there, since I think they are good overall references on the matter. This makes sense from TeX.SX's perspective but, since the original issue here used one of them to motivate this improvement, one now needs to check the edit history of the answer to find the quote from the OP.

Btw, nice improvement. Thanks!