plk / biber

Backend processor for BibLaTeX
Artistic License 2.0
336 stars 37 forks source link

Biber became extremely slow #62

Closed simifilm closed 9 years ago

simifilm commented 9 years ago

I'm currently on the latest dev builds of 'biber', and somehow it got extremely slow. I can't give you exact figures and I know that the duration of a 'biber' run depends on several variables. But where before I was somewhere between 20 and 40 seconds for a 'biber' run, I'm now – with a comparable project – in the regions of more than five minutes. 'biber' remains forever at the 'Looking for bibtex format file' stage. It's a 6.2MB file with almost 6700 entries. While this is probably big, it's the same file I've been working with for years. It got, of course, bigger over time, but always gradually. There wasn't a huge increase which would explain the slowing down.

This is on OSX 10.10.2 with Perl 5.20.1.

plk commented 9 years ago

Are you using the raw perl version with your own perl or the actual OSX binary builds? Do you have a MWE with your bib file I can use to debug?

simifilm commented 9 years ago

I'm using my own builds with the latest additions from Github and the current perl from macports.


\usepackage[utf8]{inputenc}

\usepackage[style=fiwi]{biblatex}

\addbibresource{germanistik2.bib}

\begin{document}

\cite[]{Spiegel.S:2010b}

\printbibliography

\end{document}

You can get the .bib file here: https://www.dropbox.com/s/qm89wr1v4tuhlha/germanistik2.bib?dl=0

plk commented 9 years ago

This exact MWE with 1 cite is a problem for you? It takes about a second for me ...

simifilm commented 9 years ago

Biber needs about two minutes here.

Note: I'm not talking about LaTeX compilation but about the time biber needs.

simifilm commented 9 years ago

I timed it now – in debug mode – and it took almost four minutes. Something else I noticed: perl grabs huge amounts of memory during this. Activity monitor show more than 1.6GB.

You can get the log from debug mode here: https://dl.dropboxusercontent.com/u/35828/Untitled.blg

plk commented 9 years ago

You are building your own binary? What it you just install the perl version from the debug tree and run it without building a binary? (I mean, run "Build install" from the checked out biber DEV source and then run the script that this installs instead of a built binary).

simifilm commented 9 years ago

I'm not sure I understand you, maybe there's a problem of nomenclature. What I do is I get the latest changes from GitHub and then do the following sequence:

perl Build.PL (if necessary)
build
sudo build install

Is this what you mean by "run the script"?

plk commented 9 years ago

Ah, ok, that's fine. I wasn't sure if you were really running the scripts in the dist/ subdir to create an actual binary but if not, we're testing with the same setup.

plk commented 9 years ago

According to your log, there are some user sourcemaps? If you try this MWE which I am using, what do you see?

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[style=fiwi]{biblatex}
\addbibresource{test5.bib}
\begin{document}
\cite[]{Spiegel.S:2010b}
\printbibliography
\end{document}

For me this takes a few seconds. I see in your log that all of the extra time is going in the bibtex caching stage. Do you have the latest version of Text::BibTeX installed?

simifilm commented 9 years ago

What's the content of test5.bib in your example. The same as my original .bib file?

plk commented 9 years ago

Yes, it's the one from your link above version:

%% Created for Simon Spiegel at 2015-03-12 22:42:46 +0100 
simifilm commented 9 years ago

Your MWE with my germanistik2.bib file takes more than 90s for biber to finish. Text::BibTeX is up to date.

Here's my biber.conf file:

<?xml version="1.0" encoding="UTF-8"?>
<config>
<sourcemap>
<maps datatype="bibtex" map_overwrite="1">
    <map map_overwrite="1">
    <map_step map_field_source="AUTHOR"
            map_match="Csicsery-Ronay, Istvan, Jr."
            map_replace="Csicsery-Ronay, Istvan"/>
      </map>
</maps>      
</sourcemap>
</config>
plk commented 9 years ago

Hmm, even with your .conf file, it takes about 4 seconds for me. What version of perl are you using exactly?

simifilm commented 9 years ago

This is perl 5, version 20, subversion 1 (v5.20.1) built for darwin-thread-multi-2level

plk commented 9 years ago

Hmm, same as me. perl Build.PL; Build installdeps doesn't say you need to update any modules?

simifilm commented 9 years ago

Nope, everything there.

plk commented 9 years ago

Ok, what about trying the latest DEV binary version from SF - let's see if that gives the same problems.

simifilm commented 9 years ago

This is indeed much quicker, a few seconds …

plk commented 9 years ago

Then I think it must be something to do with your perl install. One suspicion - try this - it should just return nothing if all is ok:

perl -MList::MoreUtils::XS -e 'exit 0;'
simifilm commented 9 years ago

This does indeed return nothing.

plk commented 9 years ago

Ok, try a CPAN force reinstall of Text::BibTeX since that seems to be where it's taking all the time.

simifilm commented 9 years ago

Tried that, made no noticeable difference. BTW I see the same behavior on two different machines.

plk commented 9 years ago

Generate a module list on those machines with "cpan -l" and then do a diff against mine:

https://www.dropbox.com/s/6bv8e7smcnwtmf3/modlist.txt?dl=0

simifilm commented 9 years ago

I hope I did that correctly. Here's the diff file: https://dl.dropboxusercontent.com/u/35828/diff.txt

plk commented 9 years ago

Can you make a unified diff with the "-u" flag?

simifilm commented 9 years ago

There you go: https://dl.dropboxusercontent.com/u/35828/diff.txt

plk commented 9 years ago

Seems to be the change from Unicode::Normalize 1.17 to 1.18. Trying to find out why.

simifilm commented 9 years ago

I downgraded to Unicode::Normalize 1.17 by hand and it did indeed result in the speed I was used to.

plk commented 9 years ago

Yes, I spoke to the Unicode::Normalize author and he confirmed that 1.18 dropped the XS module due to some problematic perl 5.20 changes so the recommendation is for 1.17. I will change the biber Build.PL to force only 1.17 to be acceptable.