gittup / tup

Tup is a file-based build system.
http://gittup.org/tup/
GNU General Public License v2.0

better (La)TeX support: allow circular dependencies #89

Open majewsky opened 12 years ago

majewsky commented 12 years ago

The problem with using LaTeX in make or tup is that it generates output files which are its own input, so running latex multiple times changes the output. The issue is that LaTeX cannot go back to pages it has already finished rendering.

For example, consider a large document that has a table of contents at the beginning. LaTeX's \tableofcontents command reads chapters and sections from "foo.aux" (where "foo.tex" is the actual source file). On the subsequent pages, \chapter and \section commands write their location into "foo.aux" (which is truncated AFAIK when the first \chapter or \section is encountered).

So you need to run latex at least twice: once to fill foo.aux and once to read it. Worse yet, if the TOC is longer than a page, you actually need three runs: Inserting the TOC in the second run will change the page numbers of chapters and sections appearing thereafter, so foo.aux changes again.
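The feedback loop can be reproduced without LaTeX. The following is a toy model only, not real LaTeX behaviour: toy_latex, the three one-page chapters, and the rule that the TOC takes one page as soon as the .aux file is non-empty are all assumptions made up for illustration.

```shell
#!/bin/sh
# Toy model of the TOC feedback loop (NOT real LaTeX).
# Assumptions: 3 chapters of one page each; the TOC occupies one page
# once the aux file lists chapters, and zero pages while it is empty.
aux=$(mktemp)

toy_latex() {
    # Read phase: the TOC is sized from what the previous run left behind.
    if [ -s "$aux" ]; then toc_pages=1; else toc_pages=0; fi
    # Write phase: truncate, then record where each chapter lands this run.
    : > "$aux"
    for chapter in 1 2 3; do
        echo "chapter $chapter page $((toc_pages + chapter))" >> "$aux"
    done
}

for run in 1 2 3; do
    toy_latex
    echo "run $run: $(tr '\n' ';' < "$aux")"
done
rm -f "$aux"
```

In this model the aux contents after run 1 and run 2 differ, because the TOC pushes every chapter back by one page, while run 2 and run 3 produce identical contents. An unchanged aux file is therefore the signal that the last run already typeset from correct data.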

The quick and dirty solution is

: foo.tex |> pdflatex %f && pdflatex %f && pdflatex %f |> %B.pdf | %B.aux

This is of course bad since it runs pdflatex three times every time, even when it does not need to. A better solution is LaMake (http://gitorious.org/lamake/lamake/), a custom build tool I wrote some time ago, which checks whether a pdflatex run changes the MD5 sum of the .aux file and reruns pdflatex if it does.

Of course, I would like to use tup for the job, but as far as I can see, tup does not allow the required circular dependency of

build foo.tex -> check foo.aux -> rebuild foo.tex

Is there any way we can have that? For example:

: foo.tex |> pdflatex %f |> %B.pdf | %B.aux
: foo.aux |> checkaux.sh %f && tup mark-outdated %B.tex |>
gittup commented 11 years ago

In reply to Stefan Majewsky's message of Mon, Dec 3, 2012 at 4:31 AM:

I would much prefer if pdflatex just finished its work on the first run. I know of no other program that is that lazy. Imagine if you had to run gcc over and over until the .o file stopped changing, or run 'ls' multiple times because it only listed half the files each time.

For my own .tex file I had been doing:

: esc.tex | {images} |> ^ LATEX %f^ for i in $(seq 1 2); do latex -interaction=batchmode -halt-on-error %f; done |> %B.dvi %B.log %B.aux
: esc.dvi |> ^ DVIPDF %o^ dvipdf %f %o |> paper.pdf

But this just runs it twice always, which it sounds like you're saying is wrong in some cases and does too much work in others. Using your md5sum idea, can we just write a shell script that does:

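$ cat mylatex.sh
#! /bin/sh
# Rerun latex until the checksum of the .dvi output stops changing.

output=$(basename "$1" .tex).dvi
latex -interaction=batchmode -halt-on-error "$1" || exit 1

sum=$(md5sum "$output")
while true; do
    # Stop on a failed run instead of comparing a stale .dvi.
    latex -interaction=batchmode -halt-on-error "$1" || exit 1
    newsum=$(md5sum "$output")
    echo "Sum, newsum: $sum, $newsum"
    if [ "$sum" = "$newsum" ]; then
        exit 0
    fi
    sum="$newsum"
done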

Here I'm using latex, but I assume the same idea works for pdflatex as well. Now I can write my rule like this:

: esc.tex | {images} |> ^ LATEX %f^ ./mylatex.sh %f |> %B.dvi %B.log %B.aux
: esc.dvi |> ^ DVIPDF %o^ dvipdf %f %o |> paper.pdf

What do you think?

-Mike

droundy commented 10 years ago

I agree that handling of circular dependencies would be awesome (and magically make latex do the right thing). But also scary, since it sounds like something that could lead to infinite loops.
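One way to keep the rerun-until-stable idea without risking an endless loop is to cap the number of runs. This is a minimal sketch, not something tup provides: run_until_stable, its argument order, and its return codes are hypothetical, and it assumes md5sum is available, as in the script earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: rerun a command until a watched file stops changing,
# giving up after MAX_RUNS so a document that never converges cannot loop
# forever.
# Usage: run_until_stable MAX_RUNS WATCHED_FILE COMMAND [ARGS...]
# Returns 0 on a fixed point, 1 if the command fails, 2 if the cap is hit.
run_until_stable() {
    max_runs=$1
    watched=$2
    shift 2

    old_sum=""
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        "$@" || return 1                  # propagate a failing run
        runs=$((runs + 1))
        new_sum=$(md5sum < "$watched")
        if [ "$new_sum" = "$old_sum" ]; then
            return 0                      # same checksum twice in a row
        fi
        old_sum=$new_sum
    done
    echo "no fixed point after $max_runs runs: $watched" >&2
    return 2
}

# Example call (not executed here):
#   run_until_stable 5 foo.aux pdflatex -interaction=batchmode -halt-on-error foo.tex
```

Because the first comparison is against an empty checksum, the command always runs at least twice. The cap turns a non-converging document into a build error rather than a hang.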

adzenith commented 9 years ago

GCC is also able to handle circular dependencies just fine.

a.h:

#include "b.h"

b.h:

#include "a.h"

gcc a.h:

b.h:1:15: error: #include nested too deeply

Seems like latex should be able to figure it out. Looks like I'll sadly be sticking with Makefiles for now.

adzenith commented 9 years ago

I'm going to try http://users.phys.psu.edu/~collins/latexmk/ and see what I can see.

adzenith commented 9 years ago

I got it working with:

#we need to export HOME, because pdflatex uses it to figure out where ~/texmf is
export HOME
: foreach *.tex |> latexmk -bibtex -pdf %f |> %B.pdf | %B.aux %B.bbl %B.blg %B.fdb_latexmk %B.fls %B.lof %B.log %B.lot %B.toc

It's not a great solution, because latexmk is doing work that tup should be doing. I might spend some more time on it later, but thought I'd share this for now.

AndydeCleyre commented 1 month ago

Some other examples: