shiblon / latex-makefile

A Makefile for LaTeX - drop it in, type make, and magic happens.

Make hangs the third time after running make clean #113

Closed shiblon closed 8 years ago

shiblon commented 8 years ago

Originally reported on Google Code with ID 100

I think there might be a recent regression in the Makefile.  I started noticing today
that make will occasionally hang.  Top shows that make is working away busily:

 8647 amcnabb   20   0  100m 1172  792 R 100.0  0.0   2:57.70 make

This seems to only happen when the file is being rebuilt (i.e., make clean had _not_
just been run).  If I interrupt make after it hangs, it will hang again the next time
I run it.  However, running make clean seems to temporarily fix things.  Specifically,
after each clean, building with make will work twice, and then the third invocation
will hang.

I'm posting a few files that are a fairly minimal reproducing case.  I've found that
if I remove the "epsfig" package, the "IEEEtran" bibliography style, the citation,
or the figure, the problem seems to stop.  Weird.

Reported by amcnabb8 on 2010-11-09 19:51:24


shiblon commented 8 years ago
epsfig is broken with this makefile and is simply not supported.  I have no intention
of supporting it, either, since graphicx has largely replaced it for many years now.
 Since removing epsfig fixes the problem, marking WontFix.

Reported by shiblon on 2010-11-11 14:55:23

shiblon commented 8 years ago
How about an error: "Hey, you used the epsfig package.  Get rid of it--it's evil." or
something like that? :)  I ran into this because I was compiling a tex file that was
started by someone else--I personally haven't used epsfig in a long time.  However,
tracking down the infinite loop was a fairly tedious experience.

Reported by amcnabb8 on 2010-11-11 17:41:28

shiblon commented 8 years ago
By the way, I just replaced "epsfig" with "graphicx" in the file, and it's still hanging.
:(

Reported by amcnabb8 on 2010-11-11 17:44:32

shiblon commented 8 years ago
OK.  I can't look into this right now because it is going to involve some lengthy triage.
 I'll leave it as "New" until we can look at it again later.  If you end up discovering
anything else useful in the course of your work, please do post it :)

Reported by shiblon on 2010-11-11 17:47:02

shiblon commented 8 years ago
I'm now seeing this for a file where I'm not using epsfig, and now make starts using
100% CPU on the first invocation.  I'm having a hard time figuring out what might be
causing it.

Reported by amcnabb8 on 2010-12-21 18:08:35

shiblon commented 8 years ago
This can happen if, for example, sed is run incorrectly and expects input from the terminal.
 Does Ctrl-D do anything in the hang case?

Also, very slow runs can be caused by LANG != C settings.  I try to force that issue
in the Makefile, but you might try doing

LANG=C make

and seeing if that makes things any faster.

Sed in particular gets very confused when it encounters international characters unless
it has that environment set.

Reported by shiblon@google.com on 2010-12-21 18:35:22

shiblon commented 8 years ago
Hitting CTRL-D does not seem to help (and make is using 100% of the CPU, not just stalling
for input).

I just ran "LANG=C make", but this didn't seem to help.

Is there anything else I can try?  The VERBOSE mode didn't give any output, so I'm
not sure what else to look at.

Thanks!

Reported by amcnabb8 on 2010-12-21 19:17:45

shiblon commented 8 years ago
You can try -d to get make debug output (stuff about dependency resolution, which probably
doesn't help you).

You can also set 
make SHELL_DEBUG=1
to get info on what the shell is doing.  That is probably going to be the most informative
in your case.

Of course, the options can be combined, as well.

Reported by shiblon on 2010-12-22 03:33:08

shiblon commented 8 years ago
Sorry for not responding (I had been out of town).  Anyway, I tried SHELL_DEBUG earlier
even though I forgot to mention it in my post.  The output wasn't particularly helpful:

+ which tput
+ /usr/bin/tput setaf 0
+ /usr/bin/tput setaf 1
+ /usr/bin/tput setaf 2
+ /usr/bin/tput setaf 3
+ /usr/bin/tput setaf 4
+ /usr/bin/tput setaf 5
+ /usr/bin/tput setaf 6
+ /usr/bin/tput setaf 7
+ /usr/bin/tput bold
+ /usr/bin/tput smul
+ /usr/bin/tput sgr0

As far as I can tell, it's just doing ASCII stuff.

However, the "make -d" output seems much more informative.  In particular, the last
thing before hanging is:

Reading makefile `sphere50_sweep_complete'\ no\ t\ found.\ See\ the\ LaTeX\ manual\
or\ LaTeX\ Companion\ for\ explanation.\ Type\ H\ <return>\ for\ immediate\ help.\
...\ l.309\ ...width=\figwidth]{sphere50_sweep_complete}\ I\ could\ not\ locate\ the\
file\ with\ any\ of\ these\ extensions:\ .png,.pdf,.jpg,.mps,.jpeg,.jbig2,.jb2.PNG,.PDF,.JPG,.JPEG,.JBIG2,.JB2\
Try\ typing\ <return>\ to\ proceed.\ If\ that\ doesn't\ work,\ type\ X\ <return>\ to\
quit.\ .gpi.d' (search path) (don't care) (no ~ expansion)...

It looks like there's a parsing problem: it looks like there's a missing file, and
instead of spitting out a warning and continuing, it's getting confused while trying
to open a non-existent file whose name corresponds to the error message.  So, I can
work around this by making sure that the file exists, and I think this is probably
enough information to track down the bug.

Reported by amcnabb8 on 2010-12-31 19:17:47

shiblon commented 8 years ago
One clarification: in this case the gpi file does exist, but the associated pdf has
not yet been generated.

Reported by amcnabb8 on 2010-12-31 19:21:06

shiblon commented 8 years ago
And if I change the source file to say \includegraphics{sphere50_sweep_complete.pdf}
where it used to say \includegraphics{sphere50_sweep_complete}, this seems to help.
 If the extension is mandatory for the makefile to work, maybe it would be sufficient
to detect that there is no extension and to print an error.

Reported by amcnabb8 on 2010-12-31 19:27:38

shiblon commented 8 years ago
Aha!  We have a sed script bug.  If you would be so kind as to package up all of your
.log and .d files, that would be most informative.  What's happening is the .d file
that contains graphics dependencies is trying to include a corresponding .gpi.d file,
but it is getting the filename wrong.  If you look in the various .d files, you'll
see that erroneous filename as a dependency in there, which means that dependency extraction
is broken.

Anyway, the .log and .d files would really help, here.

Reported by shiblon@google.com on 2010-12-31 19:32:13

shiblon commented 8 years ago
I assume you would like them with the ".pdf" missing from the \includegraphics line,
right?

Reported by amcnabb8 on 2010-12-31 19:40:32

shiblon commented 8 years ago
Either way.  I can fiddle.

Reported by shiblon on 2010-12-31 19:44:19

shiblon commented 8 years ago
Okay, I'm attaching logs.tar.bz2, which are the logs corresponding to a run with the
".pdf" missing from the line (things actually seemed worse when I added it).

Reported by amcnabb8 on 2010-12-31 19:46:47


shiblon commented 8 years ago
That helped!  I found the culprit.  LaTeX, when it has long filenames, splits error
lines at arbitrary places (not natural ones), so you have to delete newlines within
each paragraph, not just replace them with spaces as I was doing.  It's a one-line
fix, but the changes to the test files were more substantial.
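
The difference between the two approaches can be sketched in a few lines of shell. This is only an illustration: the log fragment is made up, and the Makefile's real sed script works paragraph by paragraph rather than on the whole input as `tr` does here.

```shell
#!/bin/sh
# A made-up log fragment in which TeX broke a filename mid-word.
wrapped='File sphere50_sweep_comp
lete not found.'

# Replacing the newline with a space leaves a gap inside the name.
with_space=$(printf '%s' "$wrapped" | tr '\n' ' ')

# Deleting the newline restores the original text.
joined=$(printf '%s' "$wrapped" | tr -d '\n')

echo "$with_space"
echo "$joined"
```

The first line printed still has the filename split in two, so any dependency extracted from it would be wrong; the second has the name intact.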

So, there are now golden and test files for this case (I haven't put together the test
harness yet, but at least we'll know when something breaks), and this particular bug
should be fixed for you, now.

rc12a0bbd8f48

Reported by shiblon@google.com on 2010-12-31 20:23:53

shiblon commented 8 years ago
Ouch.  That's a crazy problem.

I just pulled, and it looks like there was a regression:

thesis.d:109: *** missing separator.  Stop.

I'll attach thesis.d in case it helps.

Reported by amcnabb8 on 2010-12-31 20:26:50


shiblon commented 8 years ago
Not a regression, just not a complete fix.  I'm working on it now.

Reported by shiblon on 2010-12-31 20:31:22

shiblon commented 8 years ago
Fixed.  r3d66702f1bc4

That was even nastier.  See comments for morbid details.

Reported by shiblon on 2010-12-31 20:51:37

shiblon commented 8 years ago
Wow.  I'm impressed I hit such an odd corner case right away.

Reported by amcnabb8 on 2010-12-31 20:55:05

shiblon commented 8 years ago
I think there might be one last problem.  It's now correctly turning "ring_iters_rast.gpi"
into "ring_iters_rast.pdf", but a few lines later, it gives the error:

./largeswarms._include_.tex:719: LaTeX Error: File `ring_iters_rast' not found.
make: *** [thesis.pdf] Error 1

As far as I can tell, the Makefile builds a bunch of pdf files but then neglects to
re-run pdflatex before giving an error.  If I run pdflatex by hand (or re-run make),
the file gets built without any errors, so it looks like this error is from the previous
invocation of pdflatex.  This might be clearer with some sort of log file, but I'm
not sure what would be the most helpful.

Reported by amcnabb8 on 2010-12-31 21:07:39

shiblon commented 8 years ago
Are you only having problems with this one graphics file, or all of them?  It would
help a lot if I could reproduce this over here.

Have you tried doing a clean build first, then running the new makefile version?  Also,
I assume you ran "./build" before doing "make", to get the new Makefile created from
the fixed sed scripts...

Reported by shiblon on 2010-12-31 21:31:21

shiblon commented 8 years ago
I'll answer your questions in reverse order.  Yes, I remembered to run ./build, clean,
etc., but it never hurts to ask.

It seems to be all of the graphics files.  Let me explain by showing a bit more output
from make (with some ellipses to keep it from being overwhelming):

= thesis.tex --> thesis.d thesis.pdf.1st.make (0-1) =
= comm_rast.gpi --> comm_rast.gpi.d =
...
= rbfruntime.gpi --> rbfruntime.gpi.d =
= ../../bib/ai/bib.bib ../../bib/aml/bib.bib ../../bib/mapreduce/bib.bib ../../bib/math/bib.bib
../../bib/nfl/bib.bib ../../bib/parallel/bib.bib ../../bib/pso/general/bib.bib ../../bib/pso/parallel/bib.bib
../../bib/pso/topology/bib.bib thesis.aux --> thesis.bbl =
= thesis.tex --> thesis.d thesis.pdf.1st.make (2-1) =
= rbfruntime.gpi rbfruntime.gpi.d global-gpi.sed --> rbfruntime.pdf =
...
= comm_rast.gpi comm_rast.gpi.d global-gpi.sed --> comm_rast.pdf =
./largeswarms._include_.tex:719: LaTeX Error: File `ring_iters_rast' not found.
make: *** [thesis.pdf] Error 1

So if we look at the steps that are happening:

1) run pdflatex and generate thesis.d
2) generate *.gpi.d
3) generate thesis.bbl
4) rerun pdflatex and generate thesis.d
5) make *.pdf from all of the gpi files

At this point, there should be a step to rerun pdflatex again, but that doesn't seem
to be happening.

Reported by amcnabb8 on 2010-12-31 21:45:06

shiblon commented 8 years ago
Weird.  I'll have to tackle this later.  I've already made several people grumpy by
working on this today...

I'd be interested to see what you find from make -d.  It would appear that the presence
of a new .d file is not triggering a rebuild the way it should, which is very curious.
 It may have something to do with the inclusion of multiple .bib files, which is not
something overtly supported (yet).  The presence of a bibliography definitely has an
impact on build order, so that is very suspicious.

Reported by shiblon on 2010-12-31 22:00:00

shiblon commented 8 years ago
Sorry for my part in the grumpiness you experienced. :)  And thank you _very_ much for
helping out so much today.

I just tried "make -d", and at the bottom was the following:

= comm_rast.gpi comm_rast.gpi.d global-gpi.sed --> comm_rast.pdf =
Reaping winning child 0x1e025f0 PID 32033 
Live child 0x1e025f0 (comm_rast.pdf) PID 32034 
Reaping winning child 0x1e025f0 PID 32034 
Removing child 0x1e025f0 PID 32034 from chain.
    Successfully remade target file `comm_rast.pdf'.
   Finished prerequisites of target file `thesis.pdf'.
  Must remake target `thesis.pdf'.
Invoking recipe from Makefile:2530 to update target `thesis.pdf'.
Putting child 0x1ddd220 (thesis.pdf) PID 32039 on the chain.
Live child 0x1ddd220 (thesis.pdf) PID 32039 
./largeswarms._include_.tex:719: LaTeX Error: File `ring_iters_rast' not found.
Reaping losing child 0x1ddd220 PID 32039 
make: *** [thesis.pdf] Error 1
Removing child 0x1ddd220 PID 32039 from chain.

So it looks like it's trying to update "thesis.pdf".  When I added SHELL_DEBUG=1, I
found that the process being created is the big long sed command that reads thesis.log.
 So it's updating "thesis.pdf" without calling pdflatex.

Since you're busy, I'll try to look for a few minutes, but I don't have much confidence
that I'll find the problem.  Thanks again for your help.

Reported by amcnabb8 on 2010-12-31 22:08:58

shiblon commented 8 years ago
Okay, I've looked at it some more, and I understand things better than I used to, but
I'm still a little too stuck to make progress.  So here's a summary of what I [think
I] learned.

There are two different rules for generating a pdf (one for when it knows it will have
to run again and one for when it thinks maybe this is the last time?).  The rule for
the final run doesn't seem to call get-graphics, so I wasn't quite able to figure out
how it decides if it actually needs to be re-run.  Anyway, I looked at the log-file,
and the graphics file it's dying on has the double-newline case that you found earlier.
 I think that colorize-latex-errors.sed might be getting confused by this; it doesn't
seem to have any special logic for double-newlines, and I think it might be flagging
the missing graphics file as a fatal error.

Reported by amcnabb8 on 2010-12-31 23:01:45

shiblon commented 8 years ago
I think you're onto something here.  It looks like it is parsing the graphics stuff
as an error in the same case I tried to catch in the get-graphics routine.  Fun.

Looks like r3d79aab93d7d fixes this.  I should have noticed it in the test output,
frankly, since it was wrong and sitting right in front of me.

Thanks a bunch for the excellent triage work.  That helps more than you know.  :-)

Reported by shiblon on 2011-01-01 00:10:40

shiblon commented 8 years ago
And thanks for the recent reorganization (and helpful comments) that made this triage
possible.  I may not be able to provide working fixes yet, but at least it's an improvement.
:)

Reported by amcnabb8 on 2011-01-01 00:14:11

shiblon commented 8 years ago
By the way, the way that it re-runs is this:

When make has an "include" directive, it will trigger a full parse/run of the makefile
if it detects that the included file is changed.  We rely on this logic for included
generated graphics.  We create the .d file, which has the graphics dependencies in
it, and that triggers a new make invocation.  It then runs and nothing changes in the
.d file, but the .pdf is moved out of the way so that the more complicated looping
second rule kicks in.  It's a bit tricky, to be sure, but it allows us to build things
minimally (i.e., only running latex once when there are no fatal errors and no graphics
are missing), where if we didn't do it this way it would always run at least twice,
once to generate dependencies, and once to actually build the .pdf file.
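
For readers unfamiliar with this GNU make behavior, here is a minimal sketch of the restart mechanism on its own. The file name `deps.d`, the variable `DEP`, and the target `all` are hypothetical and far simpler than the real Makefile's rules; the script assumes GNU make is installed and skips the demo otherwise.

```shell
#!/bin/sh
# Sketch: GNU make remakes a missing included file, then restarts itself.
if ! make --version 2>/dev/null | grep -q 'GNU Make'; then
    echo "GNU make not found; skipping demo"
    exit 0
fi

dir=$(mktemp -d)

# printf expands \t and \n, which gives recipe lines their required tab.
printf 'include deps.d\n\nall:\n\t@echo "built with $(DEP)"\n\ndeps.d:\n\t@echo "DEP = fresh" > deps.d\n' > "$dir/Makefile"

# deps.d does not exist yet, so make runs its rule, re-reads the
# makefiles, and only then builds "all" with DEP defined.
out=$(cd "$dir" && make 2>/dev/null)
echo "$out"

rm -rf "$dir"
```

The variable set inside the generated file is visible to the `all` recipe even though the file did not exist when make started, which is the same mechanism that lets a freshly written .d file take effect.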

Crazy stuff.  Enjoy!

Reported by shiblon on 2011-01-01 00:14:21

shiblon commented 8 years ago
Oops:

/bin/sh: command substitution: line 1: syntax error near unexpected token `('
/bin/sh: command substitution: line 1: `.*\)/{' -e '  /\n\n$/{' -e '    s/^::0:://'
-e '    b needonemore' -e '  }' -e '  s/::0::!!! //' -e '  /could not locate.*any of
these extensions:/{' -e '    d' -e '  }' -e '  s/\(not found\.\).*/\1/' -e '  b error'
-e '}' -e '/^\(.* LaTeX Error: Missing .begin.document.\.\).*/{' -e '  s//\1 --- Are
you trying to build an include file?/' -e '  b error' -e '}' -e '/^\(!!! .*Undefined
control sequence\)[^[:cntrl:]]*\(.*\)/{' -e '  s//\1: \2/' -e '  s/\nl\.[[:digit:]][^[:cntrl:]]*\(\[^\[:cntrl:]]*\).*/\1/'
-e '  b error' -e '}' -e '/^\(!pdfTeX error:.*\)s*/{' -e '  b error' -e '}' -e '/^\(!!!
[^[:cntrl:]]*\).*/{' -e '  s//\1.  See log for more information./' -e '  b error' -e
'}' -e '/.*\n\(!!! [^[:cntrl:]]*\).*/{' -e '  s//\1.  See log for more information./'
-e '  b error' -e '}' -e 'd' -e ':error' -e 's/^!\(!! \)\{0,1\}\(.*\)/\2/' -e 'p' -e
'd' 'thesis.log''
/bin/sh: command substitution: line 1: unexpected EOF while looking for matching `''
/bin/sh: command substitution: line 2: syntax error: unexpected end of file
/bin/sh: line 1: /nn$/{ -e : No such file or directory

Reported by amcnabb8 on 2011-01-01 00:16:08

shiblon commented 8 years ago
Thanks for the explanation in comment #29.  I'm not quite sure I could follow it in
the code, but at least I understand it better now.

Reported by amcnabb8 on 2011-01-01 00:17:51

shiblon commented 8 years ago
I see what happened, and I even reproduced it once, but I can't seem to reproduce it
again.

Check your Makefile and search for /\\n\\n\$/

What you'll find is that there are lines that look like this:

/\n\n$/{

They really should look like this

/\n\n$$/{

(Note the double $)

Try running ./build again.  I can't see why this would happen, but it did happen to
me a second ago, inexplicably.

Reported by shiblon on 2011-01-01 00:35:31

shiblon commented 8 years ago
In the Makefile they're all double $.  I think the problem is when it writes these to
an include file.  I tried doing make clean and the problem is still there.

Reported by amcnabb8 on 2011-01-01 00:39:26

shiblon commented 8 years ago
Except that this doesn't seem to be in any of the .make files, so I'm confused.

Reported by amcnabb8 on 2011-01-01 00:41:26

shiblon commented 8 years ago
Okay, perhaps seeing more of the shell debug output would help.  I'm not seeing what I
think is the root cause yet.

Reported by shiblon on 2011-01-01 00:44:39

shiblon commented 8 years ago
I'm not actually seeing any more detail with SHELL_DEBUG.  It kind of looks like the
error is a mismatched quote.  But I'm not seeing any quotes in the diff, so that doesn't
make sense either.

Reported by amcnabb8 on 2011-01-01 00:56:25

shiblon commented 8 years ago
I think it might have to do with the backtick mark "`".

Can you try escaping the ` in your makefile and tell me if that helps?  Alternatively,
you could try removing them completely from colorize-latex-errors.sed (they aren't
terribly critical).

Reported by shiblon on 2011-01-01 01:04:34

shiblon commented 8 years ago
It looks like that worked. Unfortunately, I have to run, so I probably can't do any
more testing today.  But anyway, it looks like getting rid of the two backticks in
colorize-latex-errors made the problem disappear.
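
The thread does not show the exact quoting context the sed script ended up in, so the following is only a sketch of the general hazard, assuming the script reached the shell somewhere a backtick is not protected by single quotes. The input string is made up.

```shell
#!/bin/sh
# Made-up input containing a backtick, as LaTeX error messages do.
input='File `fig not found'

# Inside single quotes the backtick is literal, so sed sees it as-is.
single=$(printf '%s\n' "$input" | sed 's/`//')

# Inside double quotes the backtick must be escaped; left bare, the shell
# would treat it as the start of a command substitution.
double=$(printf '%s\n' "$input" | sed "s/\`//")

echo "$single"
echo "$double"
```

Both commands strip the backtick. Dropping the backslash in the double-quoted form produces the same kind of "unexpected EOF while looking for matching" error shown in the output above.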

Reported by amcnabb8 on 2011-01-01 01:06:56

shiblon commented 8 years ago
Cool.  Fix is in r023d6e7c75e6.

Thanks for your help.  We can tackle this next week if there are other issues.

Reported by shiblon on 2011-01-01 01:10:27

shiblon commented 8 years ago
Looks great.  Thanks!

Reported by amcnabb8 on 2011-01-01 21:21:30

shiblon commented 8 years ago
Issue 99 has been merged into this issue.

Reported by shiblon on 2011-01-04 14:26:42