sagemath / sage

Docbuilder hangs if latex fails #14626

Closed jdemeyer closed 11 years ago

jdemeyer commented 11 years ago

When building the PDF documentation, if there is a problem while running latex, then the docbuilder just hangs forever after building all documentation. There is no obvious clue what the problem is apart from a message like the following (example from #9107) in the log file:

! LaTeX Error: Too deeply nested.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              

l.27819 \begin{Verbatim}[commandchars=\\\{\}]

? 
! Emergency stop.
 ...                                              

l.27819 \begin{Verbatim}[commandchars=\\\{\}]

!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on categories.log.
make[1]: *** [categories.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/categories'
Exception in thread Thread-6:
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 376, in _handle_results
    task = get()
TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())

This hang is http://bugs.python.org/issue9400

Also: the docbuilder should use $MAKE instead of make.

Upstream: Reported upstream. Developers acknowledge bug.

CC: @jhpalmieri @nexttime

Component: documentation

Author: Jeroen Demeyer

Reviewer: John Palmieri

Merged: sage-5.10.beta5

Issue created by migration from https://trac.sagemath.org/ticket/14626

jdemeyer commented 11 years ago

Description changed:

--- 
+++ 
@@ -22,3 +22,5 @@
 Transcript written on categories.log.
 )make[1]: *** [categories.pdf] Error 1

+
+Also: the docbuilder should use $MAKE instead of make.

jdemeyer commented 11 years ago

Description changed:

--- 
+++ 
@@ -1,4 +1,4 @@
-When building the PDF documentation, if there is problem while running `latex`, then the docbuilder just hangs forever. There is no obvious clue what the problem is apart from a message like the following (example from #9107) in the log file:
+When building the PDF documentation, if there is problem while running `latex`, then the docbuilder just hangs forever *after building all documentation*. There is no obvious clue what the problem is apart from a message like the following (example from #9107) in the log file:

 ! LaTeX Error: Too deeply nested.
@@ -10,9 +10,6 @@
 l.27819 \begin{Verbatim}[commandchars=\\\{\}]

 ?
-Implicit mode ON; LaTeX internals redefined
-(/usr/share/texmf-texlive/tex/latex/ltxmisc/url.sty
-(/usr/share/texmf-texlive/tex/latex/base/t1enc.def)
 ! Emergency stop.
  ...

@@ -20,7 +17,19 @@

 !  ==> Fatal error occurred, no output PDF file produced!
 Transcript written on categories.log.
-)make[1]: *** [categories.pdf] Error 1
+make[1]: *** [categories.pdf] Error 1
+make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/categories'
+Exception in thread Thread-6:
+Traceback (most recent call last):

-Also: the docbuilder should use $MAKE instead of make.
+This looks like a bug in the Python subprocess module.
+
+Also: the docbuilder should use $MAKE instead of make.

jdemeyer commented 11 years ago

Description changed:

--- 
+++ 
@@ -30,6 +30,6 @@
 TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())

-This looks like a bug in the Python subprocess module.
+This looks like a bug in the Python multiprocessing module.

Also: the docbuilder should use $MAKE instead of make.

jdemeyer commented 11 years ago
comment:4

I have an idea, patch possibly coming up...

jdemeyer commented 11 years ago
comment:6

Attachment: 14626_workaround.patch.gz

jdemeyer commented 11 years ago

Author: Jeroen Demeyer

jhpalmieri commented 11 years ago
comment:7

The patch makes a lot of sense at first glance, but I should test it to make sure. I'll try to get to it soon.

jhpalmieri commented 11 years ago
comment:9

With the patch and with bad LaTeX code, I see the hang occur earlier (soon after trying to build the bad document), but it still hangs.

jdemeyer commented 11 years ago
comment:10

John, it seems to work for me, so could you please send me the docpdf.log file?

jdemeyer commented 11 years ago
comment:11

When I apply this patch together with the patches from #9107 (which cause a LaTeX failure), I get

! LaTeX Error: Too deeply nested.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...

l.27819 \begin{Verbatim}[commandchars=\\\{\}]

?
! Emergency stop.
 ...

l.27819 \begin{Verbatim}[commandchars=\\\{\}]

!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on categories.log.
make[1]: *** [categories.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/categories'
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
    getattr(get_builder(name), type)()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 273, in _wrapper
    getattr(get_builder(document), name)(*args, **kwds)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/categories
make: *** [doc-pdf] Error 1

after which I get back into the shell as expected.
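
The traceback above suggests that the worker now raises a plain `RuntimeError` rather than letting `subprocess.CalledProcessError` propagate to the parent process. The patch itself is not reproduced in this thread, so the following is only a sketch of that idea, not the actual change. The helper name `run_make` and its parameters are assumptions made for illustration. It also takes the command from `$MAKE`, as the description requests, and splits it so that values such as `make -j12` work.

```python
import os
import shlex
import subprocess


def run_make(directory, target="all-pdf"):
    """Run `$MAKE target` in directory and raise RuntimeError on failure.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # $MAKE may carry options (for example "make -j12"), so split it
    # into an argument list instead of treating it as one program name.
    command = shlex.split(os.environ.get("MAKE", "make")) + [target]

    # subprocess.call returns the exit status instead of raising
    # CalledProcessError, so no hard-to-unpickle exception is created.
    status = subprocess.call(command, cwd=directory)
    if status != 0:
        # An exception built from a single string pickles cleanly, so
        # multiprocessing can hand it back to the parent process.
        raise RuntimeError("failed to run $MAKE %s in %s" % (target, directory))
```

Called from a pool worker, a failure then surfaces in the parent as the `RuntimeError` seen in the traceback instead of leaving the result-handler thread dead.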

jdemeyer commented 11 years ago
comment:12

The cause of the crash seems to be a combination of:

  1. subprocess.CalledProcessError instances cannot be unpickled properly.
  2. The multiprocessing module uses pickles to transfer exceptions from the child process to the master process and apparently doesn't gracefully handle unpickling errors.
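
The two points above can be reproduced in isolation. The sketch below is illustrative only: `BuildFailed` is a hypothetical stand-in class, not part of Sage or the standard library. It mimics the problematic pattern (a constructor that requires more arguments than end up in `args`), so the pickle round trip that `multiprocessing` performs fails in the same way as in the traceback. A stand-in is used because `subprocess.CalledProcessError` itself may pickle correctly on newer Python versions.

```python
import pickle


class BuildFailed(Exception):
    """Hypothetical exception whose constructor needs two arguments."""

    def __init__(self, returncode, cmd):
        # Only the formatted message reaches Exception.__init__, so
        # self.args holds one item although __init__ requires two.
        super().__init__("command %r returned %d" % (cmd, returncode))
        self.returncode = returncode
        self.cmd = cmd


def survives_pickle(exc):
    """Return True if exc can be pickled and unpickled again."""
    try:
        pickle.loads(pickle.dumps(exc))
    except TypeError:
        # Unpickling re-creates the object as type(exc)(*exc.args),
        # which fails when args does not match the constructor.
        return False
    return True


bad = BuildFailed(1, "make all-pdf")
good = RuntimeError("failed to run $MAKE all-pdf")
print(survives_pickle(bad), survives_pickle(good))
```

The first check fails with a `TypeError` on unpickling, while the plain `RuntimeError` with a single message argument survives, which is why converting the error in the child process avoids the problem.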

jdemeyer commented 11 years ago

Description changed:

--- 
+++ 
@@ -30,6 +30,6 @@
 TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())

-This looks like a bug in the Python multiprocessing module.
+This hang is http://bugs.python.org/issue9400

Also: the docbuilder should use $MAKE instead of make.

jdemeyer commented 11 years ago

Upstream: Reported upstream. Developers acknowledge bug.

jhpalmieri commented 11 years ago
comment:14

I mistakenly thought that I wasn't getting an error from the patches at #9107, so I made this change and then built the documentation:

diff --git a/sage/algebras/steenrod/steenrod_algebra.py b/sage/algebras/steenrod/steenrod_algebra.py
--- a/sage/algebras/steenrod/steenrod_algebra.py
+++ b/sage/algebras/steenrod/steenrod_algebra.py
@@ -10,6 +10,8 @@
   the Steenrod algebra using CombinatorialFreeModule; improved the
   test suite.

+Broken: `\aaaaaa`
+
 This module defines the mod `p` Steenrod algebra `\mathcal{A}_p`, some
 of its properties, and ways to define elements of it.

With the patch here, it hangs after trying to build reference/algebras. I agree that with just the patches at #9107, the hang is no longer present: once reference/categories fails, I get sent back to the shell.

jdemeyer commented 11 years ago
comment:15

John: your change still works for me:

! Undefined control sequence.
<recently read> \aaaaaa

l.4077 Broken: $\aaaaaa
                       $
?
! Emergency stop.
<recently read> \aaaaaa

l.4077 Broken: $\aaaaaa
                       $
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
make[1]: *** [algebras.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/algebras'
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
    getattr(get_builder(name), type)()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 273, in _wrapper
    getattr(get_builder(document), name)(*args, **kwds)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras
make: *** [doc-pdf] Error 1

I correctly get a shell prompt after this.

Please attach your docpdf.log so that I can try to find out what is happening.

jhpalmieri commented 11 years ago
comment:16

Sorry, once again I didn't communicate well enough. I've been running ./sage --docbuild reference pdf, which still exhibits the hang. I see now that running make doc-pdf works as you say (so I'm not going to bother attaching docpdf.log).

jdemeyer commented 11 years ago

Reviewer: John Palmieri

jdemeyer commented 11 years ago
comment:17

Also ./sage --docbuild reference pdf works for me...

! Undefined control sequence.
<recently read> \aaaaaa 

l.4066 Broken: $\aaaaaa
                       $
? 
! Emergency stop.
<recently read> \aaaaaa 

l.4066 Broken: $\aaaaaa
                       $
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
make: *** [algebras.pdf] Error 1
Traceback (most recent call last):
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
    getattr(get_builder(name), type)()
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras

jhpalmieri commented 11 years ago
comment:18

I've seen this repeatably while running ./sage --docbuild reference pdf on two different OS X machines (with two cores, with MAKE='make -j2'). Also, I just tried applying the patches at #9107 and the one here (without my change to steenrod_algebra.py) on sage.math (with MAKE='make -j12'), and it hangs after failing to compile categories.tex. (It finishes the compilations in progress, but then hangs).

jdemeyer commented 11 years ago
comment:19

John, I still cannot reproduce your problems. Can you describe the exact steps that you took?

I am doing the following on sage.math:

  1. Extract a Sage 5.9 binary:
jdemeyer@sage:/release$ tar xzf /home/release/sage-5.9/sage-5.9-boxen-x86_64-Linux.tar.gz
jdemeyer@sage:/release$ cd sage-5.9-boxen-x86_64-Linux
  2. Apply the patch:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ./sage --hg -R devel/sage qimport -P https://github.com/sagemath/sage-prod/files/10657829/14626_workaround.patch.gz
adding 14626_workaround.patch to series file
applying 14626_workaround.patch
now at: 14626_workaround.patch
  3. Break LaTeX:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ( cd devel/sage && patch -p1 )
diff --git a/sage/algebras/steenrod/steenrod_algebra.py b/sage/algebras/steenrod/steenrod_algebra.py

Index: sage/algebras/steenrod/steenrod_algebra.py
===================================================================
--- a/sage/algebras/steenrod/steenrod_algebra.py
+++ b/sage/algebras/steenrod/steenrod_algebra.py
@@ -10,5 +10,7 @@
   the Steenrod algebra using CombinatorialFreeModule; improved the
   test suite.

+Broken: `\aaaaaa`
+
 This module defines the mod `p` Steenrod algebra `\mathcal{A}_p`, some
 of its properties, and ways to define elements of it.
patching file sage/algebras/steenrod/steenrod_algebra.py
Hunk #1 succeeded at 10 with fuzz 1.
  4. Rebuild Sage:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ./sage -b

[...]

  5. Build the PDF reference manual using 12 threads:
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ env MAKE="make -j12" ./sage --docbuild reference pdf 2>&1 |tee docpdf.log

[...]

! Undefined control sequence.
<recently read> \aaaaaa 

l.3809 Broken: $\aaaaaa
                       $
? 
! Emergency stop.
<recently read> \aaaaaa 

l.3809 Broken: $\aaaaaa
                       $
!  ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
]
Adding blank page after the table of contents.
pdfTeX warning (ext4): destination with the same identifier (name{page.i}) has 
been already used, duplicate ignored
<to be read again> 
                   \relax 
l.129 \tableofcontents
                       [1 [28]]pdfTeX warning (ext4): destination with the same iden
tifier (name{page.ii}) has been already used, duplicate ignored
<to be read again> 
                   \relax 
l.129 \tableofcontents
                       [2make: *** [algebras.pdf] Error 1

[...]

Underfull \hbox (badness 10000) in paragraph at lines 2769--2772
[]\T1/ptm/m/n/10 WalshCode - a bi-nary lin-ear $\OT1/cmr/m/n/10 [2[]\OML/cmm/m/
it/10 ; m; \OT1/cmr/m/n/10 2[]]$ \T1/ptm/m/n/10 code re-lated to Hadamard ma-tr
i-ces.
[30][constants] reading sources... [100%] sage/symbolic/constants_c
 [7] [31Traceback (most recent call last):
  File "/release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
]    getattr(get_builder(name), type)()
  File "/release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
    pool.map_async(build_ref_doc, L, 1).get(99999)
  File "/release/sage-5.9-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 528, in get
    raise self._value
RuntimeError: failed to run $MAKE all-pdf in /release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras

[...]

Output written on arithgroup.pdf (85 pages, 458174 bytes).
Transcript written on arithgroup.log.
  6. I get back to the shell as expected.

jhpalmieri commented 11 years ago
comment:20

Okay, sorry, you're right. It looked to me as though it was hanging, but that's because the shell prompt was buried in output from the still-running processes. I stupidly didn't think to hit RET to see if I got a shell prompt.

At some point we might want to provide an error message at the very end, which won't get lost amidst the output from parallel processes, but that can go on another ticket.

jdemeyer commented 11 years ago
comment:21

Replying to @jhpalmieri:

> Okay, sorry, you're right. It looked to me as though it was hanging, but that's because the shell prompt was buried in output from the still-running processes.

Do you remember the shell command that you ran (in particular, did you use any unusual redirections or piping)? Otherwise I don't see how what you describe can happen.

jdemeyer commented 11 years ago

Merged: sage-5.10.beta5

jhpalmieri commented 11 years ago
comment:22

I just logged into sage.math and did

$ cd /scratch/palmieri/sage-5.10.beta4
$ ./sage --docbuild reference pdf

Then I see, in the middle of a lot of output,

[32 [20 <pairing.png, id=620, 416.9979pt x 217.5327pt>
<use pairing.png>]] [68] <use pairing.png> [69 <./pairing.png [33] [21] [34]
Underfull \hbox (badness 10000) in paragraph at lines 2826--2827

[22][35]palmieri@boxen:sage-5.10.beta4$ 
Underfull \hbox (badness 10000) in paragraph at lines 2950--2951

 [23][36] [24]>] [70]
Chapter 11.

Here `palmieri@boxen:sage-5.10.beta4$` is my shell prompt. At the end of the output:

Output written on homology.pdf (117 pages, 651759 bytes).
Transcript written on homology.log.

but no shell prompt because it was already printed earlier. With make doc-pdf, I see a proper error message at the end.

jdemeyer commented 11 years ago
comment:23

John: probably the "output after shell prompt" problem is caused by parallel docbuilding: it seems that, if one thread fails, the docbuilder master process exits and the other threads simply continue working...

Not really a bug, just a peculiarity of multiprocessing.Pool I guess.
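
One possible way to address both this peculiarity and the earlier wish for an error message at the very end is sketched below. It is not taken from the Sage docbuilder: `build_all` and its parameters are hypothetical names. The assumption is that, depending on the Python version, `get()` may raise as soon as one task fails while other workers are still running, so the sketch stops the pool first and only then prints the error, which keeps the message as the last line of output.

```python
import multiprocessing
import sys


def build_all(worker, items, pool_factory=multiprocessing.Pool, timeout=99999):
    """Run worker over items in a pool; return 0 on success, 1 on failure.

    Hypothetical helper: names and signature are illustrative only.
    """
    pool = pool_factory()
    try:
        # Same call shape as in the traceback: chunksize 1 and a long
        # timeout passed to get().
        pool.map_async(worker, items, 1).get(timeout)
    except Exception as exc:
        # Stop the remaining workers so they no longer write output
        # after the parent has reported the failure.
        pool.terminate()
        pool.join()
        # Printed after the pool has stopped, so it ends the output.
        sys.stderr.write("Error building the documentation: %s\n" % exc)
        return 1
    pool.close()
    pool.join()
    return 0
```

A caller could pass the return value to `sys.exit`, so that the shell prompt appears only after the final message.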