Closed by jdemeyer 11 years ago
Description changed:
---
+++
@@ -22,3 +22,5 @@
Transcript written on categories.log.
)make[1]: *** [categories.pdf] Error 1
+
+Also: the docbuilder should use $MAKE instead of make.
Description changed:
---
+++
@@ -1,4 +1,4 @@
-When building the PDF documentation, if there is a problem while running `latex`, then the docbuilder just hangs forever. There is no obvious clue what the problem is apart from a message like the following (example from #9107) in the log file:
+When building the PDF documentation, if there is a problem while running `latex`, then the docbuilder just hangs forever *after building all documentation*. There is no obvious clue what the problem is apart from a message like the following (example from #9107) in the log file:
! LaTeX Error: Too deeply nested.
@@ -10,9 +10,6 @@
l.27819 \begin{Verbatim}[commandchars=\\\{\}]
?
-Implicit mode ON; LaTeX internals redefined
-(/usr/share/texmf-texlive/tex/latex/ltxmisc/url.sty
-(/usr/share/texmf-texlive/tex/latex/base/t1enc.def)
! Emergency stop.
...
@@ -20,7 +17,19 @@
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on categories.log.
-)make[1]: *** [categories.pdf] Error 1
+make[1]: *** [categories.pdf] Error 1
+make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/categories'
+Exception in thread Thread-6:
+Traceback (most recent call last):
-Also: the docbuilder should use $MAKE instead of make.
+This looks like a bug in the Python subprocess module.
+
+Also: the docbuilder should use $MAKE instead of make.
Description changed:
---
+++
@@ -30,6 +30,6 @@
TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())
-This looks like a bug in the Python subprocess module.
+This looks like a bug in the Python multiprocessing module.
Also: the docbuilder should use $MAKE instead of make.
I have an idea, patch possibly coming up...
Attachment: 14626_workaround.patch.gz
Author: Jeroen Demeyer
The patch makes a lot of sense at first glance, but I should test it to make sure. I'll try to get to it soon.
With the patch and with bad LaTeX code, I see the hang occur earlier (soon after trying to build the bad document), but it still hangs.
John, it seems to work for me, so could you please send me the `docpdf.log` file?
When I apply this patch here and the patches from #9107 causing a LaTeX failure, then I get
! LaTeX Error: Too deeply nested.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.27819 \begin{Verbatim}[commandchars=\\\{\}]
?
! Emergency stop.
...
l.27819 \begin{Verbatim}[commandchars=\\\{\}]
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on categories.log.
make[1]: *** [categories.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/categories'
Traceback (most recent call last):
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
getattr(get_builder(name), type)()
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 273, in _wrapper
getattr(get_builder(document), name)(*args, **kwds)
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
pool.map_async(build_ref_doc, L, 1).get(99999)
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/categories
make: *** [doc-pdf] Error 1
after which I get back into the shell as expected.
The cause of the crash seems to be a combination of:
* `subprocess.CalledProcessError` instances cannot be unpickled properly.
* The `multiprocessing` module uses pickles to transfer exceptions from the child process to the master process and apparently doesn't gracefully handle unpickling errors.
Description changed:
---
+++
@@ -30,6 +30,6 @@
TypeError: ('__init__() takes at least 3 arguments (1 given)', <class 'subprocess.CalledProcessError'>, ())
-This looks like a bug in the Python multiprocessing module.
+This hang is http://bugs.python.org/issue9400
Also: the docbuilder should use $MAKE instead of make.
Upstream: Reported upstream. Developers acknowledge bug.
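The unpickling failure described above can be reproduced in isolation. The sketch below is an illustration only, not code from the patch: `FragileError` is a hypothetical stand-in for an exception whose constructor requires arguments that never end up in `args`, which is what the `TypeError` above suggests happened with `subprocess.CalledProcessError` in the Python shipped with Sage at the time (other Python versions may behave differently).

```python
import pickle


class FragileError(Exception):
    """Hypothetical exception: two required arguments, only a message kept in args."""

    def __init__(self, returncode, cmd):
        # Only the message reaches Exception.__init__, so self.args has one item.
        super().__init__("command failed")
        self.returncode = returncode
        self.cmd = cmd


def survives_pickling(exc):
    """Return True if exc can be pickled and then unpickled again."""
    try:
        pickle.loads(pickle.dumps(exc))
    except TypeError:
        # Unpickling calls FragileError("command failed"), so 'cmd' is missing.
        return False
    return True


if __name__ == "__main__":
    print(survives_pickling(FragileError(1, "make all-pdf")))
    print(survives_pickling(RuntimeError("failed to run $MAKE all-pdf")))
```

A plain `RuntimeError` that carries only a string, like the one in the traceback above, round-trips without trouble, which is presumably why raising it in the worker process avoids the hang.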
I mistakenly thought that I wasn't getting an error from the patches at #9107, so I made this change and then built the documentation:
diff --git a/sage/algebras/steenrod/steenrod_algebra.py b/sage/algebras/steenrod/steenrod_algebra.py
--- a/sage/algebras/steenrod/steenrod_algebra.py
+++ b/sage/algebras/steenrod/steenrod_algebra.py
@@ -10,6 +10,8 @@
the Steenrod algebra using CombinatorialFreeModule; improved the
test suite.
+Broken: `\aaaaaa`
+
This module defines the mod `p` Steenrod algebra `\mathcal{A}_p`, some
of its properties, and ways to define elements of it.
With the patch here, it hangs after trying to build reference/algebras. I agree that with just the patches at #9107, the hang is no longer present: once reference/categories fails, I get sent back to the shell.
John: your change still works for me:
! Undefined control sequence.
<recently read> \aaaaaa
l.4077 Broken: $\aaaaaa
$
?
! Emergency stop.
<recently read> \aaaaaa
l.4077 Broken: $\aaaaaa
$
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
make[1]: *** [algebras.pdf] Error 1
make[1]: Leaving directory `/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage-main/doc/output/latex/en/reference/algebras'
Traceback (most recent call last):
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
getattr(get_builder(name), type)()
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 273, in _wrapper
getattr(get_builder(document), name)(*args, **kwds)
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
pool.map_async(build_ref_doc, L, 1).get(99999)
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras
make: *** [doc-pdf] Error 1
I correctly get a shell prompt after this.
Please attach your `docpdf.log` so that I can try to find out what is happening.
Sorry, once again I didn't communicate well enough. I've been running `./sage --docbuild reference pdf`, which still exhibits the hang. I see now that running `make doc-pdf` works as you say (so I'm not going to bother attaching docpdf.log).
Reviewer: John Palmieri
Also `./sage --docbuild reference pdf` works for me...
! Undefined control sequence.
<recently read> \aaaaaa
l.4066 Broken: $\aaaaaa
$
?
! Emergency stop.
<recently read> \aaaaaa
l.4066 Broken: $\aaaaaa
$
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
make: *** [algebras.pdf] Error 1
Traceback (most recent call last):
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
getattr(get_builder(name), type)()
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
pool.map_async(build_ref_doc, L, 1).get(99999)
File "/mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 554, in get
raise self._value
RuntimeError: failed to run $MAKE all-pdf in /mazur/release/sage-5.10.beta4-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras
I've seen this repeatably while running `./sage --docbuild reference pdf` on two different OS X machines (with two cores, with `MAKE='make -j2'`). Also, I just tried applying the patches at #9107 and the one here (without my change to steenrod_algebra.py) on sage.math (with `MAKE='make -j12'`), and it hangs after failing to compile categories.tex. (It finishes the compilations in progress, but then hangs.)
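Since `MAKE` can carry flags such as `-j12`, a docbuilder that honours `$MAKE` has to split the value the way a shell would before running it. The following is a minimal sketch under that assumption; `make_command` and `run_make` are hypothetical names, not functions from `builder.py`, and the error message merely mirrors the one in the tracebacks above.

```python
import os
import shlex
import subprocess


def make_command(target, environ=None):
    """Build the argument list for running `target` with the user's $MAKE."""
    environ = os.environ if environ is None else environ
    # MAKE may carry flags (e.g. "make -j12"), so split it like a shell would.
    return shlex.split(environ.get("MAKE", "make")) + [target]


def run_make(target, directory, environ=None):
    """Run $MAKE target in directory; raise a plain RuntimeError on failure."""
    if subprocess.call(make_command(target, environ), cwd=directory) != 0:
        raise RuntimeError("failed to run $MAKE %s in %s" % (target, directory))
```

Raising a `RuntimeError` with a single string argument, rather than letting a `subprocess.CalledProcessError` escape, keeps the exception safe to pass between processes.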
John, I still cannot reproduce your problems. Could you describe the exact steps you took?
I am doing the following on `sage.math`:
jdemeyer@sage:/release$ tar xzf /home/release/sage-5.9/sage-5.9-boxen-x86_64-Linux.tar.gz
jdemeyer@sage:/release$ cd sage-5.9-boxen-x86_64-Linux
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ./sage --hg -R devel/sage qimport -P https://github.com/sagemath/sage-prod/files/10657829/14626_workaround.patch.gz
adding 14626_workaround.patch to series file
applying 14626_workaround.patch
now at: 14626_workaround.patch
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ( cd devel/sage && patch -p1 )
diff --git a/sage/algebras/steenrod/steenrod_algebra.py b/sage/algebras/steenrod/steenrod_algebra.py
Index: sage/algebras/steenrod/steenrod_algebra.py
===================================================================
--- a/sage/algebras/steenrod/steenrod_algebra.py
+++ b/sage/algebras/steenrod/steenrod_algebra.py
@@ -10,5 +10,7 @@
the Steenrod algebra using CombinatorialFreeModule; improved the
test suite.
+Broken: `\aaaaaa`
+
This module defines the mod `p` Steenrod algebra `\mathcal{A}_p`, some
of its properties, and ways to define elements of it.
patching file sage/algebras/steenrod/steenrod_algebra.py
Hunk #1 succeeded at 10 with fuzz 1.
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ ./sage -b
[...]
jdemeyer@sage:/release/sage-5.9-boxen-x86_64-Linux$ env MAKE="make -j12" ./sage --docbuild reference pdf 2>&1 |tee docpdf.log
[...]
! Undefined control sequence.
<recently read> \aaaaaa
l.3809 Broken: $\aaaaaa
$
?
! Emergency stop.
<recently read> \aaaaaa
l.3809 Broken: $\aaaaaa
$
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on algebras.log.
]
Adding blank page after the table of contents.
pdfTeX warning (ext4): destination with the same identifier (name{page.i}) has
been already used, duplicate ignored
<to be read again>
\relax
l.129 \tableofcontents
[1 [28]]pdfTeX warning (ext4): destination with the same iden
tifier (name{page.ii}) has been already used, duplicate ignored
<to be read again>
\relax
l.129 \tableofcontents
[2make: *** [algebras.pdf] Error 1
[...]
Underfull \hbox (badness 10000) in paragraph at lines 2769--2772
[]\T1/ptm/m/n/10 WalshCode - a bi-nary lin-ear $\OT1/cmr/m/n/10 [2[]\OML/cmm/m/
it/10 ; m; \OT1/cmr/m/n/10 2[]]$ \T1/ptm/m/n/10 code re-lated to Hadamard ma-tr
i-ces.
[30][constants] reading sources... [100%] sage/symbolic/constants_c
[7] [31Traceback (most recent call last):
File "/release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 1452, in <module>
] getattr(get_builder(name), type)()
File "/release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/common/builder.py", line 472, in _wrapper
pool.map_async(build_ref_doc, L, 1).get(99999)
File "/release/sage-5.9-boxen-x86_64-Linux/local/lib/python/multiprocessing/pool.py", line 528, in get
raise self._value
RuntimeError: failed to run $MAKE all-pdf in /release/sage-5.9-boxen-x86_64-Linux/devel/sage/doc/output/latex/en/reference/algebras
[...]
Output written on arithgroup.pdf (85 pages, 458174 bytes).
Transcript written on arithgroup.log.
Okay, sorry, you're right. It looked to me as though it was hanging, but that's because the shell prompt was buried in output from the still-running processes. I stupidly didn't think to hit RET to see if I got a shell prompt.
At some point we might want to provide an error message at the very end, which won't get lost amidst the output from parallel processes, but that can go on another ticket.
Replying to @jhpalmieri:
Okay, sorry, you're right. It looked to me as though it was hanging, but that's because the shell prompt was buried in output from the still-running processes.
Do you remember the shell command that you ran (in particular, did you use any unusual redirections or piping)? Otherwise I don't see how what you describe can happen.
Merged: sage-5.10.beta5
I just logged into sage.math and did
$ cd /scratch/palmieri/sage-5.10.beta4
$ ./sage --docbuild reference pdf
Then I see, in the middle of a lot of output,
[32 [20 <pairing.png, id=620, 416.9979pt x 217.5327pt>
<use pairing.png>]] [68] <use pairing.png> [69 <./pairing.png [33] [21] [34]
Underfull \hbox (badness 10000) in paragraph at lines 2826--2827
[22][35]palmieri@boxen:sage-5.10.beta4$
Underfull \hbox (badness 10000) in paragraph at lines 2950--2951
[23][36] [24]>] [70]
Chapter 11.
`palmieri@boxen:sage-5.10.beta4$` is my shell prompt. At the end of the output:
Output written on homology.pdf (117 pages, 651759 bytes).
Transcript written on homology.log.
but no shell prompt because it was already printed earlier. With `make doc-pdf`, I see a proper error message at the end.
John: the "output after shell prompt" problem is probably caused by parallel docbuilding: it seems that, if one worker fails, the docbuilder master process exits and the other workers simply continue working...
Not really a bug, just a peculiarity of `multiprocessing.Pool`, I guess.
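One way to get the end-of-run error message suggested earlier in this thread would be to stop the remaining workers before reporting the failure. The sketch below is only an illustration of that idea, not the docbuilder's code: `build_one` and `build_all` are hypothetical names, and the `"fork"` start method is an assumption that limits it to POSIX systems.

```python
import multiprocessing
import sys


def build_one(name):
    """Hypothetical stand-in for building one document; 'algebras' always fails."""
    if name == "algebras":
        raise RuntimeError("failed to build %s" % name)
    return name


def build_all(names, processes=2):
    """Build all documents; return an error message, or None on success."""
    # Assumption: a POSIX system where the "fork" start method is available.
    pool = multiprocessing.get_context("fork").Pool(processes)
    try:
        pool.map_async(build_one, names, 1).get(99999)
    except RuntimeError as exc:
        # Stop the remaining workers so nothing prints after the final message.
        pool.terminate()
        pool.join()
        return "Error building the documentation: %s" % exc
    pool.close()
    pool.join()
    return None


if __name__ == "__main__":
    message = build_all(["algebras", "categories", "homology"])
    if message is not None:
        sys.stderr.write(message + "\n")
```

Because the workers are terminated and joined before the message is written, it is the last thing on the terminal instead of being buried in output from builds that are still running.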
When building the PDF documentation, if there is a problem while running `latex`, then the docbuilder just hangs forever after building all documentation. There is no obvious clue what the problem is apart from a message like the following (example from #9107) in the log file:
This hang is http://bugs.python.org/issue9400
Also: the docbuilder should use $MAKE instead of make.
Upstream: Reported upstream. Developers acknowledge bug.
CC: @jhpalmieri @nexttime
Component: documentation
Author: Jeroen Demeyer
Reviewer: John Palmieri
Merged: sage-5.10.beta5
Issue created by migration from https://trac.sagemath.org/ticket/14626