Closed SergejKr closed 1 month ago
I would expect that the parallelization works without much set up. Is this a bug or the expected behaviour?
This is a bug.
I have no idea why it doesn't work, though. I can reproduce the problem with your example, but I also tried a different example (https://github.com/AudioSceneDescriptionFormat/splines) where parallelization worked fine!
That's really strange ...
I tried to build the docs of the splines package in a fresh virtual environment. I got an error in the compilation of the documentation at about 70%. I guess the versions of the packages I have used do not work with it. During the compilation I did not have the impression that it was running in parallel.
alabaster 0.7.16
asttokens 2.4.1
attrs 23.2.0
Babel 2.15.0
beautifulsoup4 4.12.3
bleach 6.1.0
certifi 2024.7.4
charset-normalizer 3.3.2
colorama 0.4.6
comm 0.2.2
contourpy 1.2.1
cycler 0.12.1
debugpy 1.8.2
decorator 5.1.1
defusedxml 0.7.1
docutils 0.21.2
exceptiongroup 1.2.2
executing 2.0.1
fastjsonschema 2.20.0
fonttools 4.53.1
idna 3.7
imagesize 1.4.1
importlib_metadata 8.2.0
importlib_resources 6.4.0
insipid-sphinx-theme 0.4.2
ipykernel 6.29.5
ipython 8.18.1
jedi 0.19.1
Jinja2 3.1.4
jsonschema 4.23.0
jsonschema-specifications 2023.12.1
jupyter_client 8.6.2
jupyter_core 5.7.2
jupyterlab_pygments 0.3.0
kiwisolver 1.4.5
latexcodec 3.0.0
MarkupSafe 2.1.5
matplotlib 3.9.1
matplotlib-inline 0.1.7
mistune 3.0.2
mpmath 1.3.0
nbclient 0.10.0
nbconvert 7.16.4
nbformat 5.10.4
nbsphinx 0.9.4
nest-asyncio 1.6.0
numpy 2.0.1
packaging 24.1
pandocfilters 1.5.1
parso 0.8.4
pillow 10.4.0
pip 24.0
platformdirs 4.2.2
prompt_toolkit 3.0.47
psutil 6.0.0
pure_eval 0.2.3
pybtex 0.24.0
pybtex-docutils 1.0.3
Pygments 2.18.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
pywin32 306
PyYAML 6.0.1
pyzmq 26.0.3
referencing 0.35.1
requests 2.32.3
rpds-py 0.19.1
scipy 1.13.1
setuptools 70.0.0
six 1.16.0
snowballstemmer 2.2.0
soupsieve 2.5
Sphinx 7.4.7
sphinx-codeautolink 0.15.2
sphinx-last-updated-by-git 0.3.7
sphinxcontrib-applehelp 1.0.8
sphinxcontrib-bibtex 2.6.2
sphinxcontrib-devhelp 1.0.6
sphinxcontrib-htmlhelp 2.0.6
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.8
sphinxcontrib-serializinghtml 1.1.10
splines 0.3.2
stack-data 0.6.3
sympy 1.13.1
tinycss2 1.3.0
tomli 2.0.1
tornado 6.4.1
traitlets 5.14.3
typing_extensions 4.12.2
urllib3 2.2.2
wcwidth 0.2.13
webencodings 0.5.1
wheel 0.43.0
zipp 3.19.2
I got an error in the compilation of the documentation at about 70%.
This should be enough to see whether it is reading (and executing) the notebooks in parallel.
When I run it with -j4, I can clearly see that 4 cores are maxing out.
There is also a change in the terminal output:
$ python -m sphinx doc _build
[...]
reading sources... [ 4%] euclidean/bezier-de-casteljau
$ python -m sphinx doc _build -j4
[...]
reading sources... [ 20%] euclidean/bezier .. euclidean/end-conditions-natural
Note that when reading in parallel, a range of notebooks is shown instead of a single one.
If you want to try a project with fewer dependencies, you can try this: https://github.com/AudioSceneDescriptionFormat/asdf
$ python -m sphinx doc _build -j4
[...]
reading sources... [ 80%] seq-par .. splines
Sphinx parallel build error:
nbsphinx.NotebookError: CellExecutionError in quaternions.ipynb:
------------------
Quaternion.rotate_point((0, 1, 0), q_z.subs(alpha, sp.pi / 2))
[...]
Conveniently, this currently raises an error, which even mentions that it is doing a parallel build!
Hi, I tried it again.
Could you please send me your package versions, so that I can try it with these. Maybe there is some difference in the OS. Do you use windows or linux?
I also tried it again and I found out why your minimal example didn't work in parallel for me: apparently Sphinx only does parallel processing if there are more than 5 source files: https://github.com/sphinx-doc/sphinx/blob/d56cf30ecb2d68651c75b454f0aeae74304285dd/sphinx/builders/__init__.py#L431.
I have reported this surprising behavior in https://github.com/sphinx-doc/sphinx/pull/12796.
I guess your example still doesn't run in parallel when you add two more notebooks?
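The condition described above can be sketched roughly as follows. This is only an illustration, not Sphinx's actual code: the function name `would_read_in_parallel` and the constant name are made up here, and the threshold of 5 is taken from the linked source line.

```python
# Assumption: threshold as described in the linked Sphinx source line;
# the constant and function names are hypothetical.
MIN_DOCS_FOR_PARALLEL = 5


def would_read_in_parallel(num_source_files, jobs):
    """Return True if more than one job is requested and there are
    more than MIN_DOCS_FOR_PARALLEL source files to read."""
    return jobs > 1 and num_source_files > MIN_DOCS_FOR_PARALLEL


# Original minimal example: index.rst plus three notebooks = 4 source files.
print(would_read_in_parallel(4, jobs=4))
# The same example with two more notebooks = 6 source files.
print(would_read_in_parallel(6, jobs=4))
```

With four source files the check is false even with -j4, which would explain why the minimal example ran serially; with six files it becomes true.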
For me, it took 17 seconds.
I would like to check if this really is related to nbsphinx ... did you try your example without nbsphinx?
I tried it with this example setup:
conf.py:

import time

def source_read(app, docname, content):
    time.sleep(10)

def setup(app):
    app.connect('source-read', source_read)

index.rst:

Test
====

.. toctree::

   test1
   test2
   test3
   test4
   test5

test1.rst to test5.rst:

Test Page
=========
When running this, I get:
$ time python -m sphinx . _build -j6
Sphinx v8.1.0+/f1078bdfa [...]
[...]
real 0m10,993s
user 0m1,660s
sys 0m0,192s
Does that work for you?
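To make this test easier to repeat, the files above could be generated with a small script. This is just a convenience sketch: the function name `create_test_project` and the directory name `paralleltest` are made up here, and the file contents are the ones from the setup above.

```python
import tempfile
from pathlib import Path

# conf.py from the example setup: every source file takes 10 seconds to read.
CONF_PY = """\
import time


def source_read(app, docname, content):
    time.sleep(10)


def setup(app):
    app.connect('source-read', source_read)
"""

PAGE = "Test Page\n=========\n"


def create_test_project(root, num_pages=5):
    """Write conf.py, index.rst and test1.rst ... testN.rst into root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    names = [f'test{i}' for i in range(1, num_pages + 1)]
    toctree = ''.join(f'   {name}\n' for name in names)
    index = f'Test\n====\n\n.. toctree::\n\n{toctree}'
    (root / 'conf.py').write_text(CONF_PY)
    (root / 'index.rst').write_text(index)
    for name in names:
        (root / f'{name}.rst').write_text(PAGE)
    return root


# Hypothetical location: a fresh temporary directory.
project = create_test_project(Path(tempfile.mkdtemp()) / 'paralleltest')
print(project)
```

Running `python -m sphinx . _build -j6` inside the printed directory should then reproduce the measurement. Six source files at 10 seconds each would need roughly 60 seconds if they were read one after another, so a wall-clock time near 11 seconds indicates parallel reading.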
Could you please send me your package versions, so that I can try it with these.
I was using the latest Git versions of Sphinx and SymPy, I guess.
Do you use windows or linux?
Linux
Hi, I tested your example without nbsphinx, and in fact no multiprocessing is active. I looked through GitHub and found an old issue, https://github.com/sphinx-doc/sphinx/issues/8296, stating that Sphinx does not run in parallel on Windows. Further searching revealed that the parallel execution of Sphinx still only works on systems that support "fork". This can be checked in the Sphinx source code in "sphinx/sphinx/util/parallel.py" (https://github.com/sphinx-doc/sphinx/tree/v8.0.2/sphinx/util):
# our parallel functionality only works for the forking Process
parallel_available = multiprocessing and os.name == 'posix'
precv, psend = multiprocessing.Pipe(False)
context: Any = multiprocessing.get_context('fork')
I already feared that this would be OS-dependent. Interestingly, there is no remark in the documentation of Sphinx for that. This issue can be closed because it is not related to nbsphinx. Thanks for your time.
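For anyone who wants to check their own system, here is a small sketch that mirrors the condition from the quoted lines using only the standard library. The function name `fork_parallel_possible` is made up here, and this is not Sphinx's own code.

```python
import multiprocessing
import os


def fork_parallel_possible():
    """Return True on a POSIX system that offers the 'fork' start method,
    mirroring the condition in the quoted Sphinx source lines."""
    return (os.name == 'posix'
            and 'fork' in multiprocessing.get_all_start_methods())


print('os.name:', os.name)
print('start methods:', multiprocessing.get_all_start_methods())
print('fork-based parallel build possible:', fork_parallel_possible())
```

On Windows, os.name is 'nt' and "fork" is not among the available start methods, so this prints False there.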
Thanks for tracking this down!
Interestingly, there is no remark in the documentation of Sphinx for that.
Would you like to create a PR at https://github.com/sphinx-doc/sphinx/pulls for this? I think this would be helpful.
Hello,
I have noticed that the parallel execution of Jupyter notebooks does not work (for me). See below or use the zip (source.zip) for a minimal example of the problem. Executing
sphinx-build source build -j4 -b html
does not show any performance increase compared to
sphinx-build source build -j1 -b html
Both take about 38 seconds on my machine. Each notebook waits for 10 seconds, so the three notebooks are evidently executed one after another and not in parallel. In my main project, .rst files are built in parallel as expected, but the notebooks always slow down the build process. From the documentation https://nbsphinx.readthedocs.io/en/0.9.3/usage.html#Running-Sphinx I would expect that the parallelization works without much set up. Is this a bug or the expected behaviour?
I am using:
This is the complete list in the newly set up virtual env after installing the above packages:
Setup of the minimal example
source
-- conf.py
-- index.rst
-- test1.ipynb
-- test2.ipynb
-- test3.ipynb
The conf.py file:
The index.rst file:
The test1.ipynb, etc. files have two cells, one markdown, one python cell: