python / docsbuild-scripts

scripts for building documentation on docs.python.org
60 stars 55 forks source link

Full build with PDF is taking more than 24h #169

Open JulienPalard opened 9 months ago

JulienPalard commented 9 months ago

Today we're building 6 versions for 13 languages (that's 78 builds).

I'm doing some tests on my machine to get an idea:

So a complete rebuild should take ~27h (on my machine, the server may have a different CPU).

egeakman commented 9 months ago

Building only the changed branches and languages, how does it sound?

I don't have much information about the server but here are some ideas:

egeakman commented 9 months ago

Additional question for @JulienPalard: Are non-bugfix and non-stable releases only rebuilt manually or is there a cron?

JulienPalard commented 9 months ago

Are non-bugfix and non-stable releases only rebuilt manually or is there a cron?

EOL and security-fixes branches are only built manually, yes.

egeakman commented 9 months ago

What do you think about the idea in general @JulienPalard?

JulienPalard commented 9 months ago

I don't have much information about the server

(see the git_clone function in the script).

Building only the changed branches and languages, how does it sound?

Maybe, yes. But to do it cleanly we should only rebuild if the Doc/ directory changed, but doable and cheap as we keep the cpython clone.

Currently a git log shows that (for the Doc/ directory) changes are:

A change cpython side should trigger a rebuild of all languages for the given branch (while a change for a language should not trigger a rebuild for all branches: translation repo has a translation branch per cpython branch).

Translation side I expect less changes, except for a few crons like for python-docs-ja which synchronizes daily. Also changes are for a single branch so they invalidate a single cpython branch.

So instead of rebuilding 78 docs per day, just looking at the cpython side, we'd rebuild like 26 docs daily (main and 3.11) which should take like 9h.

About how to do this, we could store at build time the cpython commit sha and the translation commit sha in a file, and at build time call a dedicated function to check the shas against the repositories to see if a build is needed or not.

Seems doable.

picnixz commented 9 months ago

@JulienPalard Someone opened an issue on Sphinx in order to know whether we could speed-up things on our side. I am not quite confident in that since most of the build time is actually the PDF build. It is possible to know why make all-pdf is so slow? is it only because there are a lot of pages to write? or is it because of the underlying language? (iirc, jp builds don't build with plain pdflatex).

Another possibility is to change the way the PDF are created. We (Sphinx) could technically add a builder which, instead of outputting LaTeX code, outputs some other kind of typesetting language that can still be converted to PDF faster than LaTeX.

JulienPalard commented 9 months ago

Hi @picnixz, thanks for jumping in!

It is possible to know why make all-pdf is so slow?

I'm just a user of LaTeX, I don't know much. I know our biggest file is library.tex with 288k lines of LaTeX, generating a 11MB PDF file.

make all-pdf runs latexmk for each .tex files, which in turn runs xelatex. So I tried just running time latexmk -pdf -dvi- -ps- library.tex, it takes 1 min 16s (on the same laptop used for the timings in the first post). same timing for the underlying command: xelatex -recorder library.tex.

Those timings can be reproduced on a cpython clone after building latex file:

make -C Doc/build/latex/ clean
time make -C Doc/build/latex/ library.pdf  # 4 min 30s
time (cd Doc/build/latex; make clean; XINDYOPTS="-L english -C utf8 -M sphinx.xdy" latexmk -pdf -dvi- -ps- library.tex) # 4 min 13s
time (cd Doc/build/latex; xelatex -recorder library.tex`  # 1 min 16s

It reminds me of something about running it in a loop until the output does not changes..., which strace confirms:

$ grep bin/xelatex make.strace 
15622 1696969290.322763 execve("/usr/bin/xelatex", ["xelatex", "-recorder", "library.tex"], 0x55b2cfd2a5d8 /* 90 vars */) = 0
15719 1696969388.530816 execve("/usr/bin/xelatex", ["xelatex", "-recorder", "library.tex"], 0x5651c759b5d8 /* 90 vars */) = 0
15766 1696969471.969531 execve("/usr/bin/xelatex", ["xelatex", "-recorder", "library.tex"], 0x55ddc43a15d8 /* 90 vars */) = 0

Which also can be confirmed by reading the console log (I choose a smaller file to get readable output):

------------
Run number 1 of rule 'pdflatex'
------------
------------
Running 'xelatex   -recorder  "howto-cporting.tex"'
------------

[...]

Rule 'pdflatex':  Reasons for rerun
Changed files or newly in use/created:
  howto-cporting.aux
  howto-cporting.ind
  howto-cporting.out
  howto-cporting.toc

------------
Run number 2 of rule 'pdflatex'
------------
------------
Running 'xelatex   -recorder  "howto-cporting.tex"'
------------

Also noticed, from xelatex output:

LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right.

Package rerunfilecheck Warning: File `library.out' has changed.
(rerunfilecheck)                Rerun to get outlines right
(rerunfilecheck)                or use package `bookmark'.

this is probably less than ideal.

I tried to use perf to get some insights about what's slow inside xetex, if I read this correctly that's the parsing of the input file that's slow, can be reproduced using:

sudo sysctl kernel.perf_event_paranoid=-1
sudo apt install texlive-binaries-dbgsym  # needs `deb https://deb.debian.org/debian-debug trixie-debug main`
time (cd Doc/build/latex/; make clean; perf record -F 1000 -g --call-graph dwarf -o perf.data xelatex -recorder library.tex)
(cd Doc/build/latex/; perf report --tui)

it looks like this:

-   80.59%     0.00%  xelatex    xetex                     [.] main                                               ▒
   - main                                                                                                         ▒
      - 80.54% mainbody                                                                                           ▒
         - 80.37% maincontrol                                                                                     ▒
            - 35.42% getxtoken                                                                                    ▒
               - 18.35% expand                                                                                    ▒
                  - 12.60% conditional                                                                            ▒
                     + 9.10% scanint                                                                              ▒
                     + 1.37% passtext                                                                             ▒
                  + 3.13% expand                                                                                  ▒
                    0.54% passtext                                                                                ▒
               - 14.03% macrocall                                                                                 ▒
                    6.20% getnext                                                                                 ▒
                    1.92% endtokenlist                                                                            ▒
               + 2.67% getnext                                                                                    ▒
            - 35.25% prefixedcommand                                                                              ▒
               - 29.63% zscantoks                                                                                 ▒
                  - 26.62% expand                                                                                 ▒
                     - 19.90% macrocall                                                                           ▒
                        - 10.50% getnext                                                                          ▒
                             3.33% endtokenlist                                                                   ▒
                          1.77% endtokenlist                                                                      ▒
                          0.63% getavail                                                                          ▒
                     + 5.68% conditional                                                                          ▒
                    1.87% getnext                                                                                 ▒
               + 3.12% zdoregistercommand                                                                         ▒
                 0.52% zeqdefine                                                                                  ▒
            + 3.87% measure_native_node                                                                           ▒
              0.86% unsave                                                                                        ▒
            + 0.60% zshipout                                                                                      ▒
            + 0.58% endgraf.part.0                                                                                ▒
              0.51% zbeginbox                                                                                     ▒

is it only because there are a lot of pages to write?

Probably yes, library.pdf is 2318 pages, or lots of latex to parse if my perf reading is good.

or is it because of the underlying language? (iirc, jp builds don't build with plain pdflatex).

Looking at the server logs, yes jp is slower than all the others. It takes like 1h30 while the other take like 30mn (for a full build of text, html, pdfs, ...). All builds are using xelatex while jp builds are using lualatex. As it's the only one I won't focus on this one.

Another possibility is to change the way the PDF are created. We (Sphinx) could technically add a builder which, instead of outputting LaTeX code, outputs some other kind of typesetting language that can still be converted to PDF faster than LaTeX.

It would be possible to build PDF using weasyprint.

m-aciek commented 9 months ago

It looks like a potential alternative for PDFs building is using Rinohtype. It provides a drop-in replacement for Sphinx PDF builder and is a direct PDF builder. Still in beta though.

picnixz commented 9 months ago

Thank you very much for your report !

Now I'm a bit confused about these timings:

time make -C Doc/build/latex/ library.pdf # 4 min 30s time (cd Doc/build/latex; make clean; XINDYOPTS="-L english -C utf8 -M sphinx.xdy" latexmk -pdf -dvi- -ps- library.tex) # 4 min 13s time (cd Doc/build/html; xelatex -recorder library.tex` # 1 min 16s

Maybe I misunderstood, but why does the second command takes 4 min 13s but you told me it took 1min 16s? I also think the last command should be cd Doc/build/latex instead of cd Doc/build/html.

if I read this correctly that's the parsing of the input file that's slow

Yes, it's because of how LaTeX is structured with this token based approach + expansions. Depending on how expansions are done, it may enormously slow down the build. I am wondering whether breaking down library.tex into smaller pieces that are put together at the end would improve or not the timings.


In conclusion, I don't think we could do much except:

By the way, how much is allocated to build the documentation on Python servers? because a straightforward solution is simply to allocate more resources and run everything in parallel (but again, funds are required and I don't know if the server is like a super-mega-huge-powerful server). In the end, if you need to rebuild 26 docs per day (assuming you can reduce from 70+), the you "simply" need a more powerful unit.

JulienPalard commented 9 months ago

Now I'm a bit confused about these timings:

time make -C Doc/build/latex/ library.pdf # 4 min 30s time (cd Doc/build/latex; make clean; XINDYOPTS="-L english -C utf8 -M sphinx.xdy" latexmk -pdf -dvi- -ps- library.tex) # 4 min 13s time (cd Doc/build/html; xelatex -recorder library.tex` # 1 min 16s

Maybe I misunderstood, but why does the second command takes 4 min 13s but you told me it took 1min 16s?

That's because latexmk runs xelatex in a while True: loop until the output stabilizes. Looks like it had to run it 4 times to reach stabilization (I have a few paragraph in my last message about it, search for 'strace' and 'rerun').

I really don't know if there's a way to forge a latex file that necessitates less re-runs.

I also think the last command should be cd Doc/build/latex instead of cd Doc/build/html.

Probably just me manually fixing the commands for readability (and breaking them while doing so, haha).

picnixz commented 9 months ago

That's because latexmk runs xelatex in a while True: loop until the output stabilizes

Ah sorry! yes I overlooked that (I was a bit confused actually because I assumed that the ~1min was the output of time). Now, the question is: is it possible to avoid using xelatex actually and only pdflatex?

Also, since it tells us "or use package bookmark", maybe this could solve the issue (though I don't know how). Nevertheless, yet another alternative is to run the latex command only once and check whether more compilation is needed (and not let latexmk decides by itself). I'm not sure whether the 4 runs are actually needed to solve all the references (in general, we need 2 reruns but here I'm wondering why we actually need 4).

Btw, I'm sorry but I cannot really reproduce it myself because I need to install fonts that I don't have (and move them around files + adding paths or so) (but since you've got everything running on your side + timings are for your machine, the comparison is more fair).

JulienPalard commented 9 months ago

IIRC we use xetex instead of pdflatex for its unicode awareness.

Tried:


$ make -C build_root/cpython/Doc PYTHON=build_root/venv-3.11/bin/python SPHINXBUILD=build_root/venv-3.11/bin/sphinx-build BLURB=build_root/venv-3.11/bin/blurb VENVDIR=build_root/venv-3.11 'SPHINXOPTS=-D latex_engine=pdflatex -q' SPHINXERRORHANDLING= autobuild-stable

[...]

LaTeX Warning: Hyper reference `howto/regex:the-backslash-plague' on page 6 und
efined on input line 816.

[6]

! LaTeX Error: Unicode character ſ (U+017F)
               not set up for use with LaTeX.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              

l.962 Latin small letter dotless i), ‘ſ
                                          ’ (U+017F, Latin small letter lo...

? 
JulienPalard commented 8 months ago

Since https://github.com/python/docsbuild-scripts/pull/171 has been merged, there has been a build taking 13.6 hours. That's better, probably still room for enhancements.

JulienPalard commented 8 months ago

I changed the build cron so it starts hourly, so instead of doing nothing for 24h-13.6h the script checks if there's something to build.

In the logs here's what I see:

2023-10-27 04:56:30,401 INFO en/3.12: Nothing changed, no rebuild needed.
2023-10-27 04:56:31,868 INFO zh-tw/3.13: Should rebuild: new translations (from a4719a1e1605163e886fad5c3783bdac250f0db9 to f24fd11929f41176367f42559c62b808e7468b2d)
2023-10-27 06:15:32,386 INFO zh-cn/3.13: Should rebuild: new translations (from 99da58558f6c81e2808bbe85307f5d480c9ec096 to ef8a1e2bd3a47f9a5d21c7f6495d3f525f3c66e6)
2023-10-27 07:30:51,890 INFO uk/3.13: Nothing changed, no rebuild needed.
2023-10-27 07:30:52,196 INFO tr/3.13: Nothing changed, no rebuild needed.
2023-10-27 07:30:52,494 INFO pt-br/3.13: Nothing changed, no rebuild needed.
2023-10-27 07:30:52,804 INFO pl/3.13: Nothing changed, no rebuild needed.
2023-10-27 07:30:53,072 INFO ko/3.13: Nothing changed, no rebuild needed.
2023-10-27 07:30:53,989 INFO ja/3.13: Should rebuild: new translations (from 32e85a08b356b6faa3c5784f1d5bdea15ec2e4f4 to 260a16dd7cb834bcffe2bee64375f6e8d43f1b35)
2023-10-27 08:51:27,928 INFO it/3.13: Nothing changed, no rebuild needed.
2023-10-27 08:51:28,353 INFO id/3.13: Nothing changed, no rebuild needed.
2023-10-27 08:51:28,685 INFO fr/3.13: Nothing changed, no rebuild needed.
2023-10-27 08:51:29,262 INFO es/3.13: Nothing changed, no rebuild needed.
2023-10-27 08:51:29,363 INFO en/3.13: Nothing changed, no rebuild needed.
2023-10-27 09:07:07,306 INFO zh-tw/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:08,020 INFO zh-cn/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:08,501 INFO uk/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:08,992 INFO tr/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:09,540 INFO pt-br/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:09,947 INFO pl/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:10,255 INFO ko/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:11,250 INFO ja/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:11,512 INFO it/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:11,767 INFO id/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:12,113 INFO fr/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:12,820 INFO es/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:12,908 INFO en/3.11: Nothing changed, no rebuild needed.
2023-10-27 09:07:15,611 INFO zh-tw/3.12: Should rebuild: new translations (from a4719a1e1605163e886fad5c3783bdac250f0db9 to eaccd60ee755daee77d4fe24707c1a2ec6059bfb)
2023-10-27 09:49:24,057 INFO zh-cn/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:24,383 INFO uk/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:24,759 INFO tr/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:25,169 INFO pt-br/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:25,541 INFO pl/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:25,820 INFO ko/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:26,652 INFO ja/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:26,920 INFO it/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:27,175 INFO id/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:27,500 INFO fr/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:28,223 INFO es/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:28,313 INFO en/3.12: Nothing changed, no rebuild needed.
2023-10-27 09:49:31,085 INFO zh-tw/3.13: Should rebuild: new translations (from f24fd11929f41176367f42559c62b808e7468b2d to eaccd60ee755daee77d4fe24707c1a2ec6059bfb)
2023-10-27 11:08:41,249 INFO zh-cn/3.13: Should rebuild: Doc/ has changed (from a254120f2f1dd99fa64f12594d1ed19c67df7d64 to 74f0772892c85b6e7bdfa0f44a5ff89002b0734d)
2023-10-27 12:31:52,133 INFO uk/3.13: Should rebuild: Doc/ has changed (from 3f84a19e6291db682fc9a570e7612e80e2ffbbb5 to 74f0772892c85b6e7bdfa0f44a5ff89002b0734d)
2023-10-27 12:35:32,533 INFO tr/3.13: Should rebuild: Doc/ has changed (from 3f84a19e6291db682fc9a570e7612e80e2ffbbb5 to 74f0772892c85b6e7bdfa0f44a5ff89002b0734d)
2023-10-27 13:08:20,924 INFO pt-br/3.13: Should rebuild: Doc/ has changed (from 3f84a19e6291db682fc9a570e7612e80e2ffbbb5 to 74f0772892c85b6e7bdfa0f44a5ff89002b0734d)
hugovk commented 7 months ago

The last https://docs.python.org/3/ build was at Dec 07, 2023 (12:24 UTC).

The 3.12.1 release was at something like Dec 08, 2023 (00:45 UTC).

As there's no public logs yet (https://github.com/python/docsbuild-scripts/issues/174), I'm curious why there's been no build yet from the hourly cron. Is the queue full, or maybe something else up?

This relates to https://discuss.python.org/t/python-3-12-1-now-available/40603/2?u=hugovk: a request for release announcements to include the changelog, but the changelog at https://docs.python.org/3/whatsnew/changelog.html still shows "Next" and not "3.12.1".

willingc commented 7 months ago

@hugovk @JulienPalard What are the next steps needed for this?

JulienPalard commented 7 months ago

I see the last build on Last updated on Dec 11, 2023 (04:33 UTC), so the build is running, still slow though.

hugovk commented 7 months ago

It now says Last updated on Dec 12, 2023 (08:00 UTC) on https://docs.python.org/3/

If it took 27.5 hours, what language/versions has it been building to delay it?

There have been changes to 3.11-3.13 in the past day, but a week or more for 3.10 and older.

JulienPalard commented 7 months ago

Since we stopped rebuilding when it's not needed, we're maintaining a file with some infos:

see file ```text ["/zh-tw/3.11/"] cpython_sha = "694631ca14255b749c565042b0d891aaf04d40c0" translation_sha = "1e356ff418870ee2ae86282e3e11631658d47c68" last_build = 2023-12-13T20:58:43.394089Z last_build_duration = 3098.094585299492 ["/zh-cn/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "d7821cb1251d4e449112ab0a4fbe23030113f555" last_build = 2023-12-12T20:52:34.953898Z last_build_duration = 3103.580232806504 ["/uk/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "83cc16b77a8dd606ef93af89edd45d8f4dd4ac7a" last_build = 2023-12-12T20:56:05.815128Z last_build_duration = 210.76982203125954 ["/tr/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "a0aef1684ec26d77c1cfc253fada24e42ae7170c" last_build = 2023-12-12T21:25:28.430790Z last_build_duration = 1762.5443530604243 ["/pt-br/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "93b01be79fa6d44a4500583459e7734fa16c7edc" last_build = 2023-12-12T21:54:16.924618Z last_build_duration = 1728.4081133455038 ["/pl/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "ad3f03c2da957270a4524adfdf47620494cfd20d" last_build = 2023-12-12T22:20:45.157714Z last_build_duration = 1588.139028944075 ["/ko/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "dada54826f66869e7c0f58193da81c99cd6fc3c6" last_build = 2023-12-12T22:56:47.261816Z last_build_duration = 2162.023073770106 ["/ja/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "841afcc48089933831e6b5f910b61f042b41f5ce" last_build = 2023-12-13T00:23:37.409970Z last_build_duration = 5210.0655683502555 ["/it/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "aebe05830ccf79e5a1261f5c31ce50872295144e" last_build = 2023-12-13T00:48:17.921699Z last_build_duration = 1480.389074318111 ["/id/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "9ba591b54e465b7d75b20f26863f970e463e46e8" last_build = 2023-12-13T01:19:23.170320Z last_build_duration = 1865.1707886829972 ["/fr/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "9ebdd501aca1c528907db8c40460271e4983722f" last_build = 2023-12-13T01:46:38.513918Z last_build_duration = 1635.2293921038508 ["/es/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" translation_sha = "169b71e2847a5c679b5e08b213c2581fa0035bb7" last_build = 2023-12-13T02:16:37.559038Z last_build_duration = 1798.9707588106394 ["/en/3.11/"] cpython_sha = "f78f6b6e89b9816d0a8de4bc7eaa01edb93e4b80" last_build = 2023-12-13T02:40:36.184253Z last_build_duration = 1438.521405339241 ["/zh-tw/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "3d1b20dc3008195f1cc644ad582948199ccd6e58" last_build = 2023-12-13T03:30:18.147043Z last_build_duration = 2981.8536808416247 ["/zh-cn/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "75ad56739df6d3e9bff7307c2a289dc5f7c88b01" last_build = 2023-12-13T04:18:34.099284Z last_build_duration = 2895.865228138864 ["/uk/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "e3beb58e3e443dda3741d38b4a041aa157ae70cb" last_build = 2023-12-13T04:22:04.852000Z last_build_duration = 210.6848026663065 ["/tr/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "1e1cabc2c897decad6f3113d9172a14b2faa7c3d" last_build = 2023-12-13T04:50:26.476017Z last_build_duration = 1701.536269709468 ["/pt-br/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "ea139bb4814d28057f7b62d767ff2fc33ca2393c" last_build = 2023-12-13T05:18:19.731717Z last_build_duration = 1673.1806314513087 ["/pl/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "fb5d3ef124b497a0abe87227d8e7fd7c5f2ef7ea" last_build = 2023-12-13T05:43:32.126013Z last_build_duration = 1512.2927004247904 ["/ko/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "dada54826f66869e7c0f58193da81c99cd6fc3c6" last_build = 2023-12-13T06:18:38.229082Z last_build_duration = 2106.0200723856688 ["/ja/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "97ff29fa511d414981898658199e4d9e7afeb45d" last_build = 2023-12-13T07:42:33.870205Z last_build_duration = 5035.5601375103 ["/it/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "aebe05830ccf79e5a1261f5c31ce50872295144e" last_build = 2023-12-13T08:08:05.896664Z last_build_duration = 1531.9528702422976 ["/id/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "9ba591b54e465b7d75b20f26863f970e463e46e8" last_build = 2023-12-13T08:41:17.613854Z last_build_duration = 1991.6177273616195 ["/fr/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "9ebdd501aca1c528907db8c40460271e4983722f" last_build = 2023-12-13T09:08:47.270563Z last_build_duration = 1649.590953759849 ["/es/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" translation_sha = "23991c3d509c8cb223f1a8dd262988d844e42f34" last_build = 2023-12-13T09:39:05.043432Z last_build_duration = 1817.7011435106397 ["/en/3.12/"] cpython_sha = "b884d21f10f228bef11980e076af53343e686463" last_build = 2023-12-13T10:03:40.042323Z last_build_duration = 1474.9010257422924 ["/zh-tw/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "644b84e9dc6fb79c79e3ca1c5015fd21da9c6f62" last_build = 2023-12-13T11:30:50.976827Z last_build_duration = 5230.8338821306825 ["/zh-cn/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "75ad56739df6d3e9bff7307c2a289dc5f7c88b01" last_build = 2023-12-13T13:02:59.424127Z last_build_duration = 5528.383116334677 ["/uk/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "e3beb58e3e443dda3741d38b4a041aa157ae70cb" last_build = 2023-12-13T13:06:51.197718Z last_build_duration = 231.70681616663933 ["/tr/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "1e1cabc2c897decad6f3113d9172a14b2faa7c3d" last_build = 2023-12-13T13:40:24.260783Z last_build_duration = 2012.9675337076187 ["/pt-br/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "ea139bb4814d28057f7b62d767ff2fc33ca2393c" last_build = 2023-12-13T14:13:29.505903Z last_build_duration = 1985.1673073396087 ["/pl/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "fb5d3ef124b497a0abe87227d8e7fd7c5f2ef7ea" last_build = 2023-12-13T14:44:47.217278Z last_build_duration = 1877.6119738593698 ["/ko/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "dada54826f66869e7c0f58193da81c99cd6fc3c6" last_build = 2023-12-13T15:24:54.988656Z last_build_duration = 2407.6938998550177 ["/ja/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "97ff29fa511d414981898658199e4d9e7afeb45d" last_build = 2023-12-13T17:02:30.116978Z last_build_duration = 5855.0243775472045 ["/it/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "aebe05830ccf79e5a1261f5c31ce50872295144e" last_build = 2023-12-13T17:34:32.321583Z last_build_duration = 1922.1185678616166 ["/id/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "9ba591b54e465b7d75b20f26863f970e463e46e8" last_build = 2023-12-13T18:17:12.805072Z last_build_duration = 2560.38579890877 ["/fr/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "9ebdd501aca1c528907db8c40460271e4983722f" last_build = 2023-12-13T18:50:52.892841Z last_build_duration = 2019.9838555455208 ["/es/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" translation_sha = "23991c3d509c8cb223f1a8dd262988d844e42f34" last_build = 2023-12-13T19:28:22.535989Z last_build_duration = 2249.569005407393 ["/en/3.13/"] cpython_sha = "956023826a393b5704d3414dcd01f1bcbeaeda15" last_build = 2023-12-13T19:59:47.479534Z last_build_duration = 1884.880883552134 ["/zh-tw/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "33d6fb4102d6565764a3c364a07de51d112b44d2" last_build = 2023-11-14T09:03:40.183755Z last_build_duration = 3398.2279169410467 ["/zh-cn/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "fb1a27f77113140559d5e595f2d80d9598c7231f" last_build = 2023-11-14T09:56:25.654707Z last_build_duration = 3165.389776684344 ["/uk/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "a91fd1b5d1dcad4a679f29da9dc9ddcba87545ba" last_build = 2023-11-14T09:59:58.765924Z last_build_duration = 212.99671158194542 ["/tr/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "7c7c3b0349a1238a8c72da8f0f5de7cf68527833" last_build = 2023-11-14T10:31:35.691285Z last_build_duration = 1896.8465352505445 ["/pt-br/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "948372e472c6dc03a2bc8c82f7d8fbe3309b272b" last_build = 2023-11-14T11:01:05.771282Z last_build_duration = 1770.0178842619061 ["/pl/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "b823635d9d1a216dccbbadef44ab282987f31283" last_build = 2023-11-14T11:27:19.901667Z last_build_duration = 1574.0320699512959 ["/ko/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "dada54826f66869e7c0f58193da81c99cd6fc3c6" last_build = 2023-11-14T12:05:27.032940Z last_build_duration = 2287.0581825375557 ["/ja/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "adde31e5584af8266413d1997324c44c5a0bc865" last_build = 2023-11-14T13:39:15.475070Z last_build_duration = 5628.3420650959015 ["/it/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "aebe05830ccf79e5a1261f5c31ce50872295144e" last_build = 2023-11-14T14:06:35.699216Z last_build_duration = 1640.0956600159407 ["/id/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "9ba591b54e465b7d75b20f26863f970e463e46e8" last_build = 2023-11-14T14:39:29.752166Z last_build_duration = 1973.9433364644647 ["/fr/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "fba25673317c237be4e95709a52203c2564375d2" last_build = 2023-11-14T15:08:32.529183Z last_build_duration = 1742.671448096633 ["/es/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" translation_sha = "e37b282476c5f9261c0ca2ac7e8db6e6f6f9bcd7" last_build = 2023-11-14T15:47:58.007060Z last_build_duration = 2365.3818720132113 ["/en/3.10/"] cpython_sha = "6c2f34fa77f884bd79801a9ab8a117cab7d9c7ed" last_build = 2023-11-14T16:12:52.351156Z last_build_duration = 1494.2587717548013 ```

it's not that readable but may help a bit.

I can paste some logs here:

view file ```txt mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.6.gz 2023-12-07 01:03:31,783 INFO it/3.11: Build start. 2023-12-07 01:34:48,312 INFO id/3.11: Build start. 2023-12-07 02:13:16,175 INFO fr/3.11: Build start. 2023-12-07 02:47:08,767 INFO es/3.11: Build start. 2023-12-07 03:24:20,181 INFO en/3.11: Build start. 2023-12-07 03:53:32,721 INFO zh-tw/3.12: Build start. 2023-12-07 04:54:28,702 INFO zh-cn/3.12: Build start. 2023-12-07 05:51:22,955 INFO uk/3.12: Build start. 2023-12-07 05:55:15,983 INFO tr/3.12: Build start. 2023-12-07 06:29:04,896 INFO pt-br/3.12: Build start. 2023-12-07 07:03:29,075 INFO pl/3.12: Build start. 2023-12-07 07:33:25,766 INFO ko/3.12: Build start. 2023-12-07 08:16:09,363 INFO ja/3.12: Build start. 2023-12-07 09:59:31,894 INFO it/3.12: Build start. 2023-12-07 10:29:39,325 INFO id/3.12: Build start. 2023-12-07 11:10:44,153 INFO fr/3.12: Build start. 2023-12-07 11:45:54,247 INFO es/3.12: Build start. 2023-12-07 12:24:23,271 INFO en/3.12: Build start. <-- the build you've seen in prod while writing your message 2023-12-07 12:55:57,594 INFO zh-tw/3.13: Build start. 2023-12-07 14:58:05,206 INFO zh-cn/3.13: Build start. 2023-12-07 16:59:55,851 INFO uk/3.13: Build start. 2023-12-07 17:06:09,502 INFO tr/3.13: Build start. 2023-12-07 17:53:58,847 INFO pt-br/3.13: Build start. 2023-12-07 18:40:23,044 INFO pl/3.13: Build start. 2023-12-07 19:22:33,254 INFO ko/3.13: Build start. 2023-12-07 20:17:45,832 INFO ja/3.13: Build start. 2023-12-07 22:24:23,323 INFO it/3.13: Build start. 2023-12-07 23:03:59,538 INFO id/3.13: Build start. 2023-12-07 23:50:30,074 INFO fr/3.13: Build start. mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.5.gz 2023-12-08 00:29:10,617 INFO es/3.13: Build start. <-- 3.12.1 release 2023-12-08 01:10:12,533 INFO en/3.13: Build start. 2023-12-08 02:07:14,771 INFO zh-tw/3.11: Build start. 2023-12-08 03:02:35,382 INFO zh-cn/3.11: Build start. 2023-12-08 03:55:25,481 INFO uk/3.11: Build start. 2023-12-08 03:58:58,984 INFO tr/3.11: Build start. 2023-12-08 04:30:39,399 INFO pt-br/3.11: Build start. 2023-12-08 05:02:31,596 INFO pl/3.11: Build start. 2023-12-08 05:31:15,058 INFO ko/3.11: Build start. 2023-12-08 06:12:14,383 INFO ja/3.11: Build start. 2023-12-08 07:51:32,897 INFO it/3.11: Build start. 2023-12-08 08:19:59,311 INFO id/3.11: Build start. <-- your message (more or less 1h) 2023-12-08 08:55:57,314 INFO fr/3.11: Build start. 2023-12-08 09:26:26,468 INFO es/3.11: Build start. 2023-12-08 10:01:36,727 INFO en/3.11: Build start. 2023-12-08 10:29:12,557 INFO zh-tw/3.12: Build start. 2023-12-08 11:28:08,345 INFO zh-cn/3.12: Build start. 2023-12-08 12:28:27,897 INFO uk/3.12: Build start. 2023-12-08 12:33:01,923 INFO tr/3.12: Build start. 2023-12-08 13:09:30,960 INFO pt-br/3.12: Build start. 2023-12-08 13:45:39,082 INFO pl/3.12: Build start. 2023-12-08 14:17:58,792 INFO ko/3.12: Build start. 2023-12-08 15:02:13,565 INFO ja/3.12: Build start. 2023-12-08 16:49:28,995 INFO it/3.12: Build start. 2023-12-08 17:24:15,640 INFO id/3.12: Build start. 2023-12-08 18:03:34,328 INFO fr/3.12: Build start. 2023-12-08 18:35:41,454 INFO es/3.12: Build start. 2023-12-08 19:11:52,848 INFO en/3.12: Build start. <-- 2023-12-08 19:41:12,356 INFO zh-tw/3.13: Build start. 2023-12-08 21:28:18,628 INFO zh-cn/3.13: Build start. 2023-12-08 23:08:18,807 INFO uk/3.13: Build start. 2023-12-08 23:12:37,137 INFO tr/3.13: Build start. 2023-12-08 23:48:46,069 INFO pt-br/3.13: Build start. mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.4.gz 2023-12-09 00:25:10,138 INFO pl/3.13: Build start. 2023-12-09 00:57:45,725 INFO ko/3.13: Build start. 2023-12-09 01:40:28,952 INFO ja/3.13: Build start. 2023-12-09 03:21:11,244 INFO it/3.13: Build start. 2023-12-09 03:53:36,108 INFO id/3.13: Build start. 2023-12-09 04:35:39,054 INFO fr/3.13: Build start. 2023-12-09 05:08:42,237 INFO es/3.13: Build start. 2023-12-09 05:44:38,758 INFO en/3.13: Build start. 2023-12-09 07:07:18,041 INFO zh-tw/3.11: Build start. 2023-12-09 07:58:36,484 INFO zh-cn/3.11: Build start. 2023-12-09 08:50:50,170 INFO uk/3.11: Build start. 2023-12-09 08:54:18,539 INFO tr/3.11: Build start. 2023-12-09 09:25:13,308 INFO pt-br/3.11: Build start. 2023-12-09 09:55:38,752 INFO pl/3.11: Build start. 2023-12-09 10:24:47,533 INFO ko/3.11: Build start. 2023-12-09 11:05:06,125 INFO ja/3.11: Build start. 2023-12-09 12:47:05,609 INFO it/3.11: Build start. 2023-12-09 13:17:31,303 INFO id/3.11: Build start. 2023-12-09 13:55:56,164 INFO fr/3.11: Build start. 2023-12-09 14:30:28,217 INFO es/3.11: Build start. 2023-12-09 15:10:10,126 INFO en/3.11: Build start. 2023-12-09 15:40:26,584 INFO zh-tw/3.12: Build start. 2023-12-09 16:42:34,264 INFO zh-cn/3.12: Build start. 2023-12-09 17:43:02,356 INFO uk/3.12: Build start. 2023-12-09 17:47:08,059 INFO tr/3.12: Build start. 2023-12-09 18:21:25,707 INFO pt-br/3.12: Build start. 2023-12-09 18:55:27,545 INFO pl/3.12: Build start. 2023-12-09 19:26:25,268 INFO ko/3.12: Build start. 2023-12-09 20:08:18,771 INFO ja/3.12: Build start. 2023-12-09 21:51:05,595 INFO it/3.12: Build start. 2023-12-09 22:23:04,710 INFO id/3.12: Build start. 2023-12-09 23:03:02,551 INFO fr/3.12: Build start. 2023-12-09 23:36:46,483 INFO es/3.12: Build start. mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.3.gz 2023-12-10 00:12:31,758 INFO en/3.12: Build start. <-- 2023-12-10 00:38:36,561 INFO zh-tw/3.13: Build start. 2023-12-10 02:11:45,282 INFO zh-cn/3.13: Build start. 2023-12-10 03:40:09,197 INFO uk/3.13: Build start. 2023-12-10 03:43:59,909 INFO tr/3.13: Build start. 2023-12-10 04:16:53,655 INFO pt-br/3.13: Build start. 2023-12-10 04:50:19,080 INFO pl/3.13: Build start. 2023-12-10 05:21:18,370 INFO ko/3.13: Build start. 2023-12-10 05:59:52,249 INFO ja/3.13: Build start. 2023-12-10 07:39:20,008 INFO it/3.13: Build start. 2023-12-10 08:12:51,583 INFO id/3.13: Build start. 2023-12-10 08:56:53,066 INFO fr/3.13: Build start. 2023-12-10 09:32:21,252 INFO es/3.13: Build start. 2023-12-10 10:10:41,943 INFO en/3.13: Build start. 2023-12-10 11:07:17,607 INFO zh-tw/3.11: Build start. 2023-12-10 12:01:43,511 INFO zh-cn/3.11: Build start. 2023-12-10 12:55:46,836 INFO uk/3.11: Build start. 2023-12-10 12:59:14,477 INFO tr/3.11: Build start. 2023-12-10 13:31:47,797 INFO pt-br/3.11: Build start. 2023-12-10 14:04:52,957 INFO pl/3.11: Build start. 2023-12-10 14:35:04,605 INFO ko/3.11: Build start. 2023-12-10 15:18:22,877 INFO ja/3.11: Build start. 2023-12-10 17:06:49,408 INFO it/3.11: Build start. 2023-12-10 17:40:07,017 INFO id/3.11: Build start. 2023-12-10 18:21:59,182 INFO fr/3.11: Build start. 2023-12-10 18:57:50,425 INFO es/3.11: Build start. 2023-12-10 19:38:27,999 INFO en/3.11: Build start. 2023-12-10 20:09:49,259 INFO zh-tw/3.12: Build start. 2023-12-10 21:13:16,417 INFO zh-cn/3.12: Build start. 2023-12-10 22:14:28,313 INFO uk/3.12: Build start. 2023-12-10 22:18:52,271 INFO tr/3.12: Build start. 2023-12-10 22:52:49,215 INFO pt-br/3.12: Build start. 2023-12-10 23:26:13,913 INFO pl/3.12: Build start. 2023-12-10 23:55:45,872 INFO ko/3.12: Build start. mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.2.gz 2023-12-11 00:38:06,339 INFO ja/3.12: Build start. 2023-12-11 02:19:10,223 INFO it/3.12: Build start. 2023-12-11 02:48:39,326 INFO id/3.12: Build start. 2023-12-11 03:25:56,754 INFO fr/3.12: Build start. 2023-12-11 03:57:07,365 INFO es/3.12: Build start. 2023-12-11 04:33:46,003 INFO en/3.12: Build start. <-- 2023-12-11 05:02:28,158 INFO zh-tw/3.13: Build start. 2023-12-11 06:40:57,160 INFO zh-cn/3.13: Build start. 2023-12-11 08:20:40,118 INFO uk/3.13: Build start. 2023-12-11 08:25:12,725 INFO tr/3.13: Build start. 2023-12-11 09:00:10,942 INFO pt-br/3.13: Build start. 2023-12-11 09:31:30,328 INFO pl/3.13: Build start. 2023-12-11 10:00:34,948 INFO ko/3.13: Build start. 2023-12-11 10:38:35,804 INFO ja/3.13: Build start. 2023-12-11 12:15:57,521 INFO it/3.13: Build start. 2023-12-11 12:46:40,636 INFO id/3.13: Build start. 2023-12-11 13:26:24,106 INFO fr/3.13: Build start. 2023-12-11 13:59:58,381 INFO es/3.13: Build start. 2023-12-11 14:39:16,552 INFO en/3.13: Build start. 2023-12-11 16:07:20,776 INFO zh-tw/3.11: Build start. 2023-12-11 17:04:35,431 INFO zh-cn/3.11: Build start. 2023-12-11 18:01:09,964 INFO uk/3.11: Build start. 2023-12-11 18:05:16,722 INFO tr/3.11: Build start. 2023-12-11 18:38:16,662 INFO pt-br/3.11: Build start. 2023-12-11 19:11:47,677 INFO pl/3.11: Build start. 2023-12-11 19:43:07,048 INFO ko/3.11: Build start. 2023-12-11 20:27:02,135 INFO ja/3.11: Build start. 2023-12-11 22:05:54,678 INFO it/3.11: Build start. 2023-12-11 22:33:47,162 INFO id/3.11: Build start. 2023-12-11 23:09:11,730 INFO fr/3.11: Build start. 2023-12-11 23:42:10,783 INFO es/3.11: Build start. ```

so yes, there's still almost 24h between any builds.

In 14 days we've seen only 45 build being skipped vs 449 rebuilds being needed:

mdk@docs:~$ zgrep 'no rebuild' /var/log/docsbuild/docsbuild.log.*.gz  | wc -l
45
mdk@docs:~$ zgrep 'Should rebuild' /var/log/docsbuild/docsbuild.log.*.gz  | wc -l
449

is is partly due to some translation repos having a cron commiting every day (like transifex pulls) even if there's no new translations in the files.

But the root cause of slow build time is probably more https://github.com/python/docsbuild-scripts/issues/169#issuecomment-1759197196 (latexmk running xelatex in a loop until the result stop changing).

@ewdurbin what "CPU" do we have on the machine? I don't see usefull info on /proc/cpuinfo.

willingc commented 7 months ago

But the root cause of slow build time is probably more https://github.com/python/docsbuild-scripts/issues/169#issuecomment-1759197196 (latexmk running xelatex in a loop until the result stop changing).

How important are the latex builds?

We may want to look at additional ways of optimizing the latex builds. https://blog.martisak.se/2023/10/01/compiling/

JulienPalard commented 7 months ago

How important are the latex builds?

PDF builds are based on LaTeX builds.

When PDFs are unavailable we have user complaints (typically on docs@).

willingc commented 7 months ago

Yeah, we definitely need PDFs.

I wonder if it makes sense to try building PDF with an alternate tool, ~such as pyppeteer which is used by Jupyter book. https://jupyterbook.org/en/stable/advanced/pdf.html#build-a-pdf-from-your-book-html~ playwright-python.

willingc commented 7 months ago

Shotscraper could be another approach: https://shot-scraper.datasette.io/en/stable/pdf.html

methane commented 7 months ago

PDF shoud provide table of contents and index. So ePub is better than html for converting into PDF.

But I don't know good tool to convert ePub to PDF. Maybe, Calibre?

m-aciek commented 7 months ago

What are the versions of latexmk and xetex that are being used on the build server? The latest versions of those packages are 4.81 (2023-11-06) for latexmk and 0.9999.3 (September 2013) for xetex.

JulienPalard commented 7 months ago
mdk@docs:~$ apt-cache policy latexmk
latexmk:
  Installed: 1:4.67-0.1
  Candidate: 1:4.67-0.1
  Version table:
 *** 1:4.67-0.1 500
        500 http://us.archive.ubuntu.com/ubuntu focal/universe amd64 Packages
        100 /var/lib/dpkg/status
mdk@docs:~$ apt-cache policy texlive-xetex
texlive-xetex:
  Installed: 2019.20200218-1
  Candidate: 2019.20200218-1
  Version table:
 *** 2019.20200218-1 500
        500 http://us.archive.ubuntu.com/ubuntu focal/universe amd64 Packages
        100 /var/lib/dpkg/status
ned-deily commented 2 months ago

(One comment: IIRC, at least one reason we do full rebuilds of the docs and not just incremental builds was that there were problems when a documentation page was deleted by a change and then would not automatically get deleted on-line. I'm not sure if that still is an issue.)