Closed scottkosty closed 10 months ago
A not-too-obstinate bisect seems to indicate origin of this (which I confirm) is in merge of https://github.com/latex3/latex2e/pull/1020. More precisely, I do not see the phenomenon with commit https://github.com/latex3/latex2e/commit/0de56d73 and I see it with commit https://github.com/latex3/latex2e/commit/913af0e2. There are quite a few commits in-between it seems, but I could not compile document. Will try again.
well actually only about 17 intermediate commits but it seems none of them allow to build document without error
@PhelypeOleinik
well actually only about 17 intermediate commits but it seems none of them allow to build document without error
@jfbu There are actually (according to the PR page) 149 commits. It was a quite long (code-wise, and time-wise :) PR, started with commit 0f4101b and finished with 9833018. I'd be surprised if anything in between worked at all.
The problem is that the code doubles hashes from the input using \__hook_double_hashes:n
, which is rather slow. I couldn't think of another way of doubling all the parameter tokens in a token list without the painfully slow token-by-token approach. I could do with retokenisation, but that would be unreliable.
I am working on some shortcuts for the code which will avoid \__hook_double_hashes:n
in hooks that take no arguments. This should solve the problem at hand, but it won't cut it for all cases.
The gist of the problem, if you want to lose some hair over it too :), is that I have to turn:
\AddToHook{foo}{Hashes doubled: # and #1.}
\AddToHookWithArguments{foo}{Hashes preserved: ## and #1.}
essentially into:
\def\hookfoo#1{%
Hashes doubled: ## and ##1.
Hashes preserved: ## and #1.
}
Maybe my head is already biased, but I couldn't find an actual good solution.
The gist of the problem, if you want to lose some hair over it too :),
I did lose some hair right now...
is that I have to turn:
\AddToHook{foo}{Hashes doubled: # and #1.} \AddToHookWithArguments{foo}{Hashes preserved: ## and #1.}
essentially into:
\def\hookfoo#1{% Hashes doubled: ## and ##1. Hashes preserved: ## and #1. }
Thanks for making this very concrete. I need to read ltnews37 and your lthooks comments to better understand why you need to achieve such goal. It seems only if \edef
is allowed on \AddToHookWithArguments
second argument do I have a way.
Pasting it here fwiw:
\def\foo{}
\def\AddToHook#1{\def\tempName{#1}\afterassignment\AddToHookB\toks0}
\def\AddToHookB{\expandafter\edef\csname\tempName\endcsname{\the\toks0}}
\AddToHook{foo}{Hashes doubled: # and #1.}
\show\foo
\def\AddToHookWithArguments#1{%
\def\tempName{#1}%
\toks0\expandafter\expandafter\expandafter{\csname#1\endcsname}%
\afterassignment\AddToHookWithArgsB\let\next=%
}
\def\AddToHookWithArgsB{\expandafter
\edef\csname\tempName\endcsname##1{\the\toks0 \iffalse}\fi
}
\AddToHookWithArguments{foo}{Hashes preserved: ## and #1.}
\show\foo
\def\hookfoo#1{%
Hashes doubled: ## and ##1.%
Hashes preserved: ## and #1.%
}
\show\hookfoo
\ifx\foo\hookfoo\else\ERROR\fi
\bye
Skimming through lthooks.dtx I see two identical macro comments blocks, probably this duplication is result of some merge accident? https://github.com/latex3/latex2e/blob/06733240f2a6eed725752345735501682bc4a6f0/base/lthooks.dtx#L4700-L4716 and https://github.com/latex3/latex2e/blob/06733240f2a6eed725752345735501682bc4a6f0/base/lthooks.dtx#L4717-L4733
It seems only if
\edef
is allowed on\AddToHookWithArguments
second argument do I have a way.
The problem with \edef
is that it expands the code to be added, and I can't do that. All the code should be added to the hook as-is, except for the doubling of parameter tokens.
Skimming through lthooks.dtx I see two identical macro comments blocks, probably this duplication is result of some merge accident?
No unless my name is "merge accident" :) I just copied the comment block and forgot to update the actual text in it. I'll fix. Thanks!
I just wanted to say thank you for all the work! The technical discussion is over my head, but I appreciate your time on this. If there is a fix, I can test it on the original use case, which if I recall correctly took about 24 minutes to compile with the -dev
command, compared to (the non-dev xelatex
) 30 seconds or so.
\def\AddToHookWithArgsB{\expandafter \edef\csname\tempName\endcsname##1{\the\toks0 \iffalse}\fi }
nice try, but the problem is that it can't be an \edef
the code argument of \AddToHookWithArguments
has to be added in without expansion.
I just wanted to say thank you for all the work! The technical discussion is over my head, but I appreciate your time on this. If there is a fix, I can test it on the original use case, which if I recall correctly took about 24 minutes to compile with the
-dev
command, compared to (the non-devxelatex
) 30 seconds or so.
I presume that your use case makes an awful lot of language changes. There is a problem with the polyglossia code: it writes every language change into the .aux file in a way that the aux file then adds multiple identical code pieces in the begindocument hook. Together with the deficiency discussed above this then blows up badly. Put
\makeatletter
\write\@mainaux{\noexpand\ShowHook{begindocument}}
\makeatother
just before \end{document}
to see them (already many lines and your MWE has only a few language changes).
For what it's worth, I have just submitted a fix to polyglossia
's gloss-french
that should at least cure the absurd flooding of the\AtEndPreamble
hook with redundant redefinitions: https://github.com/reutenauer/polyglossia/commit/8e000c33437a8041fdbd30a517b017bfc58c5ec8
There are presumably more profound things to do, but I hope that this at least crashes the most severe bottleneck of the case reported here by Scott.
(the plan is to do a polyglossia
release with this fix immediately after the next LaTeX release, as there are some more changes that follow-up on the \BCPdata
changes)
This proof-of-concept stores content of a hook with arguments in several inner macros:
\documentclass{article}
\usepackage{etoolbox}
% \NewHookWithArguments{foo}{1}
\def\foo#1{{0}}
% \AddToHook{foo}{Hashes doubled: # and #1.}
\def\foo#1{{1}\csuse{foo1}}
\csedef{foo1}{\unexpanded\expandafter{Hashes doubled: # and #1.}}
\csshow{foo1}
% \AddToHookWithArguments{foo}{Hashes preserved: ## and #1.}
\def\foo#1{{2}\csuse{foo1}\csuse{foo2}{#1}}
\csdef{foo2}#1{Hashes preserved: ## and #1.}
\csshow{foo2}
\endinput
@muzimuzhi Maybe that would be an alternative. I don't know about "hiding" the code chunks in separate macros. I'll try. Thanks!
The downside is a) runtime cost (not high but there) and b) the need for book keeping because when rules alter the content or \Add...Next is used one should drop the old content. On the other hand the setup would definitely be quick so that may balance out nicely.
@scottkosty The issue in your initial posting is resolved, but I'll leave this open because there are more related performance issues that I want to fix, so I'll use this issue to track those.
@PhelypeOleinik Sounds good! Thank you for adding a test file also. I know it's hard to add regression tests for performance. The only reason this issue happened to cause a test failure for me is that I have a timeout set, but it was lucky that it triggered.
This issue has been automatically marked as stale because it has not had recent activity.
@PhelypeOleinik This issue still live?
@josephwright Not this one, in particular. There are other performance improvements that can be done to the hook code, but it's likely better to address those in specific issues.
Brief outline of the bug
Below is an example that takes 8 seconds on the second run of
xelatex-dev
, but 2 seconds on the second run ofxelatex
. Similar difference forlualatex-dev
andlualatex
. This is on an up-to-date TeX Live 2023 system.The issue is a performance one, and it only shows up on the second run. When testing, make sure to first delete the .aux file, and then run
xelatex-dev
twice. Can anyone reproduce the difference in performance?