latex3 / tagging-project

Issues related to the tagging project
https://latex3.github.io/tagging-project/
LaTeX Project Public License v1.3c

Tag generation performance #731

Open DavidEGx opened 1 month ago

DavidEGx commented 1 month ago

I've noticed generating tags is quite time consuming.

As an example, running:

$ lualatex latexdoc.txt

(See the attached latexdoc.txt. I just asked ChatGPT to generate a demo LaTeX file; I see a similar issue with my real LaTeX files.)

This takes around 1.6s on my machine. If I remove testphase = {phase-III}, it takes ~0.5s.

What is worse, sometimes you have to rerun lualatex.

Is this something that will be addressed in the future? What can we expect? Or is it just me and I'm doing something wrong?!

Thanks for the work.

u-fischer commented 1 month ago

Yes, tagging slows down the compilation. The code has to create and write quite a lot of PDF objects. There are certainly places where the code can be sped up, and this will be done at some point, but currently it is not the first priority; the focus is on getting correct tagging at all.

Unrelated, but you can/should remove \usepackage[utf8]{inputenc}. For lualatex it does nothing at all, and with pdflatex it is unneeded as utf8 is the default anyway.

DavidEGx commented 1 month ago

> Unrelated, but you can/should remove \usepackage[utf8]{inputenc}. For lualatex it does nothing at all, and with pdflatex it is unneeded as utf8 is the default anyway.

Thanks, I'll do that.


Maybe this is total nonsense, but... couldn't tagging run only once, in a final run?

I mean instead of having to:

Do:

u-fischer commented 1 month ago

> Maybe this is total nonsense, but... couldn't tagging run only once, in a final run?

With lualatex yes, that will probably be possible. Currently a few things use the aux-file and so need two compilations but it should be possible to change that. But naturally then lualatex needs to know which is the last compilation, so you would have to switch tagging on and off. pdflatex typically really needs two or three compilations with tagging active.

The time-consuming part is generally the end, when all the objects are written, so you could try this while drafting your document:

\AddToHook{enddocument/end}{\tagpdfsetup{activate/tree=false}}
\DocumentMetadata{testphase={phase-III}}

But that has not been tested much, so report back if there are problems or some PDF viewer complains ...
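
As a rough sketch, the two lines above might sit in a complete draft file like this. The article class and the body text are placeholders assumed for illustration and are not taken from the attached latexdoc.txt; the hook line is kept before \DocumentMetadata, in the same order as in the snippet above.

```latex
% Draft-mode sketch (assumption: a plain article with placeholder content).
% Ask tagpdf to skip writing the structure tree at the very end of the run.
\AddToHook{enddocument/end}{\tagpdfsetup{activate/tree=false}}
% Tagging itself stays enabled through the testphase key.
\DocumentMetadata{testphase={phase-III}}
\documentclass{article}

\begin{document}
\section{Placeholder section}
Some placeholder text to be tagged.
\end{document}
```

For the final PDF, the \AddToHook line would presumably be removed or commented out, so that the structure tree is written again.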

hpvd commented 1 month ago

Thanks for raising this performance topic. It is imho really important for "usability", and with this also for the acceptance and adoption rate of tagging. Maybe it is reasonable to think about coupling it to something like the draft mode, or even about extending / structuring the possibilities of the draft mode (or a new mode?) so that there is one customizable switch (draft setup?) to select between a "fast" compile and a "final" compile, which loads the appropriate config (not only for tagging) ...

DavidEGx commented 1 month ago

BTW, is there some sort of approximate date for the release of tags?

u-fischer commented 1 month ago

> BTW, is there some sort of approximate date for the release of tags?

Sorry I don't understand the question.

DavidEGx commented 1 month ago

> > BTW, is there some sort of approximate date for the release of tags?

> Sorry I don't understand the question.

Since we have to use the key "testphase", I assume tags are in a beta-ish state. So the question is when it will no longer be beta.

FrankMittelbach commented 1 month ago

The project is planned as a multi-year one (approx. 5 years if there is sufficient financial support, otherwise possibly longer). There is a lot of documentation around this at https://www.latex-project.org/publications/indexbytopic/pdf/ , including why it needs that amount of time or more in the schedule. But because of the uncertainties it is not possible to give a reliable final date.

Regardless of that we are now in a position where it can already be actively used even though it is still evolving and needs a lot more work.