sile-typesetter / sile

The SILE Typesetter — Simon’s Improved Layout Engine
https://sile-typesetter.org
MIT License

SILE can be very slow #177

Closed khaledhosny closed 8 years ago

khaledhosny commented 9 years ago

I don’t know how much of this slowness is expected, but take the following document (using https://github.com/khaledhosny/aref-ruqaa/blob/master/tests/wb.txt):

\begin[papersize=a4]{document}
\script[src=packages/verbatim]
\thisframedirection[direction=RTL]
\font[filename=../arefruqaa-regular.ttf,script=Arabic]
\obeylines
\include[src=wb.txt]
\end{document}

It takes ~25 minutes on my machine, while a comparable XeTeX document takes only ~8 seconds on the same machine:

\input bidi
\setRTL
\font\test="[../arefruqaa-regular.ttf]:script=arab"\test
\obeylines
\input wb.txt
\bye

simoncozens commented 9 years ago

Okay. Part of this is that SILE is written in Lua, an interpreted language: running it under LuaJIT makes it considerably faster. I'm working on some profiling now.

simoncozens commented 9 years ago

I think what's particularly hammering your case is the O(n²) behaviour of the page builder. The idea is that unlike TeX the page builder is stateless, and so every time it is called, it runs through all the possible break points - including, obviously, those which have been rejected in the past. I will try to add an option to restart the page builder from a known point.
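
To make the quadratic cost concrete, here is a toy model in Python (not SILE's actual Lua code). It assumes the page builder is invoked once per newly queued item and only counts how many candidate break points each strategy examines; the function names and the `resume_at` marker are hypothetical.

```python
# Toy model of the behaviour described above, not SILE's implementation.
# Assumption: the page builder is called once for every item added to the
# output queue, and we only count how many break points it looks at.

def stateless_visits(n_items):
    """Each call rescans the whole queue, including rejected break points."""
    visits = 0
    for queue_len in range(1, n_items + 1):
        visits += queue_len  # look at every break point seen so far
    return visits


def restartable_visits(n_items):
    """Each call resumes from the point where the previous call stopped."""
    visits = 0
    resume_at = 0  # hypothetical "known point" remembered between calls
    for queue_len in range(1, n_items + 1):
        visits += queue_len - resume_at  # only the newly added break points
        resume_at = queue_len
    return visits


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(n, stateless_visits(n), restartable_visits(n))
```

In this model the stateless builder does n(n+1)/2 visits while the restartable one does n, which is the gap a restart option is meant to close.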

simoncozens commented 9 years ago

OK, a subset of your example (100,000 lines) previously took 29s on my machine; with the page builder restart fix it takes 19s, and when the HarfBuzz OT functions patch lands it comes down to 14s. If you do all that under LuaJIT, it comes down to 8s, so a 72% reduction in run time overall (about 3.6× faster). That should take your 25 minutes down to about 7 minutes. Not as competitive as XeTeX, of course, but you have to pay a price for an interpreted language. And typesetting isn't normally a time-critical operation anyway... so I think that's pretty good for now.
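
The arithmetic behind those figures can be checked with a few lines of Python. The timings are the ones quoted above; the 25-minute projection assumes the full document speeds up by the same proportion as the 100,000-line subset, which the thread does not verify.

```python
# Timings quoted in the comment above, in seconds, for the 100,000-line subset.
timings = [
    ("baseline", 29.0),
    ("page builder restart fix", 19.0),
    ("plus HarfBuzz OT functions patch", 14.0),
    ("all of the above under LuaJIT", 8.0),
]

baseline = timings[0][1]
final = timings[-1][1]

# Fraction of the original run time that was removed.
total_reduction = (baseline - final) / baseline
# How many times faster the final configuration is.
speedup_factor = baseline / final

# Assumption: the full 25-minute run scales by the same ratio as the subset.
projected_minutes = 25.0 * final / baseline

if __name__ == "__main__":
    for label, seconds in timings:
        print(f"{label}: {seconds:.0f}s, {100 * (baseline - seconds) / baseline:.0f}% less than baseline")
    print(f"total reduction: {100 * total_reduction:.0f}%")
    print(f"speedup factor: {speedup_factor:.2f}x")
    print(f"projected full run: {projected_minutes:.1f} minutes")
```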

khaledhosny commented 9 years ago

I get a crash with LuaJIT. Also, SILE seems to be using only one of the 8 cores I have; I'm not sure whether Lua supports multi-threading or how much SILE could make use of it.

simoncozens commented 8 years ago

Today, while trying to work on something else (cross-space kerning), I discovered that you can get a massive speedup by shipping whole paragraphs at a time to HarfBuzz instead of doing the tokenizing in Lua. (Play with the package packages/harfbuzz-only-shaper.lua.)
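
The difference between the two strategies can be sketched in Python with a stand-in shaper. The `CountingShaper` class is hypothetical and does no real shaping; it only counts calls. This shows that per-token shaping crosses into the shaper once per token, while paragraph-level shaping crosses once per paragraph. How much of the observed speedup comes from that is not stated in the thread.

```python
# Stand-in sketch: this is NOT HarfBuzz and NOT SILE's shaper interface.
# CountingShaper is a hypothetical stub that records how often it is called.

class CountingShaper:
    def __init__(self):
        self.calls = 0

    def shape(self, text):
        """Pretend to shape text: one placeholder glyph per non-space character."""
        self.calls += 1
        return [ch for ch in text if not ch.isspace()]


def shape_per_token(shaper, paragraph):
    """Tokenize in the host language, then call the shaper once per token."""
    glyphs = []
    for token in paragraph.split():
        glyphs.extend(shaper.shape(token))
    return glyphs


def shape_whole_paragraph(shaper, paragraph):
    """Hand the entire paragraph to the shaper in a single call."""
    return shaper.shape(paragraph)


if __name__ == "__main__":
    text = "one two three four five six"
    per_token, whole = CountingShaper(), CountingShaper()
    shape_per_token(per_token, text)
    shape_whole_paragraph(whole, text)
    print("per-token calls:", per_token.calls)
    print("whole-paragraph calls:", whole.calls)
```

A real shaper also sees the spaces and cross-word context when given the whole paragraph, which is what features such as cross-space kerning depend on; the stub ignores that.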

The code is currently horrific and it reopens the bidi pipelines box of worms in #173, but it is something we want to do anyway, for issues #173 and #138, so we should try and work on this.

simoncozens commented 8 years ago

$ time ./sile examples/test.sil
This is SILE 0.9.3
...
./sile examples/test.sil  15.56s user 0.25s system 98% cpu 16.053 total

$ git checkout master ; make
$ time ./sile examples/test.sil
This is SILE 0.9.4-unreleased
...
./sile examples/test.sil  5.86s user 0.17s system 99% cpu 6.092 total

Implementing full-paragraph shaping with HarfBuzz really helped here: the same test went from about 16s to about 6s.