Closed by khaledhosny 8 years ago
Okay. Part of this is the fact that it's an interpreted language: running it in luajit
makes it considerably faster. I'm working on some profiling now.
I think what's particularly hammering your case is the O(n²) behaviour of the page builder. The idea is that unlike TeX the page builder is stateless, and so every time it is called, it runs through all the possible break points - including, obviously, those which have been rejected in the past. I will try to add an option to restart the page builder from a known point.
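To make the cost difference concrete, here is a toy model of the two strategies. It is only a sketch and not SILE's actual page builder: the function names, the greedy "break when the page overflows" rule, and the line heights are all assumptions made for illustration. Both functions are called once per new line and find the same breaks; they differ only in how many break candidates they examine.

```python
def paginate_stateless(heights, target):
    """Toy stateless builder: every call rescans the current page from its start."""
    breaks, steps, page_start = [], 0, 0
    for available in range(1, len(heights) + 1):  # one call per newly added line
        total = 0.0
        for i in range(page_start, available):
            steps += 1  # one break candidate examined
            total += heights[i]
            if total > target and i > page_start:
                breaks.append(i)  # line i starts the next page
                page_start = i
                break
    return breaks, steps


def paginate_restart(heights, target):
    """Toy restartable builder: remembers where it stopped and the running total."""
    breaks, steps, page_start, total = [], 0, 0, 0.0
    for i, h in enumerate(heights):  # one call per newly added line
        steps += 1  # only the new candidate is examined
        total += h
        if total > target and i > page_start:
            breaks.append(i)  # line i starts the next page
            page_start, total = i, h
    return breaks, steps


if __name__ == "__main__":
    heights = [1.0] * 1000  # hypothetical line heights
    for builder in (paginate_stateless, paginate_restart):
        breaks, steps = builder(heights, target=50.0)
        print(builder.__name__, len(breaks), "breaks,", steps, "candidates examined")
```

In this model the stateless version re-examines candidates it has already rejected, so its work grows quadratically with the amount of material waiting on the current page, while the restartable version examines each line once.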
OK, a subset of your example (100,000 lines) was 29s on my machine previously; now with the page builder restart fix, it is 19s; and when the Harfbuzz OT functions patch lands, it comes down to 14s. And if you do all that in luajit, it comes down to 8s - so about a 72% reduction in run time in total. That should take your 25 minutes down to about 7 minutes. Not as competitive as xetex, of course, but you have to pay a price for an interpreted language. And typesetting isn't normally a time-critical operation anyway... so I think that's pretty good for now.
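For anyone checking the arithmetic, the figures quoted above work out as follows. The helper names are made up for this sketch, and the projection assumes the full document scales the same way as the 100,000-line subset, which is only an estimate.

```python
def reduction(before, after):
    """Fraction of run time saved going from `before` to `after` seconds."""
    return 1 - after / before


def projected_minutes(full_minutes, before, after):
    """Scale the full-document time by the ratio measured on the subset."""
    return full_minutes * after / before


if __name__ == "__main__":
    # Timings in seconds, as quoted in the comment above.
    stages = [
        ("previously", 29.0),
        ("page builder restart fix", 19.0),
        ("Harfbuzz OT functions patch", 14.0),
        ("all of that in luajit", 8.0),
    ]
    baseline = stages[0][1]
    for name, seconds in stages[1:]:
        print(f"{name}: {seconds:.0f}s, {reduction(baseline, seconds):.0%} less than before")
    print(f"projected full run: {projected_minutes(25, baseline, 8.0):.1f} minutes")
```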
I get a crash with luajit. Also it seems that SILE is using only one core of the 8 I have, not sure if Lua supports multi-threading or how much SILE can make use of it.
Today, while trying to work on something else (cross-space kerning), I discovered that you can get a massive speedup by shipping whole paragraphs at a time to Harfbuzz instead of doing the tokenizing in Lua. (Play with the package packages/harfbuzz-only-shaper.lua.)
The code is currently horrific and it reopens the bidi pipelines box of worms in #173, but it is something we want to do anyway, for issues #173 and #138, so we should try and work on this.
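The difference between the two approaches can be pictured with a stand-in shaper. This is a rough sketch only: `shape_stub` is a hypothetical placeholder that returns fake glyph IDs, not the Harfbuzz API or SILE's shaper, and splitting on spaces is a simplification of real tokenizing. The point is just how many trips to the shaper each strategy makes, and that only the whole-paragraph call lets the shaper see the spaces between words.

```python
def shape_stub(text, log):
    """Hypothetical stand-in for a shaper call; records what it was given."""
    log.append(text)
    return [ord(ch) for ch in text]  # fake "glyph IDs", one per character


def shape_per_token(paragraph):
    """Tokenize in the host language and shape each token separately."""
    log, glyphs = [], []
    for token in paragraph.split(" "):
        glyphs.append(shape_stub(token, log))
    return glyphs, len(log)


def shape_whole(paragraph):
    """Hand the whole paragraph to the shaper in a single call."""
    log = []
    glyphs = shape_stub(paragraph, log)
    return glyphs, len(log)


if __name__ == "__main__":
    paragraph = "one call per word adds up quickly"
    print("per-token shaper calls:", shape_per_token(paragraph)[1])
    print("whole-paragraph shaper calls:", shape_whole(paragraph)[1])
```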
$ time ./sile examples/test.sil
This is SILE 0.9.3
...
./sile examples/test.sil 15.56s user 0.25s system 98% cpu 16.053 total
$ git checkout master ; make
$ time ./sile examples/test.sil
This is SILE 0.9.4-unreleased
...
./sile examples/test.sil 5.86s user 0.17s system 99% cpu 6.092 total
Implementing the harfbuzz full-paragraph shaping really helped this.
I don’t know how much of this slowness is expected, but take the following document (using https://github.com/khaledhosny/aref-ruqaa/blob/master/tests/wb.txt):
It takes ~25 minutes on my machine, while a comparable XeTeX document takes only ~8 seconds on the same machine: