Open jgm opened 5 years ago
None of this achieved any speed improvement over the current version using `[Tok]`; indeed, in every case performance was worse.
Profiling reveals that block structure parsing is fast. Most of the time is taken up by `tokenize` and `restOfLine` (31%), and by inline parsing.
make prof
Current results (March 12 2020):
1.8 parseChunks
2.1 pDelimChunk
2.2 Commonmark.Blocks.runInlineParser
2.5 blockContinues
2.6 Commonmark.Inlines.processBs
2.9 MAIN
3.9 block_starts
6.6 renderHtml
9.0 pSymbol
11.9 defaultInlineParser
17.5 Commonmark.Tokens.tokenize
32.6 restOfLine
For a 1.4MB file:
Benchmarks for different extensions:
extension | mean |
---|---|
-xautolinks | 310.8 ms (309.3 ms .. 311.3 ms) |
-xpipe_tables | 295.2 ms (293.2 ms .. 296.6 ms) |
-xstrikethrough | 267.9 ms (265.6 ms .. 269.1 ms) |
-xsuperscript | 267.8 ms (264.9 ms .. 269.5 ms) |
-xsubscript | 266.8 ms (263.6 ms .. 267.9 ms) |
-xsmart | 293.0 ms (292.0 ms .. 294.3 ms) |
-xmath | 287.4 ms (285.4 ms .. 290.7 ms) |
-xemoji | 281.6 ms (280.3 ms .. 282.8 ms) |
-xfootnotes | 291.3 ms (286.1 ms .. 293.3 ms) |
-xdefinition_lists | 272.6 ms (271.0 ms .. 275.4 ms) |
-xfancy_lists | 271.2 ms (269.3 ms .. 273.8 ms) |
-xattributes | 284.2 ms (283.4 ms .. 285.7 ms) |
-xraw_attribute | 280.7 ms (279.6 ms .. 281.6 ms) |
-xbracketed_spans | 268.5 ms (267.0 ms .. 269.4 ms) |
-xfenced_divs | 269.6 ms (267.5 ms .. 271.6 ms) |
-xauto_identifiers | 274.9 ms (273.0 ms .. 277.8 ms) |
-ximplicit_heading_references | 269.8 ms (268.2 ms .. 272.8 ms) |
-xall | 520.4 ms (515.5 ms .. 523.6 ms) |
One idea to explore: use `ShortText` from the `text-short` package instead of `Text` in `Tok`. The public API could still use `Text`.
This should reduce the memory used by the tokens.
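
A rough sketch of what that could look like, not a patch against the actual code. `ShortTok`, its fields, `mkShortTok` and `untokenize` here are hypothetical names for illustration; the real `Tok` also carries a token type and a source position, which are replaced by plain `Int`s below. Only `fromText` and `toText` from `Data.Text.Short` are real library calls.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Short (ShortText)
import qualified Data.Text.Short as TS

-- Hypothetical, simplified token type (an assumption for illustration,
-- not the library's Tok). The contents are stored as ShortText.
data ShortTok = ShortTok
  { stLine     :: !Int        -- hypothetical stand-in for a source line
  , stColumn   :: !Int        -- hypothetical stand-in for a source column
  , stContents :: !ShortText  -- compact token contents
  } deriving (Show, Eq)

-- Internal side: copy a Text slice into a ShortText when building a token.
mkShortTok :: Int -> Int -> Text -> ShortTok
mkShortTok ln col t = ShortTok ln col (TS.fromText t)

-- Public API side: convert back to Text, so callers never see ShortText.
untokenize :: [ShortTok] -> Text
untokenize = T.concat . map (TS.toText . stContents)

main :: IO ()
main = do
  let toks = [ mkShortTok 1 1 "hello"
             , mkShortTok 1 6 " "
             , mkShortTok 1 7 "world"
             ]
  print (untokenize toks)
```

One tradeoff to measure: `ShortText` does not support zero-copy slicing, so each token's contents are copied rather than shared with the input buffer. Whether the smaller per-token overhead outweighs the extra copying would have to be benchmarked.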
See notes on performance in the README.md.