mity / md4c

C Markdown parser. Fast. SAX-like interface. Compliant with the CommonMark specification.
MIT License

[discussion] evaluate xxhash #93

Closed · vtorri closed this issue 4 years ago

vtorri commented 4 years ago

Hello,

Yann Collet (author of LZ4 and contributor to zstd) has written a very fast non-cryptographic hash function:

https://github.com/Cyan4973/xxHash

I don't know if it would be useful for MD4C, but maybe it is worth evaluating.

Regards

mity commented 4 years ago

Thanks for the pointer.

However, consider that the hashing in MD4C is currently used only for managing the link reference definitions. I believe it's not performance-critical at all, especially not for a normal input.

Furthermore, we have to be able to match the labels in a case-agnostic way (and consider Unicode). We currently compute the case-folding info on the fly as we compute the hash, which is reasonably possible with something as simple as FNV. Changing this to converting the whole label first and then hashing it as a single block is theoretically possible, but it would likely require malloc(), it could be less friendly to the CPU cache if the link reference definition label is long, and so it might kill any performance gain of a faster hash algorithm altogether. A sketch of the on-the-fly approach follows.
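
A minimal sketch of that scheme, assuming 32-bit FNV-1a and using plain ASCII tolower() where MD4C actually needs Unicode case folding; the name hash_label_folded is hypothetical, not MD4C's real API:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Fold each byte to lower case while hashing, so the label never has
 * to be copied into a separate normalized buffer first.
 * (Illustration only: real Unicode case folding can map one code
 * point to several, which ASCII tolower() glosses over.) */
static uint32_t
hash_label_folded(const char* label, size_t size)
{
    uint32_t hash = 2166136261u;                /* FNV-1a offset basis */

    for(size_t i = 0; i < size; i++) {
        hash ^= (unsigned char) tolower((unsigned char) label[i]);
        hash *= 16777619u;                      /* FNV-1a prime */
    }
    return hash;
}
```

With something this simple, the case folding and the hash share a single pass over the label; a block-oriented hash like xxHash would want the folded label materialized first, which is exactly the malloc() and cache cost described above.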

As a side note, I am currently aware of the following bottlenecks (when measured with the md2html utility on a "normal" input):

  1. The few lines in md_analyze_line() which together iterate over more or less the whole document contents during the block-parsing pass.

  2. The few lines in md_collect_marks() which iterate over an ordinary text snippet within a line (one which certainly does not contain any meaningful inline Markdown syntax construct). This happens during the inline-parsing pass.

  3. render_html_escaped(), which is responsible for most of the output (almost everything except the HTML tags themselves); see the sketch below.

AFAIK, depending on the platform and the compiler's optimization capabilities, each of these is responsible for something between 15% and 30% of MD4C's runtime. Further optimizing any of them could have a real impact.
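
For context on item 3, here is a hypothetical sketch (not MD4C's actual code) of the kind of loop such an escaping renderer runs: scan for the next byte that needs an HTML entity, flush the preceding clean run in one write, emit the entity, and repeat. The name render_escaped and the use of stdio output are assumptions for illustration:

```c
#include <stdio.h>

/* Write `text` to `out`, replacing the characters that are special in
 * HTML with entities.  Clean runs between special characters are
 * flushed with a single fwrite() rather than byte by byte. */
static void
render_escaped(FILE* out, const char* text, size_t size)
{
    size_t beg = 0;

    for(size_t i = 0; i < size; i++) {
        const char* entity;

        switch(text[i]) {
            case '&':   entity = "&amp;";  break;
            case '<':   entity = "&lt;";   break;
            case '>':   entity = "&gt;";   break;
            case '"':   entity = "&quot;"; break;
            default:    continue;
        }

        fwrite(text + beg, 1, i - beg, out);    /* flush the clean run */
        fputs(entity, out);
        beg = i + 1;
    }

    fwrite(text + beg, 1, size - beg, out);     /* trailing clean run */
}
```

Since nearly every output byte passes through a loop like this, even small per-byte costs add up, which is why it shows up as a hot spot.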

Micro-optimizing anything else IMHO does not make much sense, especially if it would cost hours of work or require more complicated code.