kzykhys closed this issue 10 years ago
If I understand the source correctly, every extension runs its own regular expressions either on the full text or on every line.
If performance is one of your goals, I think you should consider not spending too much time improving the current architecture, but rather try to build a system that first tokenizes the input and then transforms the tokens into HTML.
An entire tokenizer could be written with far fewer regular expressions, and each individual extension could add its own regular expression to one big regex. There may be some exceptions to that rule if there's any non-regular syntax in Markdown, but I'd imagine you'd get quite far with just that...
Just a thought anyway...
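To make the "one big regex" idea concrete, here is a minimal sketch of how it could look. It is written in Python rather than PHP purely for illustration, and everything in it (the rule names, `build_tokenizer`, `tokenize`, `render`, and the tiny inline-only subset of Markdown) is hypothetical and not taken from Ciconia's source. Each extension contributes one named pattern, the patterns are joined into a single alternation, and the text is scanned once.

```python
import html
import re

# Hypothetical inline rules; each "extension" contributes one named pattern.
# Order matters: when two alternatives could match at the same position,
# the one listed first wins.
CORE_RULES = [
    ("CODE", r"`[^`\n]+`"),
    ("STRONG", r"\*\*[^*\n]+\*\*"),
    ("EM", r"\*[^*\n]+\*"),
]

# Hypothetical renderers: token kind -> function producing HTML.
CORE_RENDERERS = {
    "CODE": lambda v: "<code>" + html.escape(v[1:-1]) + "</code>",
    "STRONG": lambda v: "<strong>" + html.escape(v[2:-2]) + "</strong>",
    "EM": lambda v: "<em>" + html.escape(v[1:-1]) + "</em>",
}


def build_tokenizer(rules):
    """Combine all rules into a single regex using named groups."""
    return re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in rules))


def tokenize(text, master):
    """Scan the text once; unmatched runs are emitted as TEXT tokens."""
    tokens = []
    pos = 0
    for m in master.finditer(text):
        if m.start() > pos:
            tokens.append(("TEXT", text[pos:m.start()]))
        # lastgroup is the name of the alternative that matched,
        # because the patterns contain no nested capturing groups.
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    if pos < len(text):
        tokens.append(("TEXT", text[pos:]))
    return tokens


def render(tokens, renderers):
    """Transform the token stream into HTML; unknown kinds are escaped."""
    return "".join(renderers.get(kind, html.escape)(value) for kind, value in tokens)


if __name__ == "__main__":
    master = build_tokenizer(CORE_RULES)
    print(render(tokenize("a **b** and *c* `d<e`", master), CORE_RENDERERS))
```

An extension would then only append a `(name, pattern)` pair and a renderer before the combined regex is built, instead of running its own pass over the text. Block-level or nested constructs would probably need more than this single flat pass.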
I agree. The current architecture depends on too many regular expressions, which might actually slow things down. The same goes for call_user_func.
I tried to implement Ciconia using the lexer and parser pattern before, but failed...
$ php bin/markbench benchmark --profile=github-sample
Runtime: PHP5.5.3
Host: Linux vm1 3.8.0-31-generic #46-Ubuntu SMP Tue Sep 10 20:03:44 UTC 2013 x86_64
Profile: Sample content from Github (http://github.github.com/github-flavored-markdown/sample_content.html) / 1000 times
Class: Markbench\Profile\GithubSampleProfile
+----------------------+------------+------------+---------------+---------+--------------+
| package | version | dialect | duration (MS) | MEM (B) | PEAK MEM (B) |
+----------------------+------------+------------+---------------+---------+--------------+
| erusev/parsedown | 0.4.7 | | 12095 | 6291456 | 6553600 |
| michelf/php-markdown | 1.3 | | 38704 | 6815744 | 7077888 |
| michelf/php-markdown | 1.3 | extra | 51304 | 6815744 | 7340032 |
| kzykhys/ciconia | dev-master | | 64837 | 7340032 | 7602176 |
| kzykhys/ciconia | dev-master | gfm | 68255 | 7340032 | 7602176 |
+----------------------+------------+------------+---------------+---------+--------------+