locopablo / markdownsharp

Automatically exported from code.google.com/p/markdownsharp

Outdent Performance #24

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Outdent is the only caller of the static _outDent regex (who would have 
guessed?) and replaces the whole match. _outDent, however, contains a 
capturing group (which previously held "\t|[ ]{1,_tabWidth}" and now consists 
only of "[ ]{1,_tabWidth}"). Since nothing reads the captured value, simply 
making this group non-capturing (or removing the grouping entirely, since it 
no longer serves a purpose) improves performance a bit.
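
For illustration, here is a minimal Python sketch of the three pattern variants described above. It is not the MarkdownSharp C# source: the line-start anchor, the multiline flag, and a tab width of 4 are assumptions, since the issue only quotes the inside of the group.

```python
import re

TAB_WIDTH = 4  # assumed value; the issue only refers to _tabWidth

# Assumed pattern shape: match up to TAB_WIDTH spaces at the start of each line.
# The issue text only shows the group body "[ ]{1,_tabWidth}".
OUTDENT_CAPTURING = re.compile(r"^([ ]{1," + str(TAB_WIDTH) + r"})", re.MULTILINE)
OUTDENT_NON_CAPTURING = re.compile(r"^(?:[ ]{1," + str(TAB_WIDTH) + r"})", re.MULTILINE)
OUTDENT_UNGROUPED = re.compile(r"^[ ]{1," + str(TAB_WIDTH) + r"}", re.MULTILINE)


def outdent(block, pattern=OUTDENT_UNGROUPED):
    """Remove one level of space indentation from every line of block."""
    # The whole match is replaced, so the captured group is never read.
    return pattern.sub("", block)


if __name__ == "__main__":
    sample = "    code\n      deeper\n  two\nnone"
    for p in (OUTDENT_CAPTURING, OUTDENT_NON_CAPTURING, OUTDENT_UNGROUPED):
        print(repr(outdent(sample, p)))
```

All three variants produce the same output; they differ only in whether the engine has to record a group that is never used.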

Another slight performance gain can be achieved by implementing the 
function without regular expressions at all (see the attached file for my 
best attempt).
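
The attachment itself is not part of this thread, so the following is only a rough Python sketch of what a regex-free outdent could look like, not the attached implementation. The function name outdent_no_regex and the default tab width of 4 are hypothetical.

```python
def outdent_no_regex(block, tab_width=4):
    """Strip up to tab_width leading spaces from each line, without a regex.

    Hypothetical stand-in for the attached implementation; tab_width=4 is
    an assumed default.
    """
    out = []
    for line in block.split("\n"):
        # Count leading spaces, but never more than tab_width of them.
        n = 0
        while n < tab_width and n < len(line) and line[n] == " ":
            n += 1
        out.append(line[n:])
    return "\n".join(out)


if __name__ == "__main__":
    print(repr(outdent_no_regex("    code\n      deeper\n  two\nnone")))
```

Like the current pattern, this sketch only strips spaces; a leading tab is left untouched.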

Although my fastest solution might be overkill for such a negligible 
performance gain, I think the regex should be changed in any case, since it 
hurts me to see performance left on the table :)

Perf Original:
input string length: 475
8000 iterations in 5222 ms (0,65275 ms per iteration)
input string length: 2356
2000 iterations in 5003 ms (2,5015 ms per iteration)
input string length: 27737
180 iterations in 4944 ms (27,4666666666667 ms per iteration)
input string length: 11075
375 iterations in 4852 ms (12,9386666666667 ms per iteration)
input string length: 88607
45 iterations in 4693 ms (104,288888888889 ms per iteration)
input string length: 354431
12 iterations in 5125 ms (427,083333333333 ms per iteration)

Perf Non-Capturing Group:
input string length: 475
8000 iterations in 5195 ms (0,649375 ms per iteration)
input string length: 2356
2000 iterations in 4894 ms (2,447 ms per iteration)
input string length: 27737
180 iterations in 4843 ms (26,9055555555556 ms per iteration)
input string length: 11075
375 iterations in 4727 ms (12,6053333333333 ms per iteration)
input string length: 88607
45 iterations in 4635 ms (103 ms per iteration)
input string length: 354431
12 iterations in 4881 ms (406,75 ms per iteration)

Perf w/o regex:
input string length: 475
8000 iterations in 5146 ms (0,64325 ms per iteration)
input string length: 2356
2000 iterations in 4860 ms (2,43 ms per iteration)
input string length: 27737
180 iterations in 4806 ms (26,7 ms per iteration)
input string length: 11075
375 iterations in 4651 ms (12,4026666666667 ms per iteration)
input string length: 88607
45 iterations in 4565 ms (101,444444444444 ms per iteration)
input string length: 354431
12 iterations in 4831 ms (402,583333333333 ms per iteration)
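
To put the three runs side by side, here is a small Python sketch that computes the relative gain over the original from the totals listed above. The iteration counts and millisecond totals are copied from the logs; the helper names are made up for this example.

```python
# (input length, iterations, total ms: original, non-capturing, no regex)
RUNS = [
    (475, 8000, 5222, 5195, 5146),
    (2356, 2000, 5003, 4894, 4860),
    (27737, 180, 4944, 4843, 4806),
    (11075, 375, 4852, 4727, 4651),
    (88607, 45, 4693, 4635, 4565),
    (354431, 12, 5125, 4881, 4831),
]


def gain_percent(baseline_ms, variant_ms):
    """Relative time saved versus the baseline, in percent."""
    return 100.0 * (baseline_ms - variant_ms) / baseline_ms


def all_gains():
    """Return (length, non-capturing gain, no-regex gain) for each run."""
    # Iteration counts match across variants, so totals can be compared directly.
    return [
        (length, gain_percent(orig, noncap), gain_percent(orig, noregex))
        for length, _iters, orig, noncap, noregex in RUNS
    ]


if __name__ == "__main__":
    for length, noncap, noregex in all_gains():
        print(f"{length:>7}: non-capturing {noncap:.2f}%, no regex {noregex:.2f}%")
```

Every gain comes out below 6%, with the largest difference on the longest input.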

Original issue reported on code.google.com by Shio...@gmail.com on 13 Jan 2010 at 1:17

GoogleCodeExporter commented 9 years ago
good catch -- I think this operation is rare enough that we don't need to 
hand-roll a substitute for the regex.

checked in r105

Original comment by wump...@gmail.com on 13 Jan 2010 at 11:03