andgineer / TRegExpr

Regular expressions (regex), Pascal.
https://regex.sorokin.engineer/en/latest/
MIT License

speed up / Use assert to check for corrupt mem #342

Closed. User4martin closed this 10 months ago

User4martin commented 10 months ago

I wonder whether the code inside the IFDEF should call Error, or whether the IFDEF should be replaced by an assert.
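For readers weighing the two options, here is a minimal side-by-side sketch. It is not the code from this PR: `OP_MAX`, `CheckOpcodeIfdef` and `CheckOpcodeAssert` are hypothetical names, and a plain `Exception` stands in for TRegExpr's `Error` call. Only the define name `WITH_REGEX_ASSERT` is taken from the description.

```pascal
program AssertVsIfdef;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}
{$C+}                         // assertions on; with $C- the Assert call compiles to nothing
{$DEFINE WITH_REGEX_ASSERT}   // define name from the PR, set unconditionally for this sketch

uses
  SysUtils;

const
  OP_MAX = 10; // hypothetical highest valid opcode

// Variant A: explicit check, compiled in only when the define is set.
// The raise is a stand-in for the library's own Error routine.
procedure CheckOpcodeIfdef(AOp: Integer);
begin
  {$IFDEF WITH_REGEX_ASSERT}
  if (AOp < 0) or (AOp > OP_MAX) then
    raise Exception.CreateFmt('corrupt program: bad opcode %d', [AOp]);
  {$ENDIF}
end;

// Variant B: Assert, which the compiler drops when assertions are off.
procedure CheckOpcodeAssert(AOp: Integer);
begin
  Assert((AOp >= 0) and (AOp <= OP_MAX), 'corrupt program: bad opcode');
end;

begin
  // Valid opcode: neither variant complains.
  CheckOpcodeIfdef(3);
  CheckOpcodeAssert(3);

  // Invalid opcode: each variant reports it in its own way.
  try
    CheckOpcodeIfdef(99);
  except
    on E: Exception do
      WriteLn('IFDEF variant raised: ', E.Message);
  end;

  try
    CheckOpcodeAssert(99);
  except
    on E: EAssertionFailed do
      WriteLn('Assert variant raised: ', E.ClassName);
  end;
end.
```

Either way the check costs nothing in a release build. The difference is who controls it: the IFDEF variant follows a project-level define and can report through the library's own error codes, while the Assert variant follows the compiler's assertion switch and reports through `EAssertionFailed`.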

I skipped one instance of reeBadOpcodeInCharClass: it sits at the end of an if/else waterfall, so it adds no extra time.

I disabled WITH_REGEX_ASSERT in Delphi, as I don't know the Delphi alternative to "IFOPT".
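One possible way to tie the define to the compiler's assertion switch is sketched below. How the PR actually sets `WITH_REGEX_ASSERT` is not shown here, so the `{$IFOPT C+}` wiring is an assumption. Free Pascal accepts it, and Delphi's documentation lists `$IFOPT` as well, but that should be verified on the Delphi versions the library targets.

```pascal
program IfOptSketch;
{$IFDEF FPC}{$mode objfpc}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}
{$C+}  // assertions switched on for this sketch; try {$C-} to see the other branch

// Define the symbol only while the assertions switch (C) is active.
{$IFOPT C+}
  {$DEFINE WITH_REGEX_ASSERT}
{$ENDIF}

begin
  {$IFDEF WITH_REGEX_ASSERT}
  WriteLn('WITH_REGEX_ASSERT is defined');
  {$ELSE}
  WriteLn('WITH_REGEX_ASSERT is not defined');
  {$ENDIF}
end.
```

With this wiring the extra checks would follow the project's normal debug/release assertion setting, with no separate define to maintain.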

Benchmarks

Running 8 repeats for each benchmark. Use -c <n> to change

                                                                  Before   | After     | Match count
==============================================================================
regexpr.pas:
                                                  /Twain/ :          29 ms |     29 ms |       2388
                                              /(?i)Twain/ :          54 ms |     54 ms |       2657
                                             /[a-z]shing/ :         230 ms |    230 ms |       1877
                             /Huck[a-zA-Z]+|Saw[a-zA-Z]+/ :          33 ms |     31 ms |        396
                                                      /./ :         445 ms |    419 ms |   18905427
                                                    /(.)/ :         668 ms |    654 ms |   18905427
                                                      /e/ :          82 ms |     82 ms |    1781425
                                                    /(e)/ :         103 ms |    101 ms |    1781425
                                           /(?s).{1,45} / :          37 ms |     35 ms |     475715
                                         /(?s)\G.{1,45} / :         164 ms |    154 ms |      10616
                                         /\G(?s).{1,45} / :          17 ms |     15 ms |      10616
                                          /(?s).{1,45}? / :         179 ms |    154 ms |    3241534
                                        /(?s)\G.{1,45}? / :         168 ms |    154 ms |      69431
                                        /\G(?s).{1,45}? / :          19 ms |     17 ms |      69431
                                              /\b\w+nn\b/ :         244 ms |    228 ms |        359
                                       /[a-q][^u-z]{13}x/ :         654 ms |    609 ms |       4929
                            /Tom|Sawyer|Huckleberry|Finn/ :          35 ms |     37 ms |       3015
                        /(?i)Tom|Sawyer|Huckleberry|Finn/ :         193 ms |    185 ms |       4820
                    /.{0,2}(Tom|Sawyer|Huckleberry|Finn)/ :        2591 ms |   2417 ms |       3015
                    /.{2,4}(Tom|Sawyer|Huckleberry|Finn)/ :        2781 ms |   2666 ms |       2220
                      /Tom.{10,25}river|river.{10,25}Tom/ :          54 ms |     54 ms |          2
                                           /[a-zA-Z]+ing/ :         615 ms |    585 ms |      95863
                                  /\s[a-zA-Z]{0,12}ing\s/ :         269 ms |    271 ms |      67810
                          /([A-Za-z]awyer|[A-Za-z]inn)\s/ :         634 ms |    640 ms |        313
                              /["'][^"']{0,30}[?!\.]["']/ :          76 ms |     68 ms |       9857
                                /Tom(.{3,3}|.{5,5})*Finn/ :        3121 ms |   2988 ms |         11
                                    /Tom(...|.....)*Finn/ :        1638 ms |   1490 ms |         11
                                   /Tom(...|.....)*?Finn/ :        1568 ms |   1492 ms |         11
                      /Tom((...|.....){2,9}\s){1,5}?Finn/ :        4193 ms |   3953 ms |         11
                      /Tom((...|.....){2,9}?\s){1,5}Finn/ :        4212 ms |   3939 ms |         11
                                        /\G(?is).(?=.*$)/ :         968 ms |    902 ms |   20045118
                                /\G(?is).(?=(.){1,5}?$)?/ :        3580 ms |   3367 ms |   20045118
                                       /\G(?is).(?=.*+$)/ :         945 ms |    888 ms |   20045118
                /\G(?is).{10,10}(?=(e|y|on|fin|.){0,20})/ :        2613 ms |   2000 ms |    2004511
              /\G(?is).{10,10}(?=(?>e|y|on|fin|.){0,20})/ :        2609 ms |   2009 ms |    2004511
                                          /Tom(?!.*Finn)/ :          31 ms |     33 ms |       2441
     /(?i)(?>[aeoui][bcdfghjklmnpqrstvwxyz \r\n]*){1,40}/ :         626 ms |    656 ms |     579843
                          /(?i)[ bdlm][abdegij][mnoprst]/ :         248 ms |    232 ms |     788955
Total:                                                            36740 ms |  33853 ms |