swiftlang / swift-experimental-string-processing

An early experimental general-purpose pattern matching engine for Swift.
Apache License 2.0

Treat capture 0 (i.e. the whole match) specially #777

Closed milseman closed 1 week ago

milseman commented 2 weeks ago

Rather than storing the whole-match capture (capture 0) as a stored capture, we handle it specially. This speeds up processor resets, as we no longer need to reset a stored capture for it (the gain is largest when the regex has no other captures).

It also speeds up the creation of save points and backtracking, as it's one less capture to save/restore.

Built on top of https://github.com/swiftlang/swift-experimental-string-processing/pull/776
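To illustrate the idea described above, here is a minimal sketch contrasting the two designs. It is not the engine's actual code: the type names `StoredWholeMatchProcessor` and `SpecialWholeMatchProcessor`, the use of `Int` positions, and the assumption that the whole match can be derived from the start and current positions are all hypothetical and chosen only for illustration.

```swift
// Hypothetical sketch: none of these types are the engine's real API.

/// Baseline design: capture 0 (the whole match) lives in the same
/// array as the user-visible captures.
struct StoredWholeMatchProcessor {
    var captures: [Range<Int>?]   // index 0 holds the whole match

    init(captureCount: Int) {
        captures = Array(repeating: nil, count: captureCount + 1)
    }

    /// Always clears at least one slot, even if the regex has no
    /// user-visible captures.
    mutating func reset() {
        for i in captures.indices { captures[i] = nil }
    }

    /// A save point copies every slot, including the whole match.
    func makeSavePoint() -> [Range<Int>?] { captures }
}

/// Special-cased design (assumed): the whole match is derived from the
/// match start and the current position instead of being stored.
struct SpecialWholeMatchProcessor {
    var captures: [Range<Int>?]   // user-visible captures only
    var start = 0
    var current = 0

    init(captureCount: Int) {
        captures = Array(repeating: nil, count: captureCount)
    }

    /// With no user-visible captures, the loop body never runs.
    mutating func reset(to newStart: Int) {
        start = newStart
        current = newStart
        for i in captures.indices { captures[i] = nil }
    }

    /// Computed on demand; nothing to save or restore for capture 0.
    var wholeMatch: Range<Int> { start..<current }

    /// A save point carries one less capture than in the baseline.
    func makeSavePoint() -> (position: Int, captures: [Range<Int>?]) {
        (current, captures)
    }
}

// Usage: a regex with no user-visible captures.
let stored = StoredWholeMatchProcessor(captureCount: 0)
var special = SpecialWholeMatchProcessor(captureCount: 0)
special.reset(to: 2)
special.current = 5   // pretend the matcher consumed three positions
print(stored.makeSavePoint().count, special.makeSavePoint().captures.count)
print(special.wholeMatch)
```

In this sketch, the capture-free case is where the difference is most visible: the baseline still resets and copies one slot per attempt, while the special-cased version has no per-capture work left.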

milseman commented 2 weeks ago

Perf improvement on top of https://github.com/swiftlang/swift-experimental-string-processing/pull/776:

Comparing against saved benchmark result faster_resets
=== Regressions ======================================================================
- CompilerMessages_All_Scalar             72.5ms    71ms    1.51ms      2.1%
- MACAddress                              2.54ms    2.51ms  32.6µs      1.3%
- EagarQuantWithTerminal_Whole            472µs     448µs   23.8µs      5.3%
=== Improvements =====================================================================
- ReluctantQuant_Whole_Scalar             3.43ms    9.87ms  -6.44ms     -65.3%
- ReluctantQuant_Whole                    3.43ms    9.84ms  -6.41ms     -65.2%
- DiceRollsInText_All_Scalar              25.4ms    28ms    -2.66ms     -9.5%
- EmailRFCNoMatches_All_Scalar            42.3ms    44.9ms  -2.54ms     -5.7%
- DiceRollsInText_All                     27.9ms    30.1ms  -2.24ms     -7.4%
- EmailLookaheadNoMatches_All_Scalar      18.2ms    20.2ms  -2.06ms     -10.2%
- EmailRFCNoMatches_All                   47.1ms    48.9ms  -1.76ms     -3.6%
- InvertedCCC_All                         5.27ms    6.88ms  -1.61ms     -23.4%
- InvertedCCC_All_Scalar                  5.22ms    6.79ms  -1.57ms     -23.1%
- EmailLookaheadNoMatches_All             20.8ms    22.3ms  -1.52ms     -6.8%
- NotFound_All_Scalar                     2.33ms    3.71ms  -1.37ms     -37.0%
- Words_All_Scalar                        3.89ms    5.2ms   -1.31ms     -25.1%
- LiteralSearchNotFound_All_Scalar        1.92ms    3.15ms  -1.23ms     -39.1%
- Words_All                               4.36ms    5.59ms  -1.23ms     -22.0%
- LiteralSearch_All_Scalar                2.07ms    3.28ms  -1.21ms     -36.9%
- EmailRFC_All_Scalar                     30.5ms    31.6ms  -1.11ms     -3.5%
- EmailRFC_All                            31.8ms    32.8ms  -1.04ms     -3.2%
- NotFound_All                            3.82ms    4.82ms  -1ms        -20.7%
- LiteralSearch_All                       3.13ms    4.12ms  -986µs      -23.9%
- CompilerMessages_All                    89.1ms    90.1ms  -952µs      -1.1%
- LiteralSearchNotFound_All               2.99ms    3.93ms  -943µs      -24.0%
- BasicBuiltinCharacterClass_All          3.14ms    4.03ms  -898µs      -22.3%
- BasicRangeCCC_All_Scalar                3.16ms    4.04ms  -887µs      -21.9%
- BasicCCC_All                            3.06ms    3.93ms  -873µs      -22.2%
- HangulSyllable_All_Scalar               2.31ms    3.18ms  -873µs      -27.4%
- BasicRangeCCC_All                       3.18ms    4.06ms  -872µs      -21.5%
- CaseInsensitiveCCC_All                  3.4ms     4.27ms  -870µs      -20.4%
- CaseInsensitiveCCC_All_Scalar           3.41ms    4.28ms  -869µs      -20.3%
- AnchoredNotFound_First                  8.94ms    9.81ms  -867µs      -8.8%
- EmailBuiltinCharacterClass_All          8.36ms    9.21ms  -855µs      -9.3%
- Numbers_All                             2.58ms    3.42ms  -843µs      -24.6%
- BasicCCC_All_Scalar                     3.07ms    3.91ms  -839µs      -21.4%
- EmailLookahead_All                      19.2ms    20ms    -805µs      -4.0%
- BasicBuiltinCharacterClass_All_Scalar   2.28ms    3.05ms  -769µs      -25.2%
- AnchoredNotFound_All                    9.02ms    9.78ms  -760µs      -7.8%
- EmailBuiltinCharacterClass_All_Scalar   8.21ms    8.97ms  -755µs      -8.4%
- HangulSyllable_All                      3.17ms    3.92ms  -750µs      -19.2%
- IntersectionCCC_All_Scalar              5.25ms    5.99ms  -733µs      -12.2%
- IntersectionCCC_All                     5.22ms    5.95ms  -733µs      -12.3%
- symDiffCCC_All_Scalar                   17.4ms    18.1ms  -713µs      -3.9%
- EmailLookahead_All_Scalar               17.1ms    17.7ms  -669µs      -3.8%
- symDiffCCC_All                          17.4ms    18.1ms  -655µs      -3.6%
- SubtractionCCC_All                      5.66ms    6.31ms  -654µs      -10.4%
- SubtractionCCC_All_Scalar               5.66ms    6.31ms  -650µs      -10.3%
- Numbers_All_Scalar                      2ms       2.63ms  -630µs      -24.0%
- EmojiRegex_All_Scalar                   43.2ms    43.7ms  -567µs      -1.3%
- HangulSyllable_First_Scalar             935µs     1.49ms  -554µs      -37.2%
- HangulSyllable_First                    1.42ms    1.87ms  -448µs      -24.0%
- DiceNotation_Scalar                     4.14ms    4.55ms  -416µs      -9.1%
- DiceNotation                            4.37ms    4.77ms  -403µs      -8.5%
- EmailLookaheadList                      4.03ms    4.29ms  -255µs      -6.0%
- URLWithWordBoundaries_All               3.52ms    3.78ms  -255µs      -6.8%
- AnchoredNotFound_First_Scalar           5.68ms    5.91ms  -229µs      -3.9%
- AnchoredNotFound_All_Scalar             5.71ms    5.93ms  -219µs      -3.7%
- IPv4Address_Scalar                      1.97ms    2.19ms  -218µs      -10.0%
- IPv6Address                             2.41ms    2.62ms  -203µs      -7.8%
- IPv6Address_Scalar                      2.24ms    2.44ms  -199µs      -8.2%
- GraphemeBreakNoCap_All                  1.58ms    1.78ms  -199µs      -11.2%
- Css_All                                 2.35ms    2.55ms  -198µs      -7.8%
- IPv4Address                             2.11ms    2.31ms  -196µs      -8.5%
- Lines_All_Scalar                        713µs     888µs   -175µs      -19.7%
- GraphemeBreakNoCap_All_Scalar           1.5ms     1.67ms  -174µs      -10.4%
- EmailLookaheadList_Scalar               3.93ms    4.11ms  -172µs      -4.2%
- Lines_All                               769µs     930µs   -161µs      -17.3%
- Css_All_Scalar                          2.01ms    2.17ms  -157µs      -7.2%
- ReluctantQuantWithTerminal_Whole_Scalar 5.77ms    5.92ms  -151µs      -2.6%
- ReluctantQuantWithTerminal_Whole        5.78ms    5.92ms  -139µs      -2.3%
- URLWithWordBoundaries_All_Scalar        3.36ms    3.45ms  -84.9µs     -2.5%
- URLWithWordBoundaries_All_SimpleWordBoundaries 595µs     674µs   -79.2µs     -11.7%

milseman commented 1 week ago

@swift-ci please test

milseman commented 1 week ago

@swift-ci please test