swiftlang / swift-experimental-string-processing

An early experimental general-purpose pattern matching engine for Swift.
Apache License 2.0

[DRAFT] Refactor quant nonmutating #707

Open milseman opened 10 months ago

milseman commented 10 months ago

Refactor off of mutating methods

Refactors mutating methods into string methods for easier unit testing and parity-checking via assertions. Prepares for more efficient implementations.

Doing so creates many small-to-medium regressions, unfortunately, so this should only be done in conjunction with more refactorings and improvements.
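
For readers unfamiliar with the pattern, here is a minimal sketch of what moving logic off a mutating method and onto a string method can look like, and how an assertion can check parity between the two. Every name in it (`Matcher`, `matchDigits(from:)`, `matchDigitsMutating`, `matchDigitsChecked`, `isASCIIDigit`) is hypothetical and chosen for illustration; none of it is taken from this PR or from the engine's actual code.

```swift
// Hypothetical sketch: these names are illustrative only and are not
// APIs from swift-experimental-string-processing.

func isASCIIDigit(_ c: Character) -> Bool {
  c.isASCII && c.isNumber
}

extension String {
  /// Nonmutating form: returns the index just past the run of ASCII digits
  /// starting at `start`, or nil if there is no digit at `start`.
  /// It depends only on its arguments, so it can be unit tested directly.
  func matchDigits(from start: Index) -> Index? {
    var idx = start
    while idx < endIndex, isASCIIDigit(self[idx]) {
      idx = index(after: idx)
    }
    return idx == start ? nil : idx
  }
}

struct Matcher {
  let input: String
  var position: String.Index

  /// Mutating form: advances `position` in place and reports success.
  mutating func matchDigitsMutating() -> Bool {
    let start = position
    while position < input.endIndex, isASCIIDigit(input[position]) {
      position = input.index(after: position)
    }
    return position != start
  }

  /// Parity check: run both forms and assert that they agree.
  mutating func matchDigitsChecked() -> Bool {
    let expected = input.matchDigits(from: position)
    let matched = matchDigitsMutating()
    assert(matched == (expected != nil))
    assert(expected == nil || expected == position)
    return matched
  }
}
```

In this sketch the string method takes the position as a parameter and returns the new one, so the caller decides whether and when to store it back into its own state.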

milseman commented 10 months ago

This is built on top of https://github.com/apple/swift-experimental-string-processing/pull/706, which has many improvements.

Compared to just https://github.com/apple/swift-experimental-string-processing/pull/706, this refactoring has the following performance regressions and improvements. They could be ARC-related; I haven't explored exactly what's going on here:

=== Regressions ======================================================================
- CompilerMessages_All_Scalar             73.7ms    69.9ms  3.77ms      5.4%
- EmailRFCNoMatches_All_Scalar            126ms     122ms   3.74ms      3.1%
- CompilerMessages_All                    90.1ms    86.7ms  3.42ms      3.9%
- SubtractionCCC_All                      23ms      20.9ms  2.07ms      9.9%
- DiceRollsInText_All                     42.9ms    41.1ms  1.82ms      4.4%
- BasicCCC_All                            7.32ms    5.93ms  1.38ms      23.3%
- EmojiRegex_All_Scalar                   41.9ms    40.5ms  1.37ms      3.4%
- BasicCCC_All_Scalar                     7.29ms    5.93ms  1.36ms      22.9%
- AnchoredNotFound_All                    14.8ms    13.5ms  1.33ms      9.8%
- DiceRollsInText_All_Scalar              39.9ms    38.6ms  1.31ms      3.4%
- BasicRangeCCC_All                       7.44ms    6.38ms  1.07ms      16.7%
- EmailLookaheadNoMatches_All             24ms      23ms    984µs       4.3%
- InvertedCCC_All_Scalar                  17.6ms    16.7ms  926µs       5.5%
- Numbers_All                             6.73ms    5.8ms   923µs       15.9%
- CaseInsensitiveCCC_All                  7.23ms    6.34ms  893µs       14.1%
- BasicRangeCCC_All_Scalar                7.29ms    6.4ms   884µs       13.8%
- CaseInsensitiveCCC_All_Scalar           7.22ms    6.35ms  871µs       13.7%
- EmailRFC_All                            64.8ms    64ms    820µs       1.3%
- EmailLookaheadNoMatches_All_Scalar      21.4ms    20.7ms  735µs       3.6%
- NotFound_All_Scalar                     6.82ms    6.18ms  643µs       10.4%
- NotFound_All                            7.67ms    7.2ms   468µs       6.5%
- MACAddress                              2.81ms    2.38ms  430µs       18.0%
- EmailRFC_All_Scalar                     45.4ms    45ms    425µs       0.9%
- IntersectionCCC_All                     22.2ms    21.8ms  415µs       1.9%
- SubtractionCCC_All_Scalar               21.4ms    21ms    384µs       1.8%
- AnchoredNotFound_First                  9.13ms    8.78ms  358µs       4.1%
- Numbers_All_Scalar                      5.66ms    5.34ms  319µs       6.0%
- IntersectionCCC_All_Scalar              22.2ms    21.9ms  318µs       1.5%
- Words_All                               12.7ms    12.4ms  316µs       2.5%
- LiteralSearchNotFound_All               6.71ms    6.44ms  271µs       4.2%
- EmailLookahead_All                      19.6ms    19.3ms  263µs       1.4%
- HangulSyllable_All                      6.92ms    6.66ms  261µs       3.9%
- MACAddress_Scalar                       2.49ms    2.25ms  232µs       10.3%
- IPv6Address                             2.78ms    2.56ms  223µs       8.7%
- IPv6Address_Scalar                      2.61ms    2.39ms  216µs       9.0%
- EmailLookahead_All_Scalar               17.3ms    17.1ms  205µs       1.2%
- LiteralSearch_All                       6.89ms    6.7ms   195µs       2.9%
- IPv4Address_Scalar                      2.24ms    2.06ms  186µs       9.0%
- HangulSyllable_All_Scalar               6.11ms    5.92ms  185µs       3.1%
- LiteralSearchNotFound_All_Scalar        5.68ms    5.51ms  172µs       3.1%
- GraphemeBreakNoCap_All_Scalar           2.99ms    2.83ms  161µs       5.7%
- GraphemeBreakNoCap_All                  3.1ms     2.95ms  149µs       5.0%
- EmailBuiltinCharacterClass_All_Scalar   10.3ms    10.2ms  147µs       1.4%
- ReluctantQuantWithTerminal_Whole        5.41ms    5.27ms  147µs       2.8%
- LiteralSearch_All_Scalar                5.87ms    5.73ms  145µs       2.5%
- ReluctantQuantWithTerminal_Whole_Scalar 5.34ms    5.21ms  123µs       2.4%
- Css_All_Scalar                          2.99ms    2.88ms  117µs       4.1%
- IPv4Address                             2.32ms    2.22ms  105µs       4.7%
- EagarQuantWithTerminal_Whole_Scalar     520µs     441µs   79.6µs      18.1%
- Css_All                                 3.31ms    3.23ms  76.2µs      2.4%
- DiceNotation                            4.58ms    4.51ms  75.1µs      1.7%
- EagarQuantWithTerminal_Whole            520µs     451µs   69.2µs      15.3%
- DiceNotation_Scalar                     4.33ms    4.26ms  65.5µs      1.5%
- Lines_All_Scalar                        1.73ms    1.67ms  64µs        3.8%
- Lines_All                               1.77ms    1.72ms  53.1µs      3.1%
=== Improvements =====================================================================
- EmailLookaheadList_Scalar               3.59ms    3.96ms  -370µs      -9.3%
- EmailBuiltinCharacterClass_All          10.5ms    10.7ms  -166µs      -1.6%
- AnchoredNotFound_First_Scalar           5.47ms    5.62ms  -153µs      -2.7%
- EmailLookaheadList                      3.81ms    3.87ms  -54.8µs     -1.4%
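
The listing above has no column headers. The rows appear consistent with the columns being: time with this change, baseline time, difference, and difference as a percentage of the baseline. That reading is an assumption inferred from the arithmetic, not something stated in the benchmark output. A small sketch of that calculation, with a hypothetical helper name:

```swift
// Assumed column meaning (not stated in the output above):
// time with this change, baseline time, difference, percent of baseline.
// `relativeChange` is a hypothetical helper, not part of the benchmark tool.
func relativeChange(new: Double, baseline: Double) -> (diff: Double, percent: Double) {
  let diff = new - baseline
  return (diff, diff / baseline * 100)
}

// EmailLookaheadList_Scalar row, times in milliseconds.
let row = relativeChange(new: 3.59, baseline: 3.96)
print(row.diff, row.percent)
```

Because the printed times are rounded, recomputing from them can differ slightly from the listed difference and percentage in some rows.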