swiftlang / swift-experimental-string-processing

An early experimental general-purpose pattern matching engine for Swift.
Apache License 2.0

DRAFT: Fast paths for large, native strings #646

Open milseman opened 1 year ago

milseman commented 1 year ago

Pulls in a bunch of String-internal bit-fiddling code in order to quickly get a pointer to contiguous UTF-8 when the input is a (large) native string.
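For readers unfamiliar with the idea, here is a minimal sketch of the fast-path/slow-path split. It uses only public standard library API (`withContiguousStorageIfAvailable` on the UTF-8 view) as a stand-in, not the String-internal code this PR pulls in, and the public API does not distinguish large native strings from other contiguous ones. `countASCIIBytes` is a hypothetical helper chosen purely for illustration.

```swift
// Sketch only: public API stand-in for the String-internal fast path.
// `countASCIIBytes` is a hypothetical helper, not part of this PR.
func countASCIIBytes(in input: String) -> Int {
    // Fast path: if the string exposes contiguous UTF-8 storage, the closure
    // runs directly over a buffer pointer to the code units.
    let fast = input.utf8.withContiguousStorageIfAvailable {
        (buffer: UnsafeBufferPointer<UInt8>) -> Int in
        var n = 0
        for byte in buffer where byte < 0x80 { n += 1 }
        return n
    }
    if let fast = fast {
        return fast
    }

    // Slow path: no contiguous storage (for example, some bridged strings),
    // so walk the UTF-8 view one code unit at a time.
    var n = 0
    for byte in input.utf8 where byte < 0x80 { n += 1 }
    return n
}
```

Both paths compute the same result; the only difference is whether the loop runs over a raw buffer or over the `String.UTF8View` iterator.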

Extracts the Metrics data (unused in release builds) out of Processor, since letting Processor exceed 256 bytes in size seems to cause significant regressions for some reason (see https://github.com/milseman/swift-experimental-string-processing/pull/1).
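To illustrate why moving rarely used data out of a struct shrinks it, here is a rough sketch using `MemoryLayout`. All types below are hypothetical stand-ins with made-up fields, not the real `Processor` or `Metrics`, and boxing behind a class reference is just one possible way to store the data out of line; the PR may do it differently.

```swift
// Hypothetical stand-ins; the real Processor and Metrics have other fields.
struct Metrics {
    var instructionCounts = (0, 0, 0, 0, 0, 0, 0, 0)
    var backtracks = 0
    var resets = 0
}

// Inline storage: every processor value carries all of the metrics bytes.
struct ProcessorInline {
    var state = (0, 0, 0, 0)
    var metrics = Metrics()
}

// Out-of-line storage: the struct holds a single optional reference, so its
// size no longer depends on how large Metrics is.
final class MetricsBox {
    var metrics = Metrics()
}

struct ProcessorBoxed {
    var state = (0, 0, 0, 0)
    var metricsBox: MetricsBox? = nil
}

// Compare the in-memory sizes of the two layouts.
print(MemoryLayout<ProcessorInline>.size, MemoryLayout<ProcessorBoxed>.size)
```

Checking `MemoryLayout<T>.size` like this is a cheap way to see whether a type has crossed a size threshold such as the 256 bytes mentioned above.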

The results below were measured after https://github.com/apple/swift-experimental-string-processing/pull/642 and https://github.com/apple/swift-experimental-string-processing/pull/644.

=== Regressions ======================================================================
- EmojiRegexAll                           72.9ms    71.9ms  1.03ms      1.4%
- SubtractionCCC                          21.7ms    21.3ms  314µs       1.5%
- BasicCCC                                10.8ms    10.5ms  261µs       2.5%
- CaseInsensitiveCCC                      12ms      11.7ms  255µs       2.2%
- IntersectionCCC                         22.1ms    21.9ms  206µs       0.9%
- BasicRangeCCC                           11.1ms    11ms    196µs       1.8%
- IPv4Address                             2.64ms    2.54ms  99.2µs      3.9%
=== Improvements =====================================================================
- CompilerMessagesAll                     113ms     116ms   -2.71ms     -2.3%
- DiceRollsInTextAll                      47.3ms    49.6ms  -2.23ms     -4.5%
- AnchoredNotFoundWhole                   7.44ms    9.18ms  -1.74ms     -18.9%
- NotFoundAll                             6.29ms    7.09ms  -797µs      -11.2%
- EmailLookaheadAll                       39.3ms    39.9ms  -683µs      -1.7%
- EmailBuiltinCharacterClassAll           14.8ms    15.5ms  -663µs      -4.3%
- ReluctantQuantWithTerminalWhole         8.77ms    9.3ms   -525µs      -5.6%
- LiteralSearchAll                        6.17ms    6.65ms  -478µs      -7.2%
- HangulSyllableAll                       6.35ms    6.76ms  -412µs      -6.1%
- LiteralSearchNotFoundAll                6.05ms    6.42ms  -369µs      -5.8%
- GraphemeBreakNoCapAll                   5.26ms    5.52ms  -264µs      -4.8%
- WordsAll                                14.2ms    14.5ms  -264µs      -1.8%
- symDiffCCC                              48.7ms    49ms    -240µs      -0.5%
- EmailLookaheadList                      9.65ms    9.88ms  -230µs      -2.3%
- CssAll                                  3.63ms    3.85ms  -222µs      -5.8%
- EmailLookaheadNoMatchesAll              40.6ms    40.8ms  -204µs      -0.5%
- HangulSyllableFirst                     3.06ms    3.25ms  -194µs      -6.0%
- DiceNotation                            5.21ms    5.39ms  -179µs      -3.3%
- BasicBuiltinCharacterClassAll           9.02ms    9.13ms  -108µs      -1.2%
- EagarQuantWithTerminalWhole             2.55ms    2.61ms  -62.8µs     -2.4%
- LinesAll                                3.05ms    3.1ms   -42.7µs     -1.4%
- MACAddress                              3ms       3.02ms  -24.7µs     -0.8%
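
The table has no column headers. The numbers are consistent with the third column being the first time minus the second, and the percentage being that difference relative to the second time; this is an inference from the arithmetic, not something the benchmark output states. A small sketch of that calculation, with `change` as a hypothetical helper:

```swift
// Hypothetical helper reproducing the table's arithmetic (inferred, see above).
func change(first: Double, second: Double) -> (delta: Double, percent: Double) {
    let delta = first - second             // third column
    return (delta, delta / second * 100)   // fourth column, relative to `second`
}

// AnchoredNotFoundWhole row, both times in milliseconds.
let row = change(first: 7.44, second: 9.18)
print(row.delta, row.percent)
```

The result will not match the printed row to the last digit, since the table shows rounded times.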

stephentyrone commented 1 year ago

Not sure why that got marked as a review. Ignore.