Closed LucianoPAlmeida closed 2 years ago
(Hope you don't mind a drive by nerd snipe.)
I took the liberty of doing some benchmarking here. Each case was 10,000 loops checking isAllWhitespace
against a String
of 10,000 characters. First case is all space, second is all a
s, and third is 5,000 a
s with 2,500 spaces on either end.
extension String {
func isAllWhitespace() -> Bool {
trimmingCharacters(in: .whitespacesAndNewlines) == ""
}
// func isAllWhitespace() -> Bool {
// allSatisfy { $0.isNewline || $0.isWhitespace }
// }
}
let allWhitespace = String(repeating: " ", count: 10_000)
//let noWhitespace = String(repeating: "a", count: 10_000)
//let mixed = String(repeating: " ", count: 2500) + String(repeating: "a", count: 5000) + String(repeating: " ", count: 2500)
var results: [Bool] = []
results.reserveCapacity(10_000)
for _ in 0..<10_000 {
results.append(allWhitespace.isAllWhitespace())
}
precondition(results.allSatisfy { $0 == true })
Here are the results.
jshier@Jons-iMac isWhitespace % hyperfine './allWhitespaceTrimming'
Benchmark 1: ./allWhitespaceTrimming
Time (mean ± σ): 920.7 ms ± 5.6 ms [User: 911.1 ms, System: 8.5 ms]
Range (min … max): 911.2 ms … 927.5 ms 10 runs
jshier@Jons-iMac isWhitespace % hyperfine './allWhitespaceAllSatisfy'
Benchmark 1: ./allWhitespaceAllSatisfy
Time (mean ± σ): 2.143 s ± 0.004 s [User: 2.122 s, System: 0.019 s]
Range (min … max): 2.139 s … 2.153 s 10 runs
jshier@Jons-iMac isWhitespace % hyperfine './noWhitespaceTrimming'
Benchmark 1: ./noWhitespaceTrimming
Time (mean ± σ): 6.8 ms ± 0.2 ms [User: 5.3 ms, System: 1.0 ms]
Range (min … max): 6.5 ms … 7.4 ms 318 runs
jshier@Jons-iMac isWhitespace % hyperfine './noWhitespaceAllSatisfy'
Benchmark 1: ./noWhitespaceAllSatisfy
Time (mean ± σ): 3.5 ms ± 0.1 ms [User: 2.1 ms, System: 0.9 ms]
Range (min … max): 3.2 ms … 4.1 ms 489 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
jshier@Jons-iMac isWhitespace % hyperfine './mixedTrimming'
Benchmark 1: ./mixedTrimming
Time (mean ± σ): 492.6 ms ± 3.9 ms [User: 474.9 ms, System: 16.7 ms]
Range (min … max): 485.7 ms … 499.4 ms 10 runs
jshier@Jons-iMac isWhitespace % hyperfine './mixedAllSatisfy'
Benchmark 1: ./mixedAllSatisfy
Time (mean ± σ): 574.6 ms ± 1.9 ms [User: 567.6 ms, System: 5.9 ms]
Range (min … max): 571.1 ms … 577.6 ms 10 runs
This isn't exhaustive, but as you can see allSatisfy
only beats trimmingCharacters
when the string is entirely whitespace. I can't quite explain why allSatisfy
is so much slower in the longer case (perhaps checking the properties is slower than set inclusion?) but for larger strings, the more whitespace they contain the bigger the trimmingCharacters
advantage. Crossover in the test above seems to be about 1,000 spaces padding 8,000 a
s.
You can also take a look at the rough source of trimmingCharacters
and see that it operates on the underlying String
buffer and subranges until it needs to materialize the result, it doesn't directly mutate or otherwise create copies until then.
In the end, depending on typical use, this may be a useful change for small strings of that usually not whitespace, but I'm not sure the difference on the small end is work the larger difference on the high end. That's up to the project.
I wonder if benchmarks could show the same performance difference on Linux and Windows, since closed-source Foundation on Mac may behave in unexpected ways.
Also, if we could infer some typical length of strings where one method is beneficial over the other, maybe we could keep both applying them conditionally?
Thank you @jshier for running benchmarks, as always perf hints can be really surprising when we measure it.
I can't quite explain why allSatisfy is so much slower in the longer case (perhaps checking the properties is slower than set inclusion?) but for larger strings, the more whitespace they contain the bigger the trimmingCharacters advantage.
Changed to operate directly on unicode view using char set, seems to close the huge gap...
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./allWhiteSpaceAllsatisfy --prepare true
Benchmark 1: ./allWhiteSpaceAllsatisfy
Time (mean ± σ): 754.6 ms ± 0.2 ms [User: 752.6 ms, System: 1.4 ms]
Range (min … max): 754.4 ms … 754.9 ms 10 runs
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./allWhiteSpaceTrimmed --prepare true
Benchmark 1: ./allWhiteSpaceTrimmed
Time (mean ± σ): 806.3 ms ± 1.1 ms [User: 804.1 ms, System: 1.6 ms]
Range (min … max): 805.3 ms … 808.2 ms 10 runs
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./noWhitespacesAllsatisfy --prepare true
Benchmark 1: ./noWhitespacesAllsatisfy
Time (mean ± σ): 59.7 ms ± 169.7 ms [User: 5.7 ms, System: 1.2 ms]
Range (min … max): 3.5 ms … 542.7 ms 10 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./noWhiteSpaceTrimmed --prepare true
Benchmark 1: ./noWhiteSpaceTrimmed
Time (mean ± σ): 23.7 ms ± 60.6 ms [User: 6.7 ms, System: 0.8 ms]
Range (min … max): 5.1 ms … 225.2 ms 13 runs
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./mixedAllsatisfy --prepare true
Benchmark 1: ./mixedAllsatisfy
Time (mean ± σ): 215.6 ms ± 75.8 ms [User: 194.4 ms, System: 0.9 ms]
Range (min … max): 191.3 ms … 431.3 ms 10 runs
lucianoalmeida@Lucianos-Mac-mini testing % hyperfine ./mixedWhiteSpaceTrimmed --prepare true
Benchmark 1: ./mixedWhiteSpaceTrimmed
Time (mean ± σ): 441.5 ms ± 78.3 ms [User: 414.3 ms, System: 5.5 ms]
Range (min … max): 416.3 ms … 664.4 ms 10 runs
Then I think is just see if is worth for the project, as mixed also seems to be an improvement
Did all of these benchmarks run in release mode? I still would love to see how it performs on Linux and Windows, but overall this is definitely something I'd consider merging 👍
I wonder if benchmarks could show the same performance difference on Linux and Windows, since closed-source Foundation on Mac may behave in unexpected ways.
It should be possible since this tool work in Linux and Windows as well, but unfortunately I don't have the environment.
Did all of these benchmarks run in release mode? I still would love to see how it performs on Linux and Windows, but overall this is definitely something I'd consider merging 👍
Yeah, I just ran in with -O
but could consider experimenting with -Osize
if you guys think is worth it :)
It should be possible since this tool work in Linux and Windows as well, but unfortunately I don't have the environment.
I wanted to set up some reproducible benchmarks that measure relative changes in performance between main
and PR branches across all platforms. I'm looking forward to running that comparison for this PR when that's ready, but it may take some time.
Yeah, I just ran in with
-O
but could consider experimenting with-Osize
if you guys think is worth it :)
Great, I assume -O
is equivalent to swift build -c release
?
Great, I assume
-O
is equivalent toswift build -c release
?
I compiled a single standalone file, so using swiftc
directly, but as far as I understand swift build
using release
should invoke the compiler with -O
I've made some benchmarks using swift-collection-benchmark
. Here's the first set of results for 100% whitespace strings up to 16M elements.
As you can see, trimming
has a lower floor in this case, possibly due to its raw iteration over the String
's buffer instead of going through an actual Swift Iterator
.
I'll work on generating cases with different combinations of whitespace.
Don't really have any other interesting results, but I've added another run that's 50% leading whitespace. Similar curve, but the allSatisfy
/ trimmingCharacters
crossover happens later, possible due to more work materializing the output string for comparison.
Thank you for the benchmarks @jshier, my initial guess was exactly that avoid materialization of string for compare would be a perf win, but I guess the benchmarks show that it is not true. Have you tested cases with white space in the back, I think this would be a win for allSatisfy
because it can bail early while trimming
has to look both sides.
And again, thanks for the benchmarks :)
Back half is the same as just having no whitespace at all: both implementations early out, so there's a fixed difference between them, just like the early part of these graphs where the setup cost of trimmingCharacters
hasn't yet paid off in faster iteration.
Thanks for the detailed charts @jshier! What tools did you use to generate them and to collect benchmark data? I was going to use https://github.com/google/swift-benchmark for CI benchmarking, but I see you probably use something fancier?
Back half is the same as just having no whitespace at all: both implementations early out, so there's a fixed difference between them, just like the early part of these graphs where the setup cost of
trimmingCharacters
hasn't yet paid off in faster iteration.
That is unexpected, because for a string with no whitespaces in front I would expect allSafisfy
to be noop, because it can bail on first char, while trimming has to loop and trim char in the back. I run the bench locally and it does show the expected results
Here are the generators:
// 50%
let inputGenerator: (Int) -> String = { size in
return String(repeating: "a", count: size / 2) + String(repeating: " ", count: size / 2)
}
// 25% 50%(non-whitespaces) 25%
let inputGenerator: (Int) -> String = { size in
return String(repeating: " ", count: size / 4) + String(repeating: "a", count: size / 2) + String(repeating: " ", count: size / 4)
}
I was going to use https://github.com/google/swift-benchmark for CI benchmarking, but I see you probably use something fancier?
@MaxDesiatov That would be https://github.com/apple/swift-collections-benchmark, it does produce the results in a json format as well, so it probably can be used on a CI setting too :)
Yes, I used swift-collection-benchmark
as @LucianoPAlmeida mentioned.
@LucianoPAlmeida you're right, the backside benchmarks are more interesting as you show. In that case trimmingCharacters
doesn't get to early out like allSatisfy
does. I'll put together some final results but it seems likely the smaller string performance wins for allSatisfy
may be the most important here, but I haven't checked where isAllWhitespace()
is actually called.
Here are some final results. Unfortunately the label is a bit in the way but you can get the overall idea.
Here's my bencher if you want to play with it.
Thank you @jshier for the benchmarks, really good overview the last one! In my opinion for my understanding of the results, overall it seems like a nice improvement because where it loses the difference is almost negligible but when it can bail early the difference is quite good. Let me know your thoughts about it and if that's feels like a good change for the project :)
LGTM, thank you both for the detailed investigation!
This patch is really very simple one, but I believe can be both a simplification on implementation and a perf improvement in the sense that as it is if we take for example as reference the implementation of
trimmingCharacters
we note that it will do a lot extra work as allocating buffers or a whole new string maybe, when this only need a simple check that can exit early and also avoid those extra allocations.I'm not sure if is significant, because I cannot tell if this is used in hot paths within the project and also I didn't measure, so perf improvement is just a guess that makes sense to me, and since it was a simple change I thought in open the PR. So let me know if this makes sense :)