Open BitShiftNow opened 1 year ago
`lineRange` is a Foundation API, so it is hard to know the implementation details, but there are a few more tests we can do to learn more about this:
- Mark `lineRangeBenchmark` as `@inlinable` and see if the results are the same.
- See what code gets generated for `String` and `Substring`. Maybe that can hint at something? Could generic specialization have an impact in this case? Nothing but a wild guess.

`@inlinable` didn't make any difference.
About seeing what code gets generated: how do I get the assembly output from the Swift compiler? Or is there some IL which I should get the output of? I have never done this in Swift, so I am not quite sure how to get the generated code.
I have attached the project in case anyone wants to try this out. If you want to run the benchmarks, just fill the `input.asm` file with 2 million lines of text to get an input similar to the one I used.
> How do I get the assembly output from the Swift compiler? Or is there some IL which I should get the output of? I have never done this in Swift so not quite sure how to get the generated code.
There is `-Xfrontend -emit-irgen` to output LLVM IR and `-Xfrontend -emit-assembly` to output code for the target you are compiling for.
`-Xfrontend` didn't seem to work, but `-Xswiftc` did the trick. The assembly and IRGen output seem to be emitted only to the terminal and not to a file, so I redirected the output to text files instead. Hope that helps.
cc @parkera
Just wondering: is there an update on this issue?
@itingliu Are you able to transfer this to one of the foundation repositories?
We made some enhancements to this function in macOS 14, including a fast path for both `String` and `Substring`. I'd expect this to have been resolved, though I haven't verified it myself.
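A rough, dependency-free way to check this on a given OS version could look like the sketch below. It is not the benchmark from this issue: the helpers `countLines(in:)` and `timed(_:)`, the synthetic input, and the line count are all assumptions made for illustration, and a single wall-clock measurement is far less reliable than swift-collections-benchmark.

```swift
import Foundation

// Hypothetical helper: walks `text` line by line with lineRange(for:)
// and returns the number of lines visited.
func countLines<S: StringProtocol>(in text: S) -> Int where S.Index == String.Index {
    var count = 0
    var position = text.startIndex
    while position < text.endIndex {
        // An empty range at `position` selects the line containing it;
        // the returned range includes the line terminator.
        position = text.lineRange(for: position..<position).upperBound
        count += 1
    }
    return count
}

// Hypothetical helper: runs `body` once and reports the elapsed wall-clock time.
func timed<T>(_ body: () -> T) -> (result: T, seconds: Double) {
    let start = Date()
    let result = body()
    return (result, Date().timeIntervalSince(start))
}

// Synthetic stand-in for the proprietary input: short ASCII lines, some of them empty.
// The size is kept small on purpose, because the Substring case can be very slow.
let lineCount = 10_000
let input = (0..<lineCount)
    .map { $0 % 10 == 0 ? "" : "label\($0): nop" }
    .joined(separator: "\n") + "\n"

let onString = timed { countLines(in: input) }
let onSubstring = timed { countLines(in: input[...]) }

print("String:    \(onString.result) lines in \(onString.seconds) s")
print("Substring: \(onSubstring.result) lines in \(onSubstring.seconds) s")
```

Build it with optimizations enabled (for example `-c release`) so the numbers are comparable to the ones reported in this issue. If the two timings are of the same order of magnitude, the fast path is likely in effect; a large gap would suggest the slow `Substring` path is still being hit.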
Description

I have created a small benchmark with swift-collections-benchmark to measure the performance of retrieving the half-open ranges of all lines in a string. Each line in the string contained about 15 ASCII characters on average (String encoded as UTF-8). Some lines were empty (only a single newline character). The string itself had about 2 million lines in total, however the benchmark was only retrieving up to 1 million lines in total. I was measuring the performance difference of the `lineRange` method when used on a `String` and a `Substring` input.

When using `lineRange` on a `String` input it takes a little over 200ms to retrieve all ranges for 1 million lines of text. When using `lineRange` on a `Substring`, however, it takes over 1 second to retrieve all ranges for well under 64k lines of text.

Expected behavior

I expected the performance characteristics of `lineRange` to be similar to other `StringProtocol` methods such as `split` (`before` is `String`, `after` is `Substring`; please forgive the confusing colour swap) or `enumerateLines` (same labels as before).

Steps to reproduce

I cannot share the input file used in the tests as it was proprietary 8-bit assembly code, but such an input file should be trivial to create. The method to find line ranges in the string used within the benchmark looked like this:
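As a minimal sketch (not necessarily the original code), a generic method of this kind could be written as follows. The name `lineRanges(in:limit:)` and its `limit` parameter are hypothetical; the only Foundation API used is `lineRange(for:)`.

```swift
import Foundation

// Hypothetical helper: collects the half-open range of every line in `text`,
// stopping once `limit` ranges have been gathered.
func lineRanges<S: StringProtocol>(
    in text: S,
    limit: Int = .max
) -> [Range<String.Index>] where S.Index == String.Index {
    var ranges: [Range<String.Index>] = []
    var position = text.startIndex
    while position < text.endIndex && ranges.count < limit {
        // An empty range at `position` selects the line containing it;
        // the returned range includes the line terminator.
        let range = text.lineRange(for: position..<position)
        ranges.append(range)
        position = range.upperBound
    }
    return ranges
}
```

Because the function is generic over `StringProtocol`, the same body can be called with a `String` (`lineRanges(in: text)`) or with a `Substring` (`lineRanges(in: text[...])`), which is the comparison this issue is about.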
The benchmark itself was set up like this:
I ran the benchmark via `swift run -c release LineRangesBenchmark run --cycles 10 results` and created the graph with `swift run -c release LineRangesBenchmark render results --amortized false --linear-time --linear-size chart.png`.
Verbose output seems to indicate that `-c release` sets `-O` and `whole-module-optimization`.

Environment

My test machine was a 2023 Mac mini with an M2 Pro processor running macOS Ventura 13.2.