Odd performance characteristics when using `lineRange` on `Substring`

BitShiftNow commented 1 year ago

Description I have created a small benchmark with swift-collections-benchmark to measure the performance of retrieving the half-open ranges of all lines in a string. Each line in the string contained about 15 ASCII characters on average (String encoded as utf8). Some lines were empty (only a single newline character). The string itself had about 2 million lines in total, however the benchmark was only retrieving up to 1 million lines in total. I was measuring the performance difference of the lineRange method when used on a String and Substring input.

When using lineRange on a String input it takes a little over 200ms to retrieve all ranges for 1 million lines of text. When using lineRange on a Substring however it takes over 1 second to retrieve all ranges for well under 64k lines of text.

Expected behavior I expected the performance characteristics of lineRange to be similar to other StringProtocol methods such as split (before is String - after is Substring. Please forgive the confusing colour swap)

OR enumerateLines (same labels as before)

Steps to reproduce I can not share the input file used in the tests as it was proprietary 8-bit assembly code but such an input file should be trivial to create. The method to find line ranges in the string used within the benchmark looked like this:

extension StringProtocol {
    func lineRangeBenchmark() -> [Range<Self.Index>] {
        var output: [Range<Self.Index>] = []

        var startIndex = self.startIndex

        repeat {
            let range = self.lineRange(for: startIndex..<startIndex)
            output.append(range)
            startIndex = range.upperBound
        } while startIndex < self.endIndex

        return output
    }
}

The benchmark itself was set up like this:

var benchmark = Benchmark(title: "LineRanges")

benchmark.registerInputGenerator(for: Substring.self) { count in
    let ranges = input.split(omittingEmptySubsequences: false, whereSeparator: \.isNewline)
    let index = ranges.index(ranges.startIndex, offsetBy: count + 1)
    return input[input.startIndex..<ranges[index].endIndex] // input is 2 million lines. count denotes the number of lines requested
}

benchmark.registerInputGenerator(for: String.self) { count in
    let ranges = input.split(omittingEmptySubsequences: false, whereSeparator: \.isNewline)
    let index = ranges.index(ranges.startIndex, offsetBy: count + 1)
    return String(input[input.startIndex..<ranges[index].endIndex]) // input is 2 million lines. count denotes the number of lines requested
}

benchmark.addSimple(title: "Substring LineRange", input: Substring.self) { i in
    blackHole(i.lineRangeBenchmark())
}

benchmark.addSimple(title: "String LineRange", input: String.self) { i in
    blackHole(i.lineRangeBenchmark())
}

benchmark.main()

I ran the benchmark via swift run -c release LineRangesBenchmark run --cycles 10 results and created the graph with swift run -c release LineRangesBenchmark render results --amortized false --linear-time --linear-size chart.png

Verbose output seems to indicate that -c release sets -O and whole-module-optimization.

Environment My test machine was a 2023 MacMini with M2 Pro Processor running MacOS Ventura 13.2.

% swift --version
swift-driver version: 1.62.15 Apple Swift version 5.7.2 (swiftlang-5.7.2.135.5 clang-1400.0.29.51)
Target: arm64-apple-macosx13.0

LucianoPAlmeida commented 1 year ago

lineRange is a Foundation API so is hard to know the implementation details, but there are a few more tests we can do to have more info on this:

Try make lineRangeBenchmark to be @inlinable and see if results are the same.
Attempt to see what kind of code we get generated from calling this function with String and Substring. Maybe that can hint at something? Does generic specialization could make impacts in this case? Nothing but a wild guess.

BitShiftNow commented 1 year ago

@inlinable didn't make any difference.

About seeing what code gets generated: How do I get the assembly output from the swift compiler? Or is there some IL which I should get the output of? I have never done this in swift so not quite sure how to get the generated code.

I have attached the project in case anyone wants to try this out. If you want to run the benchmarks just fill in the input.asm file with 2mil lines of text to have a similar input I used.

Archive.zip

LucianoPAlmeida commented 1 year ago

How do I get the assembly output from the swift compiler? Or is there some IL which I should get the output of? I have never done this in swift so not quite sure how to get the generated code.

There is -Xfrontend -emit-irgen to output llvm IR and -Xfrontend -emit-assembly to output code for the target you are compiling for.

BitShiftNow commented 1 year ago

-Xfrontend didn't seem to work but -Xswiftc did the trick. The assembly and irgen seem to only omit as terminal output and not as a file so I redirected the output to text files instead. Hope that helps.

irgen.txt.zip assembly.txt.zip

AnthonyLatsis commented 1 year ago

cc @parkera

BitShiftNow commented 1 year ago

Just wondering: is there an update on this issue?

swiftlang / swift

Odd performance characteristics when using `lineRange` on `Substring` #63600