swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0
67.6k stars 10.37k forks source link

[SR-12222] No safe way to performantly zero-extend an Array #54648

Open Lukasa opened 4 years ago

Lukasa commented 4 years ago
Previous ID SR-12222
Radar rdar://problem/59516884
Original Reporter @Lukasa
Type Bug
Environment Apple Swift version 5.2 (swiftlang-1103.0.24.3 clang-1103.0.30.1) Target: x86_64-apple-darwin19.4.0
Additional Detail from JIRA | | | |------------------|-----------------| |Votes | 1 | |Component/s | Standard Library | |Labels | Bug | |Assignee | None | |Priority | Medium | md5: dd5e30591fa50bc140e23288fa911137

Issue Description:

Related to SR-12221.

Sometimes I have situations where I want to zero-extend an Array. This is usually done to bring a whole bunch of indices into existence and make them valid at once. One case where I might do that is when I want to express a safe memmove (hence the relationship to SR-12221), which requires that all indices used be valid.

I am happy enough to pay the price of initialising this memory in this case, as it's usually not too much memory I have to initialise. Right now, there are three obvious ways to do this on Array (using Array<UInt8> for clarity), shown below and in a compiler explorer page:

func extendWithArray(_ arr: inout [UInt8], count: Int) {
    arr.append(contentsOf: Array(repeating: 0, count: count))
}

func extendWithRepeatElement(_ arr: inout [UInt8], count: Int) {
    arr.append(contentsOf: repeatElement(0, count: count))
}

func extendWithLoop(_ arr: inout [UInt8], count: Int) {
    for _ in 0..<count {
        arr.append(0)
    }
}

All three of these generate suboptimal code.

The first two generate monstrously large chunks of code. repeatElement is the baseline: it generates an enormous amount of code, which would be acceptable if it was vectorised, but it degrades to a byte-by-byte copy: the compiler is not able to constant-propagate the zero byte through and observe that there is no need to do this byte wise. The Array based one avoids the byte wise operation, but has to initialise each byte twice: once with bzero to create the new array, and then again when those bytes are {{memcpy}}d into the target.

Option 3 is in most ways the best. It generates much smaller code, and the compiler would almost be able vectorise it, except that the quadratic growth of Array stymies it. Each byte must be checked to confirm that the Array has space, otherwise a resizing operation occurs. This also slows this down, and in cases where the array started out empty we'll get a fairly large number of resizing operations.

It's very unfortunate that there is no good way to rapidly extend a collection with a single element. We can construct new ones this way: just not extend old ones. It would be very helpful to have some way for a type to express that it can safely be rapidly extended like this.

weissi commented 4 years ago

@swift-ci create

Catfish-Man commented 4 years ago

How does this fare?

func extendWithLoop(_ arr: inout [UInt8], count: Int) {
    arr.reserveCapacity(count)
    for _ in 0..<count {
        arr.append(0)
    }
}
Lukasa commented 4 years ago

No better. It guarantees the branch won't be taken, but LLVM can't tell and so the branch is still emitted, which prevents the vectorisation opportunity.

Additionally, reserveCapacity's documentation pretty strongly warns against doing this:

If you implement a custom data structure backed by an array that grows dynamically, naively calling the reserveCapacity() method can lead to worse than expected performance. Arrays need to follow a geometric allocation pattern for appending elements to achieve amortized constant-time performance. The Array type’s append() and append(contentsOf) methods take care of this detail for you, but reserveCapacity() allocates only as much space as you tell it to (padded to a round value), and no more. This avoids over-allocation, but can result in insertion not having amortized constant-time performance.

Catfish-Man commented 4 years ago

I'm in the middle of redesigning Array's growth strategy so I can pretty safely say that the "naive" use it's warning about is calling it inside the loop. Calling it once up front is how it's supposed to be used.

That said, point taken about vectorization.