Open Lukasa opened 4 years ago
@swift-ci create
How does this fare?
func extendWithLoop(_ arr: inout [UInt8], count: Int) {
arr.reserveCapacity(count)
for _ in 0..<count {
arr.append(0)
}
}
No better. It guarantees the branch won't be taken, but LLVM can't tell and so the branch is still emitted, which prevents the vectorisation opportunity.
Additionally, reserveCapacity
's documentation pretty strongly warns against doing this:
If you implement a custom data structure backed by an array that grows dynamically, naively calling the
reserveCapacity()
method can lead to worse than expected performance. Arrays need to follow a geometric allocation pattern for appending elements to achieve amortized constant-time performance. The Array type’sappend()
andappend(contentsOf)
methods take care of this detail for you, butreserveCapacity()
allocates only as much space as you tell it to (padded to a round value), and no more. This avoids over-allocation, but can result in insertion not having amortized constant-time performance.
I'm in the middle of redesigning Array's growth strategy so I can pretty safely say that the "naive" use it's warning about is calling it inside the loop. Calling it once up front is how it's supposed to be used.
That said, point taken about vectorization.
Environment
Apple Swift version 5.2 (swiftlang-1103.0.24.3 clang-1103.0.30.1) Target: x86_64-apple-darwin19.4.0Additional Detail from JIRA
| | | |------------------|-----------------| |Votes | 1 | |Component/s | Standard Library | |Labels | Bug | |Assignee | None | |Priority | Medium | md5: dd5e30591fa50bc140e23288fa911137Issue Description:
Related to SR-12221.
Sometimes I have situations where I want to zero-extend an Array. This is usually done to bring a whole bunch of indices into existence and make them valid at once. One case where I might do that is when I want to express a safe memmove (hence the relationship to SR-12221), which requires that all indices used be valid.
I am happy enough to pay the price of initialising this memory in this case, as it's usually not too much memory I have to initialise. Right now, there are three obvious ways to do this on Array (using
Array<UInt8>
for clarity), shown below and in a compiler explorer page:All three of these generate suboptimal code.
The first two generate monstrously large chunks of code.
repeatElement
is the baseline: it generates an enormous amount of code, which would be acceptable if it was vectorised, but it degrades to a byte-by-byte copy: the compiler is not able to constant-propagate the zero byte through and observe that there is no need to do this byte wise. The Array based one avoids the byte wise operation, but has to initialise each byte twice: once withbzero
to create the new array, and then again when those bytes are {{memcpy}}d into the target.Option 3 is in most ways the best. It generates much smaller code, and the compiler would almost be able vectorise it, except that the quadratic growth of
Array
stymies it. Each byte must be checked to confirm that theArray
has space, otherwise a resizing operation occurs. This also slows this down, and in cases where the array started out empty we'll get a fairly large number of resizing operations.It's very unfortunate that there is no good way to rapidly extend a collection with a single element. We can construct new ones this way: just not extend old ones. It would be very helpful to have some way for a type to express that it can safely be rapidly extended like this.