swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0
67.66k stars 10.38k forks source link

[SR-11342] Missing integer arithmetic primitives in stdlib #53743

Open stephentyrone opened 5 years ago

stephentyrone commented 5 years ago
Previous ID SR-11342
Radar rdar://problem/54557230
Original Reporter @stephentyrone
Type Bug
Additional Detail from JIRA | | | |------------------|-----------------| |Votes | 0 | |Component/s | Standard Library | |Labels | Bug | |Assignee | @stephentyrone | |Priority | Medium | md5: 4f805065fc2595a77d89ad880cd55a1b

Issue Description:

When building larger-than-word fixed-width integer arithmetic, the following operations are performance critical: full add, full subtract and multiply with 0, 1, or 2 addends. For optimized builds, if you write them just right, we usually get pretty correct codegen out, but our debug story is terrible. We should expose these primitives in the standard library where they can be mapped to builtins to address this problem.

Sample Swift implementations:

func add(_ a: inout UInt, _ b: UInt, carry c: Bool = false) -> Bool {
  a = a &+ b &+ (c ? 1 : 0)
  return a < b || a == b && c
}

func muladd(_ a: UInt, _ b: UInt, _ c: UInt = 0, _ d: UInt = 0) -> (lo: UInt, hi: UInt) {
  var (hi, lo) = a.multipliedFullWidth(by: b)
  hi &+= add(&lo, c) ? 1 : 0
  hi &+= add(&lo, d) ? 1 : 0
  return (lo, hi)
}

LLVM heroics allow us to pattern-match these and get good codegen when optimization is enabled; here's a two-addend multiply, for example:

push  %rbp
movq  %rsp, %rbp
movq  %rdx, %r8
movq  %rsi, %rax
mulq  %rdi        // 1. a*b -> (rax:rdx)
xorl  %esi, %esi
addq  %r8,  %rax  // 2. rax += c
setb  %sil        // 3. carry out
addq  %rcx, %rax  // 4. rax += d
adcq  %rsi, %rdx  // 5. rdx += both carries
popq  %rbp
retq

This is essentially optimal, up to some unimportant details. However, the debug codgen is terrible, because none of the necessary optimizations take place. We should provide wrappers around the necessary builtins in the stdlib so that we can provide good codgen for these operations (and not require users to write them themselves, because they'll make mistakes in doing so).

stephentyrone commented 5 years ago

There's a question to be resolved of whether `UInt` is the appropriate type on which to define these. At first glance it is, but for platforms like arm64_32, we'd be somewhat better served if we used a `Word` type that's defined to `UInt64`. It's unclear if performance considerations on such aberrant platforms justify long-term API complexity, however.

stephentyrone commented 5 years ago

Also to be resolved: a full subtract API could expose subtract with carry, or subtract with borrow (which is !carry), or match the underlying hardware. There are reasonable arguments for all three (less so for the last one). It's not hugely important which one is picked, but it's a choice that needs to be made (and we'll need to ensure that LLVM will always do the necessary optimizations to map between them at the API boundary).

stephentyrone commented 5 years ago

@swift-ci create

stephentyrone commented 5 years ago

Note to any interested parties reading this: I do not intend to move especially quickly on this bug, because designing the best API for these primitives is a bit fussy (inout args vs tuple returns, multiple return ordering, in-place vs. out-of-place operation). These "have to" go in the stdlib to be really effective, but that means they need to be stable immediately, which means we need to get the API right. I'll create a branch with a strawman early on, and then sit on it for a while.