Using `ManagedBufferPointer` instead of `Array` as a storage

LiarPrincess commented 2 years ago

Hi,

Recently I had to write my own BigInt implementation for Violet - Python VM written in Swift.

Internally I decided to use ManagedBufferPointer instead of Swift Array. The whole design in one sentence would be: union (via tagged pointer) of Int32 (called Smi, after V8) and a heap allocation (magnitude + sign representation) with ARC for garbage collection. The detailed explanation is available in our documentation.

Naturally I'm quite curious why most of the BigInt libraries (including this one) use Array. The current implementation gives you (2014 rMBP with Intel x64):

print("BigUInt.size:", MemoryLayout<BigUInt>.size) // 32
print("BigUInt.stride:", MemoryLayout<BigUInt>.stride) // 32
print("BigInt.size:", MemoryLayout<BigInt>.size) // 33
print("BigInt.stride:", MemoryLayout<BigInt>.stride) // 40

Going with ManagedBufferPointer would give us much smaller numbers:

// Basically our own version of `Swift.Array` specialized for storing `Words`.
// Mainly deals with COW.
struct BigIntStorage {
  struct Header {
    var count: Int
  }

  typealias Word = UInt
  typealias Buffer = ManagedBufferPointer<Header, Word>
}

struct BigUInt2 {
  typealias Word = BigIntStorage.Word

  enum Kind {
    case inline(Word, Word)
    case slice(from: Int, to: Int)
    case array
  }

  var kind: Kind
  var storage: BigIntStorage // <- This line changed!
}

struct BigInt2 {
  enum Sign {
    case plus
    case minus
  }

  typealias Magnitude = BigUInt2
  typealias Word = BigUInt.Word

  public var magnitude: BigUInt2
  public var sign: Sign
}

print("BigUInt2.size:", MemoryLayout<BigUInt2>.size) // 17
print("BigUInt2.stride:", MemoryLayout<BigUInt2>.stride) // 24
print("BigInt2.size:", MemoryLayout<BigInt2>.size) // 18
print("BigInt2.stride:", MemoryLayout<BigInt2>.stride) // 24

I believe that this approach would have following advantages:

better usage of CPU cache - in the current design BigInt has size 33 and stride 40. With ManagedBufferPointer we have size 18 and stride 24. This does not matter for a BigInt as a type, but it may matter in real-life scenarios, for example when it has to be stored in a struct on an Array. (Just a reminder: intel cpus have 64 bytes cache line and M1 128 bytes, though I do not own the M1 device to check this).
BigIntStorage is specialized for storing Word which means that it can do some things in a more efficient way than Swift.Array.
potential further optimizations - I believe that you could bring the stride to 16, but then: inline value would be a single Word (instead of 2 Words) and the slice from/to would have to be Int32 (instead of Int) + some minor rearrangement of how things are stored internally. It may not be worth it though.

The downside is that you would have to implement your own heap storage based on ManagedBufferPointer, but this is not that difficult.

LiarPrincess commented 2 years ago

As for any regressions: I also propose #98 Using tests from “Violet - Python VM written in Swift”. So, first I would add test cases and them we could (maybe) talk about ManagedBufferPointer.

tgymnich commented 2 years ago

This sounds great. Did you already benchmark both approaches?

LiarPrincess commented 2 years ago

This is a little bit more complicated. There is no silver bullet and there are multiple ways in which you can implement a BigInt depending on what use-cases you target.

Before I implement this change I want to close the #98 Using tests from “Violet - Python VM written in Swift”.

The improvements (if any) would be only in some specific scenarios, definitely not in the most common case then the test looks like this:

let a: BigInt = …
let b: BigInt = …
do something with them, maybe even I a loop…

Stride only matters in continuous storage, like arrays and structs. In Violet having a stride 8 (single pointer) means that we can fit more BigInts in a single cache line which matters in some scenarios.

In addition, things work well in Violet because we only have 2 representations:

smi - Int32 inside pointer
heap - heap allocated if the value is outside of Int32 range

In 99% of the cases we are smi which is nice for branch predictor in some very tight loops. This may not be the case for 'attaswift/BigInt' which has 3 representations.

Anyway, let's finish the #98 first and then (maybe) go back to this issue.

attaswift / BigInt

Using `ManagedBufferPointer` instead of `Array` as a storage #97